RMechDB overview
A free radical is a chemical compound with at least one half-occupied orbital. The presence of half-occupied orbitals makes a radical compound highly reactive. Free radicals therefore have the potential both to serve as powerful chemical tools and to be extremely harmful contaminants. Chemical reactions involving a free radical are an essential part of synthetic, biochemical, atmospheric, and plasma chemistry. For instance, the climate crisis has dramatically altered fire activity worldwide. Wildland fires are increasing in frequency, duration, intensity, and size. The chemistry of flames is dominated by radical reactions, and the chemical composition of fire smoke changes during atmospheric transport. This so-called “aging” of smoke is poorly understood, but is known to be largely driven by free radical processes.
RMechDB is a live platform for aggregating, curating, and distributing chemical reactions in the form of elementary radical steps to accelerate research in chemoinformatics and radical reaction modeling. The RMechDB platform is designed to facilitate training deep learning and other AI models in data-driven workflows using its tabular data, with no need for additional pre-processing steps. It provides a unified model intended to facilitate data sharing, model building, dissemination, and publications. We encourage the community to explore and use the RMechDB data and functionalities, and to contribute to its expansion.
Click here to read the RMechDB paper.
How to use RMechDB
Search by reaction
The search by reaction interface offers four search types:
- Exact search
- Reactant search
- Product search
- Similarity search
To perform an exact search, the search query must be a complete radical reaction step including both reactants and products. The result of this search is a list of all the radical reaction steps containing both the query reactants and the query products.
Input examples:
- C1=CC=C(C=C1)CC[O:10][N+:11](=O)[O-]>>c1ccc(cc1)CC[O].[N+](=O)[O-]
- [CH2:20]C1=CC=CC=C1.[N+:10](=O)[O-]>>c1ccc(cc1)C[N+](=O)[O-]
To perform a reactant search or a product search, the search query must be a set of molecules separated by ".". The result of a reactant (resp. product) search is a list of all the radical reaction steps whose reactants contain the reactants (resp. products) query.
Input examples:
- C[C](C)C.[N+](=O)[O-]
- C[C](C)C.C=O
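The containment rule for a reactant search can be sketched in a few lines of Python. This is only an illustration of the idea, not RMechDB's implementation: the function names are hypothetical, molecules are compared as raw strings (a real search would compare canonical structures with a cheminformatics toolkit), and treating the query as a multiset is an assumption.

```python
from collections import Counter


def split_molecules(side: str) -> Counter:
    """Split one side of a step (molecules joined by '.') into a multiset."""
    return Counter(m for m in side.split(".") if m)


def reactants_contain(step: str, query: str) -> bool:
    """Return True if every query molecule appears among the step's reactants.

    Assumption: plain string equality stands in for structure matching.
    """
    reactants, _, _products = step.partition(">>")
    have = split_molecules(reactants)
    want = split_molecules(query)
    return all(have[mol] >= count for mol, count in want.items())


# Step taken from the similarity search examples in this document.
step = "CC(C)C.[N+](=O)[O-]>>CC(C)(C)[N+](=O)[O-]"

print(reactants_contain(step, "[N+](=O)[O-].CC(C)C"))  # True: order is irrelevant
print(reactants_contain(step, "CC(C)C.C=O"))           # False: C=O is not a reactant
```

A product search would follow the same pattern, using the right-hand side of `>>` instead of the left.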
To perform a similarity search, the input query must be a complete radical reaction step including both reactants and products. The user must choose a similarity metric (e.g., Tanimoto). Different similarity metrics use different reaction representations. The result of this search is the list of all the radical reaction steps in the RMechDB database, sorted by their similarity to the query reaction step.
Input examples:
- CC(C)C.[N+](=O)[O-]>>CC(C)(C)[N+](=O)[O-]
- [CH2:20]C1=CC=CC=C1.[N+:10](=O)[O-]>>c1ccc(cc1)C[N+](=O)[O-]
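As a rough illustration of how a Tanimoto ranking works, the sketch below scores reaction strings against a query and sorts them by similarity. The Tanimoto coefficient itself is standard (shared features divided by total distinct features), but the feature representation here is a deliberately crude, hypothetical stand-in: character trigrams of the reaction string. The reaction representations RMechDB actually uses are not described in this section.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity of two feature sets: |A & B| / |A | B|."""
    if not a and not b:
        return 1.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


def char_ngrams(text: str, n: int = 3) -> set:
    """Hypothetical stand-in for a reaction fingerprint: character n-grams."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def rank_by_similarity(query: str, steps: list) -> list:
    """Return (score, step) pairs sorted from most to least similar."""
    q = char_ngrams(query)
    scored = [(tanimoto(q, char_ngrams(s)), s) for s in steps]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Both steps are the input examples listed above.
steps = [
    "[CH2:20]C1=CC=CC=C1.[N+:10](=O)[O-]>>c1ccc(cc1)C[N+](=O)[O-]",
    "CC(C)C.[N+](=O)[O-]>>CC(C)(C)[N+](=O)[O-]",
]
query = "CC(C)C.[N+](=O)[O-]>>CC(C)(C)[N+](=O)[O-]"

for score, step in rank_by_similarity(query, steps):
    print(f"{score:.3f}  {step}")
```

An identical step always scores 1.0 and therefore appears first; every other step is ordered by how many features it shares with the query.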
*** For all the aforementioned search types, the order of the molecules in the reactants or the products does not affect the search results.
*** For all the aforementioned search types, the numbering of the atoms does not affect the search results.
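One way to picture these two invariances is a normalization step that removes atom map numbers and sorts the molecules on each side before comparing queries. The sketch below is an assumption about how such behavior could be obtained, not a description of RMechDB's internals; `normalize_step` is a hypothetical name, and the comparison is purely textual (it does not canonicalize SMILES).

```python
import re

# Matches an atom map number such as ":20" when it sits just before "]".
ATOM_MAP = re.compile(r":\d+(?=\])")


def normalize_step(step: str) -> tuple:
    """Strip atom map numbers and sort the molecules on each side of '>>'."""
    unmapped = ATOM_MAP.sub("", step)
    reactants, arrow, products = unmapped.partition(">>")
    if not arrow:
        raise ValueError("a reaction step must contain '>>'")
    return (
        tuple(sorted(reactants.split("."))),
        tuple(sorted(products.split("."))),
    )


# First query is an input example from this document; the second is the same
# step with the reactants swapped and different (arbitrary) atom numbers.
a = "[CH2:20]C1=CC=CC=C1.[N+:10](=O)[O-]>>c1ccc(cc1)C[N+](=O)[O-]"
b = "[N+:1](=O)[O-].[CH2:2]C1=CC=CC=C1>>c1ccc(cc1)C[N+](=O)[O-]"

print(normalize_step(a) == normalize_step(b))  # True
```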
Search by compound
The search by compound interface offers three different search types using three entities:
- Molecule
- Reactive atom
- Substructure
To perform a molecule search, the input query must be a valid molecule SMILES string. The result of this search is a list of all the radical reaction steps containing the input molecule on either side of the reaction. Please note that the numbering of the atoms within the input SMILES string does not affect the search results.
Input examples:
- CC(C)(C)CO[N+](=O)[O-]
- CC(C)(C)[N+](=O)[O-]
To perform a reactive atom search, the search query must be a valid SMILES string of a molecule with the reactive atom labeled with an integer. The integer used for labeling the atom must be between 1 and 10, and exactly one atom within the molecule must be labeled.
Input examples:
- C[C:9](C)C
- CC(C)(C)[CH2:7][O]
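The labeling rule above can be checked before submitting a query. The following sketch is a hypothetical client-side check, not part of RMechDB: it only inspects the labels with a regular expression, does not verify that the SMILES string itself is valid, and assumes that the range 1 to 10 is inclusive.

```python
import re

# Captures the label of a bracket atom such as "[C:9]" or "[CH2:7]".
LABELED_ATOM = re.compile(r"\[[^\[\]]*:(\d+)\]")


def reactive_atom_label(smiles: str) -> int:
    """Return the single atom label, or raise ValueError if the rule is broken."""
    labels = [int(m) for m in LABELED_ATOM.findall(smiles)]
    if len(labels) != 1:
        raise ValueError(f"expected exactly one labeled atom, found {len(labels)}")
    if not 1 <= labels[0] <= 10:  # assumption: bounds are inclusive
        raise ValueError(f"label {labels[0]} is outside the range 1 to 10")
    return labels[0]


print(reactive_atom_label("C[C:9](C)C"))          # 9
print(reactive_atom_label("CC(C)(C)[CH2:7][O]"))  # 7
```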
Please note that, according to the RMechDB model of a radical elementary step, there are only two reactive atoms in a radical elementary step. The result of the reactive atom search is a list of all the radical reaction steps in which one of the two reactive atoms matches the input reactive atom. To learn more about this model, you can read the Standard Elementary Step Model section of the RMechDB paper.
To perform a substructure search, the input query must be a valid SMARTS string of a chemical substructure. The result of this search is a list of all the radical reaction steps with at least one molecule on either side of the reaction that contains the input substructure.
Input examples:
- [CX4]
- C1CCCCC1
To download the RMechDB data set, the user must enter their information and email address. Upon reading and agreeing to the terms of the CC-BY-NC-ND license, the user will receive an email containing the RMechDB data set. The RMechDB data set is a directory with five comma-separated value (csv) files:
- all.csv: All the radical elementary steps in the RMechDB data set.
- train_core.csv: All the core radical elementary steps chosen for training machine learning models.
- train_specific.csv: All the specific radical elementary steps chosen for training machine learning models.
- test_core.csv: All the core radical elementary steps chosen for testing machine learning models.
- test_specific.csv: All the specific radical elementary steps chosen for testing machine learning models.
Each of the csv files has five columns:
- The SMIRKS of elementary steps.
- The arrow codes of elementary steps.
- The initial condition of elementary steps.
- The category of elementary steps based on the three class classification explained in the Composition of the RMechDB Data Set section of the RMechDB paper.
- The category of the elementary step based on the seven class classification explained in the Composition of the RMechDB Data Set section of the RMechDB paper.
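As a minimal sketch of loading one of these files with Python's standard `csv` module, the example below maps each row onto the five columns in the order listed above. The field names (`smirks`, `arrow_codes`, and so on) are hypothetical labels chosen for this example, and whether the files start with a header row is not stated here, so `has_header` is left as a parameter. The demo row reuses a reaction from this document; the bracketed values are placeholders, not real RMechDB data.

```python
import csv
import io
from typing import NamedTuple


class ElementaryStep(NamedTuple):
    # Hypothetical field names; the order follows the column list above.
    smirks: str
    arrow_codes: str
    condition: str
    three_class_category: str
    seven_class_category: str


def read_steps(handle, has_header: bool = False) -> list:
    """Read five-column rows from an open text handle into ElementaryStep records."""
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # skip the header row if the file has one
    steps = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        if len(row) != 5:
            raise ValueError(f"expected 5 columns, got {len(row)}: {row!r}")
        steps.append(ElementaryStep(*row))
    return steps


# Placeholder row for demonstration only.
demo = io.StringIO(
    "CC(C)C.[N+](=O)[O-]>>CC(C)(C)[N+](=O)[O-],"
    "<arrow codes>,<initial condition>,<3-class label>,<7-class label>\n"
)
steps = read_steps(demo)
print(steps[0].smirks)
```

For a real file, the same function could be called as `read_steps(open("all.csv", newline=""))`, adjusting `has_header` after inspecting the first line.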
Upload multiple reaction steps
To upload multiple reaction steps, create a comma-separated value (.csv) file containing all the steps. Each row of the file must represent one reaction step with four columns:
(1) Reaction SMIRKS
(2) Arrow codes
(3) Original source
(4) Auxiliary information (e.g. initial energy, special condition such as low pressure)
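A file in this four-column layout could be produced with Python's standard `csv` module, as in the hedged sketch below. The function name is hypothetical, the bracketed values are placeholders rather than real arrow codes or sources, and whether RMechDB expects a header row is not specified here, so none is written. Using `csv.writer` means that a field containing a comma, which is likely in the auxiliary information column, is quoted automatically.

```python
import csv
import io


def write_upload_rows(handle, rows) -> None:
    """Write (smirks, arrow_codes, source, auxiliary) rows as CSV."""
    writer = csv.writer(handle)
    for row in rows:
        if len(row) != 4:
            raise ValueError(f"expected 4 columns, got {len(row)}: {row!r}")
        writer.writerow(row)


rows = [
    (
        "CC(C)C.[N+](=O)[O-]>>CC(C)(C)[N+](=O)[O-]",  # reaction from this document
        "<arrow codes>",                              # placeholder
        "<original source>",                          # placeholder
        "<initial energy>, low pressure",             # comma is quoted on output
    ),
]

buffer = io.StringIO()
write_upload_rows(buffer, rows)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with `open("steps.csv", "w", newline="")`, where `steps.csv` is an arbitrary file name.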
Here is a sample of the file to be uploaded to RMechDB:
If you could not upload the file you prepared, or the reaction data you collected does not fit the format mentioned above, you can send the file via email to:
- Amin Tavakoli, Email: mohamadt [at] uci [dot] edu.
- Pierre Baldi, Email: pfbaldi [at] uci [dot] edu.
The subject of the email must read Upload Reactions to RMechDB.
It would be helpful to include in your email the reason(s) why you could not upload the file.
After receiving your email, we perform automatic and manual examination of the reactions. You will then receive an email stating whether or not the reactions were inserted into the RMechDB database.
You can contact us with your questions.
Read this documentation on how to use Ketcher to draw molecules and reactions.