OpenAD Commands

This is the full list of available commands.

IMPORTANT

When running commands from Jupyter, prepend them with %openad
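
As a minimal sketch of the rule above, the hypothetical helper below (not part of OpenAD) turns a plain CLI command into the line you would type in a notebook cell. It only builds strings and does not run anything.

```python
def to_notebook_line(command: str) -> str:
    """Prefix a plain OpenAD CLI command with the %openad magic.

    Hypothetical helper for illustration only; not part of OpenAD.
    """
    command = command.strip()
    # Avoid doubling the prefix when it is already present.
    if command.startswith("%openad"):
        return command
    return f"%openad {command}"


if __name__ == "__main__":
    for cmd in ["get status", "list workspaces"]:
        print(to_notebook_line(cmd))
```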



Main Commands

Macromolecules

show mmol|protein <fasta> | '<pdb_id>'

Launch the molecule viewer to visualize your macromolecule and inspect its properties.

Examples:

  • Show a protein by its PDBe ID:
    show mmol '2g64'

  • Show a protein by its FASTA string:
    show protein MAKWVCKICGYIYDEDAGDPDNGISPGTKFEELPDDWVCPICGAPKSEFEKLED

General

openad

Display the openad splash screen.

get status

Display the currently selected workspace and toolkit.

display history

Display the last 30 commands run in your current workspace.

clear sessions

Clear any other sessions that may be running.

Workspaces

set workspace <workspace_name>

Change the current workspace.

get workspace [ <workspace_name> ]

Display details of a workspace. When no workspace name is passed, details of your current workspace are displayed.

create workspace <workspace_name> [ description('<description>') on path '<path>' ]

Create a new workspace with an optional description and path.

remove workspace <workspace_name>

Remove a workspace from your registry. Note that this doesn’t remove the workspace’s directory.

list workspaces

Lists all your workspaces.

Toolkits

ds4sd

Display the splash screen for the DS4SD toolkit.

rxn

Display the splash screen for the RXN toolkit.

list toolkits

List all installed toolkits.

list all toolkits

List all available toolkits.

add toolkit <toolkit_name>

Install a toolkit.

remove toolkit <toolkit_name>

Remove a toolkit from the registry.

Note: This doesn’t delete the toolkit code. If the toolkit is added again, a backup of the previous install is created in the toolkit directory at ~/.openad/toolkits.

update toolkit <toolkit_name>

Update a toolkit with the latest version. It is recommended to do this on a regular basis.

update all toolkits

Update all installed toolkits with the latest version. Happens automatically whenever OpenAD is updated to a new version.

set context <toolkit_name> [ reset ]

Set your context to the chosen toolkit. By setting the context, the selected toolkit functions become available to you. The optional parameter reset can be used to reset your login information.

get context

Display the currently selected toolkit.

unset context

Exit your toolkit context. You will no longer have access to toolkit-specific functions.

Runs

create run

Start recording a run.

remove run <run_name>

Remove a run.

save run as <run_name>

Stop recording a run and save it.

run <run_name>

Execute a previously recorded run. This will execute every command and continue regardless of any failures.

list runs

List all runs saved in the current workspace.

display run <run_name>

Display the commands stored in a certain run.

Utility

display data '<filename.csv>'

Display data from a csv file.

result save [as '<filename.csv>']

Save table data to csv file.

result open

Explore table data in the browser.
If you append -d to the end of the command (result open -d), the result opens in the data viewer.

result edit

Edit table data in the browser.
If you append -d to the end of the command, the result opens in the data viewer.

result copy

Copy table data to the clipboard, formatted for a spreadsheet.

result display

Display the result in the CLI.

If you append -d to the end of the command, the result is displayed in the data viewer.

result as dataframe

Return the result as a DataFrame (Jupyter Notebook only).

edit config '<json_config_file>' [ schema '<schema_file>']

Edit any JSON file in your workspace directly from the CLI. If a schema is specified, it will be used for validation and documentation.

GUI

install gui

Install the OpenAD GUI (graphical user interface).

The graphical user interface allows you to browse your workspace and visualize your datasets and molecules.

launch gui

Launch the OpenAD GUI (graphical user interface).

restart gui

Terminate and then restart the GUI server.

quit gui

Terminate the GUI server.

LLM

tell me <how to do xyz>

Ask your AI assistant how to do anything in OpenAD.

set llm <language_model_name>

Set the target language model name for the tell me command.

clear llm auth

Clear the language model’s authentication file.

File System

list files [ path ]

List all directories and files in your current workspace.

import from '<external_source_file>' to '<workspace_file>'

Import a file from outside OpenAD into your current workspace.

export from '<workspace_file>' to '<external_file>'

Export a file from your current workspace to anywhere on your hard drive.

copy file '<workspace_file>' to '<other_workspace_name>'

Copy a file from your current workspace to another workspace.

remove '<filename>'

Remove a file from your current workspace.

open '<filename>'

Open a file or DataFrame in an iframe.

Examples:

  • open 'base_molecules.sdf'
  • open my_dataframe

Help

intro

Display an introduction to the OpenAD CLI.

docs

Open the documentation webpage.

?

List all available commands.

? ...

List all commands containing "...".

... ?

List all commands starting with "...".

Model

model auth list

Show the authentication group mapping.

model auth add group '<auth_group>'|<auth_group> with '<api_key>'

Add an authentication group for model services to use.

model auth remove group '<auth_group>' | <auth_group>

Remove an authentication group.

model auth add service '<service_name>'|<service_name> to group '<auth_group>'|<auth_group>

Attach an authentication group to a model service.

model auth remove service '<service_name>'|<service_name>

Detach an authentication group from a model service.

model service status

Get the status of currently cataloged services.

model service describe '<service_name>'|<service_name>

Get the configuration of a service.

model catalog list

Get the list of currently cataloged services.

uncatalog model service '<service_name>'|<service_name>

Uncatalog a model service.

Example:
uncatalog model service 'gen'

catalog model service from (remote) '<path> or <github> or <service_url>' as '<service_name>'|<service_name> USING (<parameter>=<value> <parameter>=<value>)

Catalog a model service from a path, from GitHub, or remotely from an existing OpenAD service.
(USING) Optional header parameters for communication with the service backend.
If you are cataloging a service using a model defined in a directory, provide the absolute path of that directory in quotes.

The following options require the remote option to be declared.

If you are cataloging a service using a model defined in a GitHub repository, provide the address of that GitHub repository in quotes.

If you are cataloging a remote service on an IP address and port, provide the remote service's IP address and port in a quoted string, e.g. '0.0.0.0:8080'.

service_name: The name of the service as you will define it for your usage, e.g. prop, short for properties.

USING Parameters:

If using a hosted service, the following parameters must be supplied:
-Inference-Service: The name of the hosted inference service. It is a required parameter if cataloging a remote service.
An authorization parameter is always required if cataloging a hosted service, either an authorization group (auth_group) or an authorization bearer_token/api_key (Authorization):
-auth_group: The name of an authorization group which contains the api_key linked to the service access. This can only be used if Authorization is not also defined.
OR
-Authorization: This parameter is designed to be used when an auth_group is not defined.
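
The rules above can be sketched as a small string builder. This is an illustration only: `hosted_using_clause` is a hypothetical helper, not part of OpenAD, and the unquoted `auth_group=<name>` formatting is an assumption. The sketch lets you omit both authorization parameters because an authentication group can also be attached afterwards with model auth add service.

```python
from typing import Optional


def hosted_using_clause(
    inference_service: str,
    auth_group: Optional[str] = None,
    authorization: Optional[str] = None,
) -> str:
    """Format a USING clause for cataloging a hosted service.

    Hypothetical helper: it only builds a string following the rules
    described in this section and does not talk to any service.
    """
    if not inference_service:
        raise ValueError("Inference-Service is required for a hosted service")
    if auth_group and authorization:
        # The two authorization parameters are mutually exclusive.
        raise ValueError("use either auth_group or Authorization, not both")

    parts = [f"Inference-Service={inference_service}"]
    if auth_group:
        # Assumption: the group name is written unquoted.
        parts.append(f"auth_group={auth_group}")
    if authorization:
        parts.append(f"Authorization='{authorization}'")
    return "USING (" + " ".join(parts) + ")"


if __name__ == "__main__":
    print(hosted_using_clause("generation", authorization="<api_key>"))
    print(hosted_using_clause("molformer", auth_group="default"))
```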

Example:

Skypilot Deployment
-catalog model service from 'git@github.com:acceleratedscience/generation_inference_service.git' as 'gen'

Service using an authentication group
-catalog model service from remote '<service_url>' as molf USING (Inference-Service=molformer )
-model auth add service 'molf' to group 'default'

Single Authorisation Service
-openad catalog model service from remote '<service_URL>' as 'gen' USING (Inference-Service=generation Authorization='<api_key>')

Catalog a remote service shared with you:
-catalog model service from remote 'http://54.235.3.243:30001' as gen

model service up '<service_name>'|<service_name> [no_gpu]

Launch a cataloged model service that was cataloged as a self-managed service from a directory or GitHub repository.
If you do not want to launch the service with a GPU, specify no_gpu at the end of the command.
Examples:

-model service up gen

-model service up 'gen'

-model service up gen no_gpu

model service local up '<service_name>'|<service_name>

Launches a model service locally.

Example:
model service local up gen

model service down '<service_name>'|<service_name>

Bring down a model service.
Examples:

model service down gen

model service down 'gen'

get model service '<service_name>'|<service_name> result '<result_id>'

Retrieve a result from a model service.
Examples:

get model service myservice result 'wergergerg'

## DS4SD

### Search Molecules

search for similar molecules to '<smiles>' [ save as '<filename.csv>' ]

Search for molecules that are similar to the molecule or molecule substructure provided as <smiles>.

Use the save as clause to save the results as a csv file in your current workspace.

Example:
search for similar molecules to 'C1(C(=C)C([O-])C1C)=O'

search for molecules in patents from list ['<patent1>', '<patent2>', ...] | dataframe <dataframe_name> | file '<filename.csv>' [ save as '<filename.csv>' ]

Search for molecules mentioned in a defined list of patents. When sourcing patents from a CSV or DataFrame, there must be a column named “PATENT ID” or “patent id”.
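
When the patent list comes from a CSV file, the only stated requirement is the column name. The sketch below uses Python's standard csv module to write such a file. The function name and file name are hypothetical, and the file is assumed to end up in your current workspace before you reference it in the command.

```python
import csv
from pathlib import Path


def write_patent_csv(patent_ids, path):
    """Write patent IDs to a CSV file with a single 'PATENT ID' column.

    Hypothetical helper for illustration; not part of OpenAD.
    """
    path = Path(path)
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["PATENT ID"])  # required column name
        for patent_id in patent_ids:
            writer.writerow([patent_id])
    return path


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        out = write_patent_csv(
            ["CN108473493B", "US20190023713A1"], Path(tmp) / "my_patents.csv"
        )
        print(out.read_text(encoding="utf-8"))
```

With a file like this in your workspace, the command would take the form search for molecules in patents from file 'my_patents.csv'.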

Use the save as clause to save the results as a csv file in your current workspace.

Example:
search for molecules in patents from list ['CN108473493B','US20190023713A1']

search for patents containing molecule '<smiles>' | '<inchi>' | '<inchikey>' [ save as '<filename.csv>' ]

Search for mentions of a specified molecule in registered patents. The queried molecule can be described as a SMILES string, InChI or InChIKey.

Use the save as clause to save the results as a csv file in your current workspace.

Example:
search for patents containing molecule 'CC(C)(c1ccccn1)C(CC(=O)O)Nc1nc(-c2c[nH]c3ncc(Cl)cc23)c(C#N)cc1F'

search for substructure instances of '<smiles>' [ save as '<filename.csv>' ]

Search for molecules by substructure, as defined by <smiles>.

Use the save as clause to save the results as a csv file in your current workspace.

Example:
search for substructure instances of 'C1(C(=C)C([O-])C1C)=O' save as 'my_mol'

### Search Collections

search collection '<collection_name_or_key>' for '<search_string>' [ using (page_size=<int> system_id=<system_id> edit_distance=<integer> display_first=<integer>) ] show (data | docs) [ estimate only | return as data | save as '<filename.csv>' ]

Perform a document search of the Deep Search repository based on a given collection. The optional using clause refines the search with the parameters listed below. Use estimate only to return only the potential number of hits.

Parameters:

  • <collection_name_or_key> The name or index key for a collection. Use the command display all collections to list available collections.
  • <search_string> The search string for the search.

The <search_string> supports Elasticsearch string query syntax:

  • + Signifies AND operation.
  • | Signifies OR operation.
  • - Negates a single token.
  • \" Wraps a number of tokens to signify a phrase for searching.
  • * At the end of a term -> signifies a prefix query
  • ( & ) Signifies precedence
  • ~N After a word -> signifies edit distance (fuzziness)
  • ~N After a phrase -> signifies slop amount
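
The operators above can be composed programmatically. The sketch below is a set of hypothetical string helpers (not part of OpenAD or Deep Search) that emit each operator as listed. The spacing around + and | is an assumption, and phrases are wrapped in \" as in the list above.

```python
from typing import Optional


def phrase(text: str, slop: Optional[int] = None) -> str:
    """Wrap tokens in escaped quotes; optionally add a slop amount (~N)."""
    quoted = '\\"' + text + '\\"'
    return quoted if slop is None else f"{quoted}~{slop}"


def prefix(term: str) -> str:
    """Prefix query: a trailing * on the term."""
    return f"{term}*"


def fuzzy(word: str, distance: int) -> str:
    """Edit distance (fuzziness) on a single word."""
    return f"{word}~{distance}"


def negate(token: str) -> str:
    """Negate a single token."""
    return f"-{token}"


def any_of(*parts: str) -> str:
    """OR the parts together, grouped with parentheses for precedence."""
    return "(" + " | ".join(parts) + ")"


def all_of(*parts: str) -> str:
    """AND the parts together."""
    return " + ".join(parts)


if __name__ == "__main__":
    query = all_of(
        any_of(phrase("power conversion efficiency"), "PCE"),
        prefix("organ"),
    )
    print(f"search collection 'arxiv-abstract' for '{query}' show (docs)")
```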

Options for the using clause:

Note: The using clause requires all enclosed parameters to be defined in the same order as listed below.

  • page_size=<integer> Result pagination, the default is None.
  • system_id=<system_id> System cluster id, the default is ‘default’.
  • edit_distance=<integer> (0-5) Sets the search word span criteria for key words for document searches, the default is 5. When set to 0, no snippets will be returned.
  • display_first=<integer> When set, the displayed result set will be truncated at the given number.
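
Because the using clause expects its parameters in the listed order, it can help to assemble it from keyword arguments. The sketch below uses a hypothetical helper (not part of OpenAD) that emits only the parameters you pass, always in the order shown above.

```python
# Parameter order as listed in this section.
USING_ORDER = ("page_size", "system_id", "edit_distance", "display_first")


def ordered_using_clause(**params) -> str:
    """Build a using clause with parameters in the documented order.

    Hypothetical helper for illustration; returns an empty string when
    no parameters are given, since the clause is optional.
    """
    unknown = set(params) - set(USING_ORDER)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    parts = [f"{name}={params[name]}" for name in USING_ORDER if name in params]
    if not parts:
        return ""
    return "using (" + " ".join(parts) + ")"


if __name__ == "__main__":
    # Keyword order at the call site does not matter.
    clause = ordered_using_clause(display_first=10, system_id="default")
    print(f"search collection 'pubchem' for 'Ibuprofen' {clause} show (data)")
```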

Clauses:

  • show (data | docs):
    • data Display structured data from within the documents.
    • docs Display document context and preview snippet.
      Both can be combined in a single command, e.g. show (data docs)
  • estimate only Determine the potential number of hits.
  • return as data For Notebook or API mode. Removes all styling from the Pandas DataFrame, ready for further processing.

Examples:

  • Look for documents that contain discussions on power conversion efficiency:
    search collection 'arxiv-abstract' for 'ide(\"power conversion efficiency\" OR PCE) AND organ*' using ( edit_distance=20 system_id=default) show (docs)

  • Search the PubChem archive for ‘Ibuprofen’ and display related molecules’ data:
    search collection 'pubchem' for 'Ibuprofen' show (data)

  • Search for patents which mention a specific smiles molecule:
    search collection 'patent-uspto' for '\"smiles#ccc(coc(=o)cs)(c(=o)c(=o)cs)c(=o)c(=o)cs\"' show (data)

display collection matches for '<search_string>' [ save as '<filename.csv>' ]

Search all collections for documents that contain a given Deep Search <search_string>. This is useful when narrowing down document collection(s) for subsequent search. You can use the <index_key> from the returned table in your next search.

Use the save as clause to save the results as a csv file in your current workspace.

Example:
display collection matches for 'Ibuprofen'

### Collections

display collections in domains from list <list_of_domains> [ save as '<filename.csv>' ]

Display collections that belong to the listed domains.

Use the save as clause to save the results as a csv file in your current workspace.

Use the command display all collections to find available domains.

Example:
display collections in domains from list ['Scientific Literature']

display all collections [ save as '<filename.csv>' ]

Display all available collections in Deep Search.

Use the save as clause to save the results as a csv file in your current workspace.

display collections for domain '<domain_name>'

Display the available collections in a given Deep Search domain.

Use the command display all collections to find available domains.

Example:
display collections for domain 'Business Insights'

display collection details '<collection_name_or_key>'

Display the details for a specified collection. You can specify a collection by its name or key.

Use the command display all collections to list available collections.

Example:
display collection details 'Patents from USPTO'



## RXN

### General

interpret recipe '<recipe_paragraph>' | '<txt_filename>'

Build an ordered list of actions interpreted from a provided text-based recipe. The recipe can be provided as a string or as a text file from your current workspace.

Examples:

  • interpret recipe 'my_recipe.txt'
  • interpret recipe 'A solution of ((1S,2S)-1-{[(methoxymethyl-biphenyl-4-yl)-(2-pyridin-2-yl-cyclopropanecarbonyl)-amino]-methyl}-2-methyl-butyl)-carbamic acid tert-butyl ester (25 mg, 0.045 mmol) and dichloromethane (4 mL) was treated with a solution of HCl in dioxane (4 N, 0.5 mL) and the resulting reaction mixture was maintained at room temperature for 12 h. The reaction was then concentrated to dryness to afford (1R,2R)-2-pyridin-2-yl-cyclopropanecarboxylic acid ((2S,3S)-2-amino-3-methylpentyl)-(methoxymethyl-biphenyl-4-yl)-amide (18 mg, 95% yield) as a white solid.'

list rxn models

Lists all RXN AI models currently available.

### Retrosynthesis

predict retrosynthesis '<smiles>' [ using (<parameter>=<value> <parameter>=<value>) ]

Perform a retrosynthesis route prediction on a molecule.

RXN was trained on more than 3 million chemical reactions, derived from publicly available patents. Since then, the Molecular Transformer has outperformed all data-driven models, achieving more than 90% accuracy on forward chemical reaction predictions (reactants + reagents to products).

Note: The using clause requires all enclosed parameters to be defined in the same order as listed below.

Optional Parameters that can be specified in the using clause:

  • availability_pricing_threshold=<int> Maximum price in USD per g/ml of compounds. Default: no threshold.
  • available_smiles='<smiles>.<smiles>.<smiles>' List of molecules available as precursors, delimited with a period.
  • exclude_smiles='<smiles>.<smiles>.<smiles>' List of molecules to exclude from the set of precursors, delimited with a period.
  • exclude_substructures='<smiles>.<smiles>.<smiles>' List of substructures to exclude from the set of precursors, delimited with a period.
  • exclude_target_molecule=<boolean> Exclude the target molecule. The default is True.
  • fap=<float> Every retrosynthetic step is evaluated with the FAP, and is only retained when forward confidence is greater than the FAP value. The default is 0.6.
  • max_steps=<int> The maximum number of steps in the results. The default is 3.
  • nbeams=<int> The maximum number of beams exploring the hypertree. The default is 10.
  • pruning_steps=<int> The number of steps to prune a hypertree. The default is 2.
  • ai_model='<model_name>' What model to use. Use the command list rxn models to list all available models. The default is ‘2020-07-01’.
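
Several of these parameters take period-delimited SMILES lists, and the clause expects the listed order. The sketch below is a hypothetical command builder (not part of OpenAD or RXN) that joins SMILES lists with periods, quotes string values, and orders the parameters as listed above. Rendering booleans as True or False is an assumption.

```python
# Parameter order as listed in this section.
RETRO_ORDER = (
    "availability_pricing_threshold",
    "available_smiles",
    "exclude_smiles",
    "exclude_substructures",
    "exclude_target_molecule",
    "fap",
    "max_steps",
    "nbeams",
    "pruning_steps",
    "ai_model",
)
SMILES_LIST_PARAMS = {"available_smiles", "exclude_smiles", "exclude_substructures"}


def _format_value(name, value):
    """Quote string-like values; join SMILES lists with periods."""
    if name in SMILES_LIST_PARAMS:
        if not isinstance(value, str):
            value = ".".join(value)
        return f"'{value}'"
    if name == "ai_model":
        return f"'{value}'"
    return str(value)


def retrosynthesis_command(smiles: str, **params) -> str:
    """Build a predict retrosynthesis command string (illustrative only)."""
    unknown = set(params) - set(RETRO_ORDER)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    parts = [
        f"{name}={_format_value(name, params[name])}"
        for name in RETRO_ORDER
        if name in params
    ]
    command = f"predict retrosynthesis '{smiles}'"
    if parts:
        command += " using (" + " ".join(parts) + ")"
    return command


if __name__ == "__main__":
    print(
        retrosynthesis_command(
            "BrCCc1cccc2c(Br)c3ccccc3cc12",
            max_steps=6,
            exclude_smiles=["CCO", "CC(=O)O"],
        )
    )
```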

There are different models available for use with this command including: ‘12class-tokens-2021-05-14’, ‘2019-09-12’, ‘2020-04-24’, ‘2020-07-01’, ‘2020-07-31’, ‘aizynth-2020-06-11’, ‘disconnection-aware-2022-06-24’, ‘enzymatic-2021-04-16’, ‘enzymatic-2022-05-31’, ‘sulfonium-2020-10-27’

Examples:
predict retrosynthesis 'BrCCc1cccc2c(Br)c3ccccc3cc12' using (max_steps=3)

predict retrosynthesis 'BrCCc1cccc2c(Br)c3ccccc3cc12' using (max_steps=6 ai_model='12class-tokens-2021-05-14' )

### Prediction

predict reaction in batch from dataframe <dataframe_name> | file '<filename.csv>' | list ['<smiles>.<smiles>','<smiles>.<smiles>'] [ using (ai_model='<ai_model>') ] [ use_saved ]

Run a batch of reaction predictions. The provided list of reactions can be specified as a DataFrame, a CSV file from your current workspace or a list of strings. When providing a DataFrame or CSV file, we will look for the “reactions” column.

Reactions are defined by combining two SMILES strings delimited by a period. For example: 'BrBr.c1ccc2cc3ccccc3cc2c1'
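
As a minimal sketch of this format, the hypothetical helpers below (not part of OpenAD or RXN) join two SMILES strings into a reaction string and assemble the list form of the batch command.

```python
def reaction(first: str, second: str) -> str:
    """Combine two SMILES strings into one reaction string."""
    if not first or not second:
        raise ValueError("both SMILES strings are required")
    return f"{first}.{second}"


def batch_command(reactions, use_saved: bool = False) -> str:
    """Build a 'predict reaction in batch from list' command string."""
    if not reactions:
        raise ValueError("at least one reaction is required")
    items = ", ".join(f"'{r}'" for r in reactions)
    command = f"predict reaction in batch from list [{items}]"
    if use_saved:
        command += " use_saved"
    return command


if __name__ == "__main__":
    reactions = [
        reaction("BrBr", "c1ccc2cc3ccccc3cc2c1CCO"),
        reaction("BrBr", "c1ccc2cc3ccccc3cc2c1"),
    ]
    print(batch_command(reactions))
```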

Optional Parameters that can be specified in the using clause:

  • ai_model='<model_name>' What model to use. Use the command list rxn models to list all available models. The default is ‘2020-07-01’.

You can reuse previously generated results by appending the optional use_saved clause. This will reuse the results of a previously run command with the same parameters, if available.

Examples:

  • predict reaction in batch from list ['BrBr.c1ccc2cc3ccccc3cc2c1CCO' , 'BrBr.c1ccc2cc3ccccc3cc2c1']
  • predict reaction in batch from list ['BrBr.c1ccc2cc3ccccc3cc2c1CCO' , 'BrBr.c1ccc2cc3ccccc3cc2c1'] use_saved

predict reaction '<smiles>.<smiles>' [ using (ai_model='<ai_model>') ] [ use_saved ]

Predict the reaction between two molecules.

Reactions are defined by combining two SMILES strings delimited by a period. For example: 'BrBr.c1ccc2cc3ccccc3cc2c1'

Optional Parameters that can be specified in the using clause:

  • ai_model='<model_name>' What model to use. Use the command list rxn models to list all available models. The default is ‘2020-07-01’.

You can reuse previously generated results by appending the optional use_saved clause. This will reuse the results of a previously run command with the same parameters, if available.

Examples:

  • predict reaction 'BrBr.c1ccc2cc3ccccc3cc2c1CCO'
  • predict reaction 'BrBr.c1ccc2cc3ccccc3cc2c1CCO' use_saved

predict reaction topn in batch from dataframe <dataframe_name> | file '<filename.csv>' | list ['<smiles>.<smiles>','<smiles>.<smiles>'] [ using (topn=<integer> ai_model='<ai_model>') ] [ use_saved ]

Run a batch of reaction predictions for topn. The provided list of reactions can be specified as a DataFrame, a CSV file from your current workspace or a list of strings. When providing a DataFrame or CSV file, we will look for the “reactions” column.

Reactions are defined by combining two SMILES strings delimited by a period. For example: 'BrBr.c1ccc2cc3ccccc3cc2c1'

Optional Parameters that can be specified in the using clause:

  • ai_model='<model_name>' What model to use. Use the command list rxn models to list all available models. The default is ‘2020-07-01’.
  • topn=<integer> Defines the number of results being returned. The default value is 3.

You can reuse previously generated results by appending the optional use_saved clause. This will reuse the results of a previously run command with the same parameters, if available.

Examples:

  • predict reaction topn in batch from list ['BrBr.c1ccc2cc3ccccc3cc2c1CCO' , 'BrBr.c1ccc2cc3ccccc3cc2c1']
  • predict reaction topn in batch from list ['BrBr.c1ccc2cc3ccccc3cc2c1CCO' , 'BrBr.c1ccc2cc3ccccc3cc2c1'] using (topn=6)
  • predict reaction topn in batch from list ['BrBr.c1ccc2cc3ccccc3cc2c1CCO' , 'BrBr.c1ccc2cc3ccccc3cc2c1'] use_saved