Importing crystal structure predictions into a database

The script utilities/csp/import_csp_landscape/import_csp_landscape.py can be used to import crystal structure prediction (CSP) CIF files into a CSP database.

Crystal structure predictions are imported from CIF files. These CIF files should be annotated with CSP metadata, such as _ccdc_csp_classification_energy_lattice_relative. Each CIF file is expected to contain exactly one prediction.
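Before running an import it can be useful to check that each CIF file looks as described above. The following is a minimal sketch, not part of the import script: it uses plain text scanning rather than a CIF parser, it assumes that "one prediction per file" corresponds to one data_ block per file, and the function names are hypothetical.

```python
from pathlib import Path

# Example tag taken from the text above; real files may carry other CSP tags too.
CSP_TAG = "_ccdc_csp_classification_energy_lattice_relative"


def count_data_blocks(cif_text: str) -> int:
    """Count lines that open a CIF data block (lines starting with 'data_')."""
    return sum(
        1 for line in cif_text.splitlines()
        if line.strip().lower().startswith("data_")
    )


def has_tag(cif_text: str, tag: str = CSP_TAG) -> bool:
    """Return True if any line begins with the given CIF data name."""
    for line in cif_text.splitlines():
        parts = line.split()
        if parts and parts[0].lower() == tag.lower():
            return True
    return False


def check_cif(path: Path) -> list[str]:
    """Return a list of problems found in one CIF file (empty if none)."""
    text = path.read_text(encoding="utf-8", errors="replace")
    problems = []
    if count_data_blocks(text) != 1:
        problems.append("expected exactly one data block")
    if not has_tag(text):
        problems.append(f"missing {CSP_TAG}")
    return problems
```

Because this is a simple line scan, it can be fooled by text fields that happen to contain lines starting with data_; a real CIF parser would be more robust.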

Predictions for a molecule are collected together in a “landscape” of predictions. The landscape name is an extra identifier that can be supplied when predictions are imported into the database. If the landscape name is not supplied, it is generated automatically from the name of the directory containing the CSP CIF files to be imported.
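The fallback to a directory name can be illustrated as follows. This is only a sketch of the idea: the script's actual rule is not shown here, so using the parent directory of the first CIF file is an assumption, and both function names are hypothetical.

```python
from pathlib import Path
from typing import Optional, Sequence


def default_landscape_name(cif_files: Sequence[str]) -> str:
    """Derive a name from the directory that holds the first CIF file.

    Assumption: all CIF files for one landscape live in the same directory.
    """
    if not cif_files:
        raise ValueError("no CIF files given")
    return Path(cif_files[0]).resolve().parent.name


def landscape_name(cif_files: Sequence[str], supplied: Optional[str] = None) -> str:
    """Prefer an explicitly supplied name, else fall back to the directory name."""
    return supplied if supplied else default_landscape_name(cif_files)
```

For example, landscape_name(["wonder_drug/pred_1.cif"]) would give "wonder_drug", while an explicitly supplied name always takes precedence.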

import_csp_landscape.py - import CSP CIF files into a CSP landscape database

A CSP landscape database combines two parts (see ccdc.csp.database):

  1. A crystal structure database, .csdsqlx format, identified by filename.

  2. A CSP metadata database, identified by webservice URL.

This script will take data from CIF files and add it to both parts of a CSP landscape database.

optional arguments:

--landscape_name: The name of the CSP landscape.

positional arguments:

landscape_database: The URL of a landscape database server.

structure_database: A crystal structure database (.csdsqlx).

cif_files: CIF files for structures to be imported to the landscape and structure databases.

usage examples:

python import_csp_landscape.py --landscape_name=wonder_drug http://my.landscape.server:12345 wonder_drug.csdsqlx wonder_drug/*.cif

This will import all CIF files in a folder called “wonder_drug” into a CSP metadata database and crystal structure database. The landscape name will be “wonder_drug”.
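The command-line interface described in the help text above could be expressed with argparse roughly as follows. This is an illustrative sketch based only on the documented arguments, not the script's actual source; build_parser is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented arguments."""
    parser = argparse.ArgumentParser(
        prog="import_csp_landscape.py",
        description="Import CSP CIF files into a CSP landscape database",
    )
    # Optional: when omitted, the name is derived from the CIF directory.
    parser.add_argument(
        "--landscape_name", default=None,
        help="The name of the CSP landscape.",
    )
    parser.add_argument(
        "landscape_database",
        help="The URL of a landscape database server.",
    )
    parser.add_argument(
        "structure_database",
        help="A crystal structure database (.csdsqlx).",
    )
    # One or more CIF files; the shell normally expands wildcards such as *.cif.
    parser.add_argument(
        "cif_files", nargs="+",
        help="CIF files for structures to be imported.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Note that the optional argument is written with two ASCII hyphens (--landscape_name); a typographic dash copied from formatted documentation will not be recognised as an option.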

The crystal structure part of a CSP database, the .csdsqlx file, uses the fast binary 3.0 schema and needs to be converted to schema 1.3 before it can be edited in CSD Editor. For the 3.0 to 1.3 schema conversion, use the ccdc_babel command-line program that comes with the Decifer installer.

The basic syntax is:

ccdc_babel.exe -csdsqlx3 <filepath to 3.0 schema CSP database>my_csp_database.csdsqlx -csdsqlx <filepath to ‘CSD Editor’ compatible CSP database>my_csd_editor_database.csdsqlx

Further instructions on how to convert between csdsqlx and csdsql 3.0 formats can be found in the Decifer docs (section 2.3.6).
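The ccdc_babel conversion command can also be driven from Python. The sketch below only assembles the two options shown in the basic syntax (-csdsqlx3 for the 3.0 schema input, -csdsqlx for the output). The helper names are hypothetical, and the executable name and its presence on the PATH are assumptions that depend on the installation.

```python
import subprocess
from pathlib import Path
from typing import List, Union

PathLike = Union[str, Path]


def babel_command(source_db: PathLike, target_db: PathLike,
                  executable: str = "ccdc_babel.exe") -> List[str]:
    """Assemble the argument list for a 3.0 to 1.3 schema conversion."""
    return [executable, "-csdsqlx3", str(source_db), "-csdsqlx", str(target_db)]


def convert(source_db: PathLike, target_db: PathLike,
            executable: str = "ccdc_babel.exe") -> None:
    """Run the conversion; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(babel_command(source_db, target_db, executable), check=True)
```

Passing the arguments as a list avoids shell quoting problems when file paths contain spaces.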