Search philosophy

The ccdc.search module supports four types of search:

  • text numeric searching

  • substructure searching

  • similarity searching

  • reduced cell searching

The basic philosophy used to set up and run searches is to:

  1. Create a search object

  2. Use the ccdc.search.Search.search() function of the search object to search a specific database
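
The two steps above can be sketched as follows. This is a minimal illustration rather than a definitive recipe: it assumes a working CSD installation, and the helper names find_substructure_hits and hit_identifiers are hypothetical. The ccdc import is placed inside the function so that the rest of the module can be loaded without the CSD.

```python
def find_substructure_hits(smarts, max_hit_structures=5):
    """Create a substructure search object, then run it against the CSD.

    Hypothetical helper; requires a working CSD installation when called.
    """
    # Imported lazily so this module can be loaded without the CSD installed.
    from ccdc.search import SMARTSSubstructure, SubstructureSearch

    # Step 1: create the search object and describe what to look for.
    searcher = SubstructureSearch()
    searcher.add_substructure(SMARTSSubstructure(smarts))

    # Step 2: call search(); with no database argument the CSD is searched.
    return searcher.search(max_hit_structures=max_hit_structures)


def hit_identifiers(hits):
    """Collect the identifier of every hit, preserving order."""
    return [hit.identifier for hit in hits]
```

With the CSD available, hit_identifiers(find_substructure_hits('c1ccccc1')) would list the identifiers of up to five structures containing a benzene ring.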

For substructure and similarity searches the database to be searched can be:

  • the CSD

  • a (multi) molecule file path

  • a ccdc.io reader (this can be one or more databases)

  • a Python list of identifiers

  • an individual molecule

  • an individual crystal
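
As a rough sketch of how one search object might be reused against several of the sources listed above, the hypothetical helper below passes each source to search() in turn. It assumes the source is accepted as the first argument of search() and that omitting it searches the CSD; the file name in the usage comment is a placeholder.

```python
def count_hits_per_source(searcher, sources):
    """Run one search object against several sources.

    `sources` maps a label to a database source (file path, ccdc.io reader,
    list of identifiers, molecule or crystal), or to None for the CSD.
    Returns a dictionary of {label: number of hits}.
    """
    counts = {}
    for label, source in sources.items():
        # None means "use the default database", assumed here to be the CSD.
        hits = searcher.search() if source is None else searcher.search(source)
        counts[label] = len(hits)
    return counts


# Example usage (not run here; needs a configured search object and the CSD):
# sources = {
#     'CSD': None,
#     'file': 'my_molecules.mol2',        # placeholder file path
#     'subset': ['AABHTZ', 'ABEBUF'],     # example CSD identifiers
# }
# print(count_hits_per_source(searcher, sources))
```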

Because text numeric searches are carried out on fields specific to the CSD, they can only be performed on the CSD.

The ccdc.search.Search.search() function returns a list of ccdc.search.Search.SearchHit instances. For some types of search the hits are instances of a specialised hit class.

All hit classes contain an identifier as well as attributes to access ccdc.entry.Entry, ccdc.crystal.Crystal and ccdc.molecule.Molecule.

The ccdc.search.SimilaritySearch.SimilarityHit additionally contains a similarity attribute.
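
The similarity attribute makes it straightforward to rank hits. The sketch below is illustrative only: most_similar and similar_to_csd_entry are hypothetical helper names, and the second function assumes a working CSD installation and default search settings.

```python
def most_similar(hits, n=3):
    """Return (identifier, similarity) pairs for the n most similar hits."""
    ranked = sorted(hits, key=lambda hit: hit.similarity, reverse=True)
    return [(hit.identifier, hit.similarity) for hit in ranked[:n]]


def similar_to_csd_entry(identifier, n=3):
    """Search the CSD for molecules similar to one of its own entries.

    Hypothetical helper; requires a working CSD installation when called.
    """
    # Imported lazily so this module can be loaded without the CSD installed.
    from ccdc.io import MoleculeReader
    from ccdc.search import SimilaritySearch

    # Use an existing CSD molecule as the query.
    query = MoleculeReader('CSD').molecule(identifier)

    # Create the search object, run it on the CSD and rank the hits.
    return most_similar(SimilaritySearch(query).search(), n=n)
```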

The ccdc.search.SubstructureSearch.SubstructureHit has two additional functions:

  • ccdc.search.SubstructureSearch.SubstructureHit.match_atoms()

  • ccdc.search.SubstructureSearch.SubstructureHit.match_components()
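
A short sketch of how these two functions might be used on a single hit is given below. The helper name describe_match is hypothetical, and the sketch assumes that match_atoms() returns atoms carrying a label attribute and that match_components() returns a sequence.

```python
def describe_match(hit):
    """Summarise one substructure hit.

    Returns a tuple of (identifier, labels of the matched atoms,
    number of matched components).
    """
    # Atoms of the hit structure that matched the query substructure.
    labels = [atom.label for atom in hit.match_atoms()]

    # Components of the hit structure that contain matched atoms.
    n_components = len(hit.match_components())

    return hit.identifier, labels, n_components
```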

Furthermore, the ccdc.search.SubstructureSearch.SubstructureHitList has an additional function for superimposing the hits on the first ccdc.search.SubstructureSearch.SubstructureHit in the list.

A ccdc.search.SubstructureSearch.SubstructureHit may also contain the attributes measurements and constraints if any geometric measurements/constraints have been added to the ccdc.search.SubstructureSearch. These are dictionaries keyed by the name of the measurement or constraint defined on the ccdc.search.SubstructureSearch. For more information on measurements and constraints see Substructure searching with geometric measurements and Substructure searching with geometric constraints.
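
As an illustration of reading the measurements dictionary, the hypothetical helper below collects one named measurement from every hit. It assumes the measurement was already defined on the search under that name; the name 'DIST1' in the usage comment is only an example.

```python
def measurement_values(hits, name):
    """Return (identifier, value) pairs for one named measurement.

    A list of pairs is used rather than a dictionary because a single
    structure can give rise to more than one hit.
    """
    return [(hit.identifier, hit.measurements[name]) for hit in hits]


# Example usage (not run here; `hits` would come from a substructure search
# on which a measurement named 'DIST1' had been defined):
# for identifier, value in measurement_values(hits, 'DIST1'):
#     print(identifier, value)
```

The same pattern applies to the constraints attribute, which is keyed by constraint name in the same way.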