Search API
Introduction
The ccdc.search module provides various search classes.
The main classes of the ccdc.search module are:
These all inherit from the base class ccdc.search.Search. The base ccdc.search.Search contains nested classes defining basic search hits and settings:
The base class ccdc.search.Search also contains the ccdc.search.Search.search() function which is used to search the CSD.
All the searches except ccdc.search.TextNumericSearch also support searching of the following additional data sources:
a Python list of identifiers
a molecule file path
a ccdc.io reader
an individual ccdc.molecule.Molecule
an individual ccdc.crystal.Crystal
a list of molecules, crystals or entries
The ccdc.search.TextNumericSearch can only sensibly be applied to a crystal structure database, which is the CSD by default or a ccdc.io.EntryReader opened on a database file.
The ccdc.search.Search.search() returns a list of ccdc.search.Search.SearchHit instances. Some of the searches make use of more specific search hit classes, namely:
Most of the searches return simple Python lists of search hits. However, a search carried out using a ccdc.search.SubstructureSearch returns a ccdc.search.SubstructureSearch.SubstructureHitList, which contains a ccdc.search.SubstructureSearch.SubstructureHitList.superimpose() function for superimposing all the hits on the first instance in the list.
To illustrate some of the searches let us first get an aspirin molecule.
>>> from ccdc.io import EntryReader
>>> csd_reader = EntryReader('CSD')
>>> mol = csd_reader.molecule('ACSALA')
Text numeric searching.
>>> from ccdc.search import TextNumericSearch
>>> text_numeric_search = TextNumericSearch()
>>> text_numeric_search.add_compound_name('aspirin')
>>> hits = text_numeric_search.search()
>>> len(hits)
101
Substructure searching.
>>> from ccdc.search import MoleculeSubstructure, SubstructureSearch
>>> substructure = MoleculeSubstructure(mol)
>>> substructure_search = SubstructureSearch()
>>> _ = substructure_search.add_substructure(substructure)
>>> hits = substructure_search.search()
>>> len(hits)
65
Similarity searching.
>>> from ccdc.search import SimilaritySearch
>>> similarity_search = SimilaritySearch(mol)
>>> hits = similarity_search.search()
>>> len(hits)
111
Reduced cell searching.
>>> from ccdc.search import ReducedCellSearch
>>> crystal = csd_reader.crystal('ACSALA')
>>> query = ReducedCellSearch.CrystalQuery(crystal)
>>> reduced_cell_searcher = ReducedCellSearch(query)
>>> hits = reduced_cell_searcher.search()
>>> len(hits)
16
Combined searches.
>>> from ccdc.search import CombinedSearch
>>> combined_search = CombinedSearch(similarity_search & -text_numeric_search)
>>> hits = combined_search.search()
>>> len(hits)
32
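The expression similarity_search & -text_numeric_search above reads as "hits of the similarity search that are not hits of the text numeric search". The sketch below mimics that reading with plain Python lists of identifiers. It does not call the CSD Python API, the identifiers are made-up placeholders, the helper and_not is hypothetical, and treating & as intersection and unary - as negation is an assumption drawn from the example rather than a description of how CombinedSearch is implemented.

```python
def and_not(hits_a, hits_b):
    """Return the identifiers in hits_a that do not occur in hits_b, keeping order."""
    excluded = set(hits_b)
    return [identifier for identifier in hits_a if identifier not in excluded]


# Hypothetical identifiers standing in for the hits of two separate searches.
similarity_ids = ["REF001", "REF002", "REF003", "REF004"]
text_numeric_ids = ["REF002", "REF004", "REF005"]

combined_ids = and_not(similarity_ids, text_numeric_ids)
print(combined_ids)  # ['REF001', 'REF003']
```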
See also
The descriptive documentation for the general philosophy of searching, substructure searching, similarity searching, text numeric searching, reduced cell searching, and combined searches.
API
Classes for defining substructures
- class ccdc.search.QueryAtom(atomic_symbol='', _substructure_atom=None)[source]¶
Atom used to define a substructure search.
A QueryAtom can be used to represent a single atom type or a set of atom types. A QueryAtom can also have additional constraints imposed on it, for example that it should be aromatic.
Let us create a query atom representing an oxygen atom.
>>> query_atom = QueryAtom('O')
>>> print(query_atom)
QueryAtom(O)
Suppose that we wanted the query atom to be either a carbon or a nitrogen atom.
>>> query_atom = QueryAtom(['C', 'N'])
>>> print(query_atom)
QueryAtom(C, N)
It is possible to add further constraints on a QueryAtom. For example, we can insist that it should be aromatic.
>>> query_atom.aromatic = True
>>> print(query_atom.aromatic)
AtomAromaticConstraint: 1
>>> print(query_atom)
QueryAtom(C, N)[atom aromaticity: equal to 1]
See Query Atoms for further details.
- property acceptor¶
Constraint specifying whether or not the QueryAtom is an acceptor.
>>> a = QueryAtom(['C', 'N'])
>>> a.acceptor = True
>>> print(a)
QueryAtom(C, N)[AtomAcceptorTypeConstraint]
- add_connected_element_count(atomic_symbols, count)[source]¶
Set the number of connected elements constraint.
Constraint to define the number of times the QueryAtom should be connected to atoms with elements defined in the atomic_symbols list.
- Parameters
atomic_symbols – atomic symbol or list of atomic symbols.
count – see Constraint conditions for details.
>>> a = QueryAtom(['C', 'N'])
>>> a.add_connected_element_count(['F', 'Cl'], 2)
>>> print(a)
QueryAtom(C, N)[count connected elements equal to 2 from [F,Cl]]
- add_protein_atom_type_constraint(*types)[source]¶
Add a constraint that an atom be in one of the protein atom types.
This is of use only when searching a protein structure.
- Parameters
*types – one or more of ‘AMINO_ACID’, ‘LIGAND’, ‘COFACTOR’, ‘WATER’, ‘METAL’, ‘NUCLEOTIDE’, ‘UNKNOWN’. Any case-insensitive, unique prefix may be used.
>>> a = QueryAtom('Zn')
>>> a.add_protein_atom_type_constraint('Ligand', 'Metal')
>>> print(a)
QueryAtom(Zn)[protein substructure type : one of 1, 3]
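The rule that any case-insensitive, unique prefix may be used can be pictured with a small stand-alone function. This is only a sketch of the rule as stated above: the function name resolve_protein_atom_type is hypothetical, it does not call the CSD Python API, and the library's own handling of ambiguous or unknown prefixes is not documented here.

```python
# The type names listed in the parameter description above.
PROTEIN_ATOM_TYPES = (
    "AMINO_ACID", "LIGAND", "COFACTOR", "WATER", "METAL", "NUCLEOTIDE", "UNKNOWN",
)


def resolve_protein_atom_type(prefix):
    """Map a case-insensitive prefix to the single type name it identifies."""
    wanted = prefix.upper()
    matches = [name for name in PROTEIN_ATOM_TYPES if name.startswith(wanted)]
    if len(matches) != 1:
        # Raising ValueError here is a choice made for this sketch only.
        raise ValueError("%r does not identify exactly one type: %s" % (prefix, matches))
    return matches[0]


print(resolve_protein_atom_type("Ligand"))  # LIGAND
print(resolve_protein_atom_type("met"))     # METAL
```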
- property aromatic¶
Constraint specifying whether or not the QueryAtom is aromatic.
>>> a = QueryAtom(['C', 'N'])
>>> a.aromatic = True
>>> print(a)
QueryAtom(C, N)[atom aromaticity: equal to 1]
- property chirality¶
Constraint specifying the chirality around an atom.
The return value will either be None or a tuple of 4 QueryAtoms in clockwise order.
>>> s = SMARTSSubstructure("FC(I)O[C@](S)(P)H")
>>> s.atoms[1].chirality is None
True
>>> s.atoms[4].chirality
(QueryAtom(O)[atom aromaticity: equal to 0], QueryAtom(H), QueryAtom(P)[atom aromaticity: equal to 0], QueryAtom(S)[atom aromaticity: equal to 0])
- property cyclic¶
Constraint specifying whether or not the QueryAtom is part of a cycle.
>>> a = QueryAtom(['C', 'N'])
>>> a.cyclic = True
>>> print(a)
QueryAtom(C, N)[atom cyclicity: equal to 1]
- property cyclic_bonds¶
Constraint specifying the number of cyclic bonds of the QueryAtom.
>>> a = QueryAtom(['C', 'N'])
>>> a.cyclic_bonds = ('!=', 4)
>>> print(a)
QueryAtom(C, N)[number of cyclic bonds:not equal to 4]
- property donor¶
Constraint specifying whether or not the QueryAtom is a donor.
>>> a = QueryAtom(['C', 'N'])
>>> a.donor = True
>>> print(a)
QueryAtom(C, N)[AtomDonorTypeConstraint]
- property formal_charge¶
Constraint specifying the formal charge on the QueryAtom.
>>> a = QueryAtom(['C', 'N'])
>>> a.formal_charge = ('in', [-1, 1])
>>> print(a)
QueryAtom(C, N)[charge: one of -1, 1]
- property formal_valency¶
Constraint specifying the formal valency of the QueryAtom.
>>> a = QueryAtom(['C', 'N'])
>>> a.formal_valency = ('>', 3)
>>> print(a)
QueryAtom(C, N)[atom valency: greater than 3]
- property has_3d_coordinates¶
Constraint specifying that the atom has 3d coordinates.
>>> a = QueryAtom(['C', 'N'])
>>> a.has_3d_coordinates = True
>>> print(a)
QueryAtom(C, N)[atom must have 3D site]
- property index¶
Index of this atom in a substructure.
>>> atom = QueryAtom(['C', 'N'])
>>> print(atom.index)
None
>>> substructure = QuerySubstructure()
>>> _ = substructure.add_atom(atom)
>>> print(atom.index)
0
- property label_match¶
Constraint specifying that the atom label must match a regular expression.
>>> a = QueryAtom(['C'])
>>> a.label_match = '^C12$'
>>> print(a)
QueryAtom(C)[atom label must match regular expression with pattern: ^C12$]
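To see what an anchored pattern such as ^C12$ selects, the sketch below applies it to a few hypothetical atom labels with Python's re module. It assumes the library's regular expression dialect agrees with Python's for this simple pattern, which the documentation above does not state, and it does not call the CSD Python API.

```python
import re

pattern = "^C12$"
# Hypothetical atom labels chosen to show the effect of the ^ and $ anchors.
labels = ["C1", "C12", "C120", "AC12"]

matching = [label for label in labels if re.search(pattern, label)]
print(matching)  # ['C12']
```

Without the anchors, the pattern C12 would also accept C120 and AC12.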
- property nimplicit_hydrogens¶
Constraint specifying a count of implicit hydrogens.
>>> a = QueryAtom(['C', 'N'])
>>> a.nimplicit_hydrogens = 0
>>> print(a)
QueryAtom(C, N)[implicit hydrogen count: equal to 0]
- property num_bonds¶
Constraint specifying the number of bonds the QueryAtom may have.
>>> a = QueryAtom(['C', 'N'])
>>> a.num_bonds = ('<=', 3)
>>> print(a)
QueryAtom(C, N)[number of connected atoms: less than or equal to 3]
- property num_hydrogens¶
Constraint specifying the number of hydrogens the QueryAtom may have.
>>> a = QueryAtom(['C', 'N'])
>>> a.num_hydrogens = 1
>>> print(a)
QueryAtom(C, N)[hydrogen count, including deuterium: equal to 1]
- property smallest_ring¶
Constraint specifying the size of the smallest ring the QueryAtom forms part of.
>>> a = QueryAtom(['C', 'N'])
>>> a.smallest_ring = (5, 6)
>>> print(a)
QueryAtom(C, N)[atom smallest ring: in range 5 to 6]
- property unfused_unbridged_ring¶
Constraint specifying whether or not the QueryAtom is part of an unfused and unbridged ring.
>>> a = QueryAtom(['C', 'N'])
>>> a.unfused_unbridged_ring = True
>>> print(a)
QueryAtom(C, N)[atom unfused/unbridged ring: equal to 1]
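The QueryAtom examples above write constraint conditions in several forms: a bare value (equal to), an operator and value pair such as ('!=', 4) or ('<=', 3), a membership test such as ('in', [-1, 1]), and a pair of numbers such as (5, 6) for a range. The following sketch shows one way these forms could be interpreted. The function satisfies is hypothetical and is not part of the CSD Python API; only the operators that appear in the examples are handled, and treating the range as inclusive at both ends is an assumption.

```python
import operator

# Comparison operators that appear in the examples in this section.
COMPARISONS = {
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<=": operator.le,
}


def satisfies(condition, value):
    """Return True if value meets a condition written in one of the forms above."""
    if isinstance(condition, tuple) and len(condition) == 2:
        first, second = condition
        if first == "in":
            return value in second
        if first in COMPARISONS:
            return COMPARISONS[first](value, second)
        if isinstance(first, str):
            raise ValueError("unsupported operator in this sketch: %r" % first)
        # Two plain numbers are read as a range, assumed inclusive, e.g. (5, 6).
        return first <= value <= second
    # A bare value means "equal to".
    return value == condition


print(satisfies(("!=", 4), 3))        # True
print(satisfies(("in", [-1, 1]), 0))  # False
print(satisfies((5, 6), 6))           # True
print(satisfies(1, 1))                # True
```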
- class ccdc.search.QueryBond(bond_type=None, _substructure_bond=None)[source]¶
Bond used to define a substructure search.
A QueryBond can be used to represent a single bond type or a set of bond types. A QueryBond can also have additional constraints imposed on it, for example that it should be cyclic.
Let us create a QueryBond that will match any bond type.
>>> query_bond = QueryBond()
>>> print(query_bond)
QueryBond(Unknown, Single, Double, Triple, Quadruple, Aromatic, Delocalised, Pi)
To create a more specific QueryBond we need to specify some bond types.
>>> from ccdc.molecule import Bond
>>> single_bond = Bond.BondType('Single')
>>> double_bond = Bond.BondType('Double')
>>> query_bond = QueryBond(single_bond)
>>> print(query_bond)
QueryBond(Single)
>>> query_bond = QueryBond([single_bond, double_bond])
>>> print(query_bond)
QueryBond(Single, Double)
Finally, let us set a constraint for the bond to be cyclic.
>>> query_bond.cyclic = True
>>> print(query_bond)
QueryBond(Single, Double)[bond cyclicity: equal to 1]
>>> print(query_bond.cyclic)
BondCyclicityConstraint: 1
- property atoms¶
A list of the two QueryAtoms of the bond, if it is in a substructure, or None.
>>> s = QuerySubstructure()
>>> c = s.add_atom(QueryAtom('C'))
>>> n = s.add_atom(QueryAtom('N'))
>>> b = QueryBond(['Single', 'Double'])
>>> _ = s.add_bond(b, c, n)
>>> print(b)
QueryBond(Single, Double)
>>> print('%s, %s' % (b.atoms[0], b.atoms[1]))
QueryAtom(C), QueryAtom(N)
- property bond_length¶
Constraint specifying the length of the bond.
>>> b = QueryBond('Single')
>>> c1 = QueryAtom('C')
>>> c2 = QueryAtom('C')
>>> s = QuerySubstructure()
>>> _ = s.add_atom(c1)
>>> _ = s.add_atom(c2)
>>> _ = s.add_bond(b, c1, c2)
>>> b.bond_length = ('>', 1.6)
>>> print(b)
QueryBond(Single)[bond length: greater than 1.6]
- property bond_polymeric¶
Constraint specifying whether or not the QueryBond is polymeric.
>>> b = QueryBond('Single')
>>> b.bond_polymeric = True
>>> print(b)
QueryBond(Single)[bond polymeric: equal to 1]
- property bond_smallest_ring¶
Constraint specifying the smallest ring the bond should be a part of.
>>> b = QueryBond('Aromatic')
>>> b.bond_smallest_ring = 5
>>> print(b)
QueryBond(Aromatic)[bond smallest ring: equal to 5]
- property bond_unfused_unbridged_ring¶
Constraint specifying whether or not the QueryBond is part of an unfused and unbridged ring.
>>> b = QueryBond('Single')
>>> b.bond_unfused_unbridged_ring = True
>>> print(b)
QueryBond(Single)[bond unfused/unbridged ring: equal to 1]
- property cyclic¶
Constraint specifying whether or not the QueryBond is part of a cycle.
>>> b = QueryBond('Single')
>>> b.cyclic = True
>>> print(b)
QueryBond(Single)[bond cyclicity: equal to 1]
- property stereochemistry¶
Constraint specifying the stereochemistry around a double bond.
The return value will either be None or a tuple of 2 QueryAtoms and one of ‘cis’ or ‘trans’.
>>> s = SMARTSSubstructure(r"I/C=C\F")
>>> s.bonds[1].stereochemistry
(QueryAtom(I), QueryAtom(F), 'cis')
- class ccdc.search.QuerySubstructure(_substructure=None)[source]¶
Class to define and run substructure searches.
As an example let us set up a QuerySubstructure for a carbonyl (C=O).
>>> from ccdc.molecule import Bond
>>> double_bond = Bond.BondType('Double')
>>> substructure_query = QuerySubstructure()
>>> query_atom1 = substructure_query.add_atom('C')
>>> query_atom2 = substructure_query.add_atom('O')
>>> query_bond = substructure_query.add_bond(double_bond, query_atom1, query_atom2)
- add_atom(atom)[source]¶
Add an atom to the substructure.
- Parameters
atom – may be a QueryAtom separately constructed, an atom of a molecule, or an atomic symbol.
- Returns
>>> q = QuerySubstructure()
>>> a = q.add_atom(QueryAtom(['N', 'O']))
>>> print(a)
QueryAtom(N, O)
- add_bond(bond, atom1=None, atom2=None)[source]¶
Add a bond to the substructure.
- Parameters
bond – may be a QueryBond, a ccdc.molecule.Bond.BondType, a ccdc.molecule.Bond, a string or an int.
atom1 – QueryAtom or None for any atom
atom2 – QueryAtom or None for any atom
- Returns
- Raises
TypeError if an improper bond argument is supplied
>>> s = QuerySubstructure()
>>> c = s.add_atom(QueryAtom('C'))
>>> o1 = s.add_atom(QueryAtom('O'))
>>> o2 = s.add_atom(QueryAtom('O'))
>>> h = s.add_atom(QueryAtom('H'))
>>> _ = s.add_bond(QueryBond('Double'), c, o1)
>>> _ = s.add_bond(QueryBond('Single'), c, o2)
>>> _ = s.add_bond(QueryBond('Single'), o2, h)
- property atoms¶
The query atoms in the substructure.
>>> q = QuerySubstructure()
>>> _ = q.add_atom(QueryAtom('C'))
>>> _ = q.add_atom(QueryAtom(['O', 'N']))
>>> atoms = q.atoms
>>> print('%s, %s' % (atoms[0], atoms[1]))
QueryAtom(C), QueryAtom(N, O)
- property bonds¶
The bonds in the substructure.
>>> s = QuerySubstructure()
>>> b = s.add_bond('Single', QueryAtom('C'), QueryAtom('F'))
>>> bonds = s.bonds
>>> print(bonds[0])
QueryBond(Single)
- match_atom(atom, query_atom=None)[source]¶
Whether or not the given atom matches the query_atom in the given context.
- Parameters
atom – a ccdc.molecule.Atom instance.
query_atom – a ccdc.search.QueryAtom instance or None. If None, the first atom of the substructure will be used.
- Returns
bool
>>> s = QuerySubstructure()
>>> _ = s.add_bond('Single', QueryAtom('Cl'), QueryAtom('C'))
>>> mol = EntryReader('csd').molecule('AABHTZ')
>>> s.match_atom(mol.atom('Cl1'))
True
>>> s.match_atom(mol.atom('C1'))
False
>>> s.match_atom(mol.atom('C1'), s.atoms[1])
True
- match_molecule(molecule)[source]¶
Whether or not the query matches the specified molecule.
- Parameters
molecule – a ccdc.molecule.Molecule instance.
- Returns
bool
>>> s = QuerySubstructure()
>>> _ = s.add_bond('Double', QueryAtom('C'), QueryAtom('O'))
>>> mol = EntryReader('csd').molecule('AABHTZ')
>>> s.match_molecule(mol)
True
- nmatch_molecule(molecule)[source]¶
Returns number of query matches within the specified molecule.
- Parameters
molecule – a ccdc.molecule.Molecule instance.
- Returns
integer
>>> s = QuerySubstructure()
>>> _ = s.add_bond('Single', QueryAtom('Cl'), QueryAtom('C'))
>>> mol = EntryReader('csd').molecule('AABHTZ')
>>> s.nmatch_molecule(mol)
2
- class ccdc.search.SMARTSSubstructure(smarts)[source]¶
Make a substructure from a SMARTS string.
Let us create a ketone SMARTSSubstructure as an example.
>>> smarts_query = SMARTSSubstructure("[CD4][CD3](=[OD1])[CD4]")
>>> print(smarts_query.smarts)
[CD4][CD3](=[OD1])[CD4]
There is a minor extension to Daylight SMARTS to allow the representation of quadruple, delocalised and pi bonds, using the characters ‘_’, ‘"’ and ‘|’ respectively.
There is a second minor extension to allow easy access to the indices of the atoms.
>>> query = SMARTSSubstructure("[#6:0]([#7]-H)[#8:1][#6:2]")
>>> print(query.label_to_atom_index(0))
0
>>> print(query.label_to_atom_index(1))
3
- label_to_atom_index(label)[source]¶
Translate a SMARTS label into the appropriate substructure atom index
- property smarts¶
The SMARTS string.
- class ccdc.search.MoleculeSubstructure(mol, match_stereochemistry=False)[source]¶
Make a substructure query from an entire molecule.
Can be used to search for exact matches of a molecule when appropriate num_bonds or add_connected_element_count constraints are set on the QueryAtoms. Furthermore, if hydrogen atoms have been removed from the molecule used to initialise the MoleculeSubstructure, it can be used to find hits that match the heavy atoms as a substructure.
- Parameters
mol – ccdc.molecule.Molecule
match_stereochemistry – Should the substructure constrain target stereochemistry to match the input molecule’s stereochemistry?
- Raises
TypeError if the passed in molecule has multiple components since multi-component molecule substructure searches are not supported. The components should be added as separate substructures.
>>> mol = EntryReader('csd').molecule('AABHTZ')
>>> sub = MoleculeSubstructure(mol)
- class ccdc.search.ConnserSubstructure(file_name, _conn=None)[source]¶
Read a Conquest query language file.
- static from_string(text)[source]¶
Create a substructure from a textual representation of a Connser file.
- interaction_library_contact_atoms()[source]¶
Provide the list of indexes of atoms into the substructure (optionally) defined in the ConnSer query for generating the data in the CCDC interaction library.
The indexes refer to the list of substructure atoms of the associated substructure.
See ccdc.interactions for more information on the interaction library.
Search classes
- class ccdc.search.Search(settings=None)[source]¶
Common base class for searches
- class SearchHit(identifier, _database=None, _entry=None, _crystal=None, _molecule=None, _binary_database=None)[source]¶
Base class for search hits.
Provides access to molecules, crystals and entries.
- property crystal¶
The crystal corresponding to a search hit.
- property entry¶
The entry corresponding to a search hit.
- property identifier¶
The string identifier of the hit.
- property molecule¶
The molecule corresponding to a search hit.
- class Settings(_settings=None)[source]¶
Base class for search settings.
- property has_3d_coordinates¶
Constrain hits to have 3d coordinates.
- property max_hit_structures¶
The number of structures which may be returned from a search.
- property max_r_factor¶
Constrain the hits to have an R-factor less than this.
The R-factor will be expressed as a percentage.
- property must_have_elements¶
Elements which must be present in a hit.
The elements will be presented as a list of atomic symbols.
>>> settings = Search.Settings()
>>> settings.must_have_elements = ['C', 'N', 'O', 'S']
>>> print(settings.must_have_elements)
[C (6), N (7), O (8), S (16)]
- property must_not_have_elements¶
Elements which must not be present in a hit.
The elements will be presented as a list of symbols.
>>> settings = Search.Settings()
>>> settings.must_not_have_elements = ['S', 'P', 'K']
>>> print(settings.must_not_have_elements)
[P (15), S (16), K (19)]
- property no_disorder¶
Constrain hits to have no disorder.
The value will be False (no filtering), ‘Non-hydrogen’ (filter structures with heavy atom disorder) or ‘All’ (filter structures with any disordered atoms).
- property no_errors¶
Constrain the hits to have no suppressed errors.
- property no_ions¶
Constrain the hits not to have a residue with a formal charge. The hits may include zwitterions.
- property no_metals¶
Constrain the hits not to have a metal atom.
- property no_powder¶
Constrain hits not to be powder studies.
- property not_polymeric¶
Constrain the hits not to be polymeric structures.
- property only_organic¶
Constrain hits to be organic compounds.
- property only_organometallic¶
Constrain hits to be only organometallic compounds.
- test(argument)[source]¶
Test that the argument satisfies the requirements of the settings instance.
- Parameters
argument – a ccdc.entry.Entry, ccdc.crystal.Crystal or ccdc.molecule.Molecule instance.
- Returns
bool
>>> entry = EntryReader('csd').entry('AABHTZ')
>>> settings = Search.Settings()
>>> settings.test(entry)
True
>>> settings.only_organometallic = True
>>> settings.test(entry)
False
- class ccdc.search.SimilaritySearch(mol=None, threshold=0.7, coefficient='tanimoto', settings=None)[source]¶
Class to define and run similarity searches.
- class Settings(threshold=0.7, coefficient='tanimoto', _settings=None)[source]¶
- property coefficient¶
This should be either ‘dice’ or ‘tanimoto’, the default.
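For readers unfamiliar with the two coefficients, the sketch below computes the standard Tanimoto and Dice coefficients for two sets of fingerprint features. It is a generic illustration only: the integer feature sets are hypothetical, the function names are chosen for this sketch, and the fingerprints that SimilaritySearch uses internally are not described in this documentation.

```python
def tanimoto(a, b):
    """Tanimoto coefficient: shared features divided by features in either set."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def dice(a, b):
    """Dice coefficient: twice the shared features divided by the summed set sizes."""
    a, b = set(a), set(b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


# Hypothetical fingerprint features for a query and a target molecule.
query_bits = {1, 4, 7, 9}
target_bits = {1, 4, 8}

print(round(tanimoto(query_bits, target_bits), 3))  # 0.4
print(round(dice(query_bits, target_bits), 3))      # 0.571
```

For the same pair of sets the Dice value is never lower than the Tanimoto value, so a given threshold is stricter under 'tanimoto' than under 'dice'.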
- property sort_order¶
The order in which hits will be sorted.
This should be either ‘alphabetic’ or ‘value’, the default.
- property threshold¶
The similarity threshold to apply.
This is a value between 0.0 and 1.0.
- class SimilarityHit(similarity, identifier, _database=None, _entry=None, _crystal=None, _molecule=None, _binary_database=None)[source]¶
A search hit recording the similarity measure.
The SimilarityHit instance will give access to the identifier of the hit, the value of the similarity to the query molecule, the entry, crystal or molecule of the hit.
- property coefficient¶
Which coefficient to use when determining similarity.
- static from_xml(xml)[source]¶
Create a SimilaritySearch from an XML representation.
- Parameters
xml – XML string
- static from_xml_file(file_name)[source]¶
Create a SimilaritySearch from an XML file.
- Parameters
file_name – path to XML file
- Raises
IOError when the file does not exist
- property molecule¶
The query molecule.
- read_xml_file(file_name)[source]¶
Read an XML file into the similarity searcher.
- Parameters
file_name – path to XML file
- Raises
IOError if the file cannot be read
- search_molecule(mol)[source]¶
Search a molecule.
This can be used to determine a similarity coefficient against the given molecule.
- Parameters
mol – ccdc.molecule.Molecule
- Returns
>>> csd = EntryReader('csd')
>>> paracetamol = csd.molecule('HXACAN')
>>> searcher = SimilaritySearch(paracetamol)
>>> hit = searcher.search_molecule(csd.molecule('IBPRAC'))
>>> print(round(hit.similarity, 3))
0.161
- property threshold¶
The similarity threshold to use.
- class ccdc.search.TextNumericSearch(settings=None)[source]¶
Class to define and run text/numeric searches in a crystal structure database.
It is possible to add one or more criteria for the query to match.
>>> text_numeric_query = TextNumericSearch()
>>> text_numeric_query.add_compound_name('aspirin')
>>> text_numeric_query.add_citation(year=[2011, 2013])
>>> for hit in text_numeric_query.search(max_hit_structures=3):
...     print(hit.identifier)
...
ACSALA19
ACSALA20
ACSALA21
A human-readable representation of the queries may be obtained:
>>> print(', '.join(q for q in text_numeric_query.queries))
Compound name aspirin anywhere , Journal year in range 2011-2013
- class TextNumericSearchSettings(_settings=None)[source]¶
No settings are required apart from those provided by the base class.
- add_all_identifiers(refcode, mode='anywhere', ignore_non_alpha_num=False)[source]¶
Search for an identifier, including previous identifiers.
>>> from ccdc.search import TextNumericSearch
>>> query = TextNumericSearch()
>>> query.add_all_identifiers('DABHUJ')
>>> hits = query.search()
>>> print(hits[0].identifier)
ACPRET03
>>> print(hits[0].entry.previous_identifier)
DABHUJ
- add_all_text(txt, mode='anywhere', ignore_non_alpha_num=False)[source]¶
Search for text anywhere in the entry.
- add_analogue(analogue, mode='anywhere', ignore_non_alpha_num=False)[source]¶
Search for an analogue.
- add_bioactivity(activity, mode='anywhere', ignore_non_alpha_num=False)[source]¶
Search for a particular bio-activity.
- add_ccdc_number(value)[source]¶
Search for a particular or a range of CCDC deposition numbers.
>>> from ccdc.search import TextNumericSearch
>>> searcher = TextNumericSearch()
>>> searcher.add_ccdc_number(241370)
>>> hits = searcher.search()
>>> len(hits)
1
>>> entry = hits[0].entry
>>> print('%s %s' % (entry.identifier, entry.ccdc_number))
ABEBUF 241370
>>> searcher.clear()
>>> searcher.add_ccdc_number((241368, 241372))
>>> hits = searcher.search()
>>> print(len(hits))
3
>>> for hit in hits:
...     print('%s %s' % (hit.identifier, hit.entry.ccdc_number))
...
ABEBUF 241370
BIBZIW 241371
BIMGEK 241372
- add_citation(author='', journal='', volume=None, year=None, first_page=None, ignore_non_alpha_num=False, _coden=None)[source]¶
Search for a citation.
Note: the journal parameter requires the CSD to be present in order to translate the journal name to a coden identifier. If the CSD is not present, but an alternative database is, use the alternative database’s journals dict to look up a coden identifier and specify the _coden parameter in this function.
- add_color(color, mode='anywhere', ignore_non_alpha_num=False)[source]¶
Search for a particular colour.
- add_compound_name(compound_name, mode='anywhere', ignore_non_alpha_num=False)[source]¶
Search for a compound name.
The search checks the content both of ccdc.entry.Entry.chemical_name and ccdc.entry.Entry.synonyms.
To illustrate this let us have a look at the CSD entry ABABEM.
>>> from ccdc.io import EntryReader
>>> entry_reader = EntryReader('CSD')
>>> ababem = entry_reader.entry('ABABEM')
>>> print(ababem.chemical_name)
Tetrahydro[1,3,4]thiadiazolo[3,4-a]pyridazine-1,3-dione
>>> print(ababem.synonyms[0])
8-Thia-1,6-diazabicyclo[4.3.0]nonane-7,9-dione
The text azabicyclo[4.3.0]nonane is only found in the synonym. Let us search for it using a compound name search.
>>> from ccdc.search import TextNumericSearch
>>> query = TextNumericSearch()
>>> query.add_compound_name('azabicyclo[4.3.0]nonane')
>>> hits = query.search()
Finally let us assert that we have found ABABEM.
>>> assert(u'ABABEM' in [h.identifier for h in hits])
- add_disorder(disorder, mode='anywhere', ignore_non_alpha_num=False)[source]¶
Search for a disorder comment.
- add_habit(habit, mode='anywhere', ignore_non_alpha_num=False)[source]¶
Search for a particular habit.
- add_heat_capacity_notes(heat_capacity_notes, mode='anywhere', ignore_non_alpha_num=False)[source]¶
Search for heat capacity notes.
- add_heat_of_fusion_notes(heat_of_fusion_notes, mode='anywhere', ignore_non_alpha_num=False)[source]¶
Search for heat of fusion notes.
- add_peptide_sequence(peptide_sequence, mode='anywhere', ignore_non_alpha_num=False)[source]¶
Search for a peptide sequence.
- add_phase_transition(phase_transition, mode='anywhere', ignore_non_alpha_num=False)[source]¶
Search for a phase transition.
- add_polymorph(polymorph, mode='anywhere', ignore_non_alpha_num=False)[source]¶
Search for polymorph information.
- add_predicted_semiconductor_dynamic_disorder(value)[source]¶
Search for predicted semiconductor dynamic disorder.
See ccdc.entry.SemiconductorPredictedProperties.dynamic_disorder
- add_predicted_semiconductor_hole_reorganization_energy(value)[source]¶
Search for predicted semiconductor hole reorganization energy.
See ccdc.entry.SemiconductorPredictedProperties.hole_reorganization_energy
- add_predicted_semiconductor_homo_lumo_gap(value)[source]¶
Search for predicted semiconductor HOMO-LUMO gap.
See ccdc.entry.SemiconductorPredictedProperties.homo_lumo_gap
- add_predicted_semiconductor_singlet_state_1_energy(value)[source]¶
Search for predicted semiconductor singlet state 1 energy.
See ccdc.entry.SemiconductorPredictedProperties.singlet_state_1_energy
- add_predicted_semiconductor_singlet_state_1_oscillator_strength(value)[source]¶
Search for predicted semiconductor singlet state 1 oscillator strength.
See ccdc.entry.SemiconductorPredictedProperties.singlet_state_1_oscillator_strength
- add_predicted_semiconductor_singlet_state_2_energy(value)[source]¶
Search for predicted semiconductor singlet state 2 energy.
See ccdc.entry.SemiconductorPredictedProperties.singlet_state_2_energy
- add_predicted_semiconductor_singlet_state_2_oscillator_strength(value)[source]¶
Search for predicted semiconductor singlet state 2 oscillator strength.
See ccdc.entry.SemiconductorPredictedProperties.singlet_state_2_oscillator_strength
- add_predicted_semiconductor_transfer_integral(value)[source]¶
Search for predicted semiconductor transfer integral.
See ccdc.entry.SemiconductorPredictedProperties.transfer_integral
- add_predicted_semiconductor_triplet_state_1_energy(value)[source]¶
Search for predicted semiconductor triplet state 1 energy.
See ccdc.entry.SemiconductorPredictedProperties.triplet_state_1_energy
- add_predicted_semiconductor_triplet_state_2_energy(value)[source]¶
Search for predicted semiconductor triplet state 2 energy.
See ccdc.entry.SemiconductorPredictedProperties.triplet_state_2_energy
- add_solubility_notes(solubility_notes, mode='anywhere', ignore_non_alpha_num=False)[source]¶
Search for solubility notes.
- add_source(source, mode='anywhere', ignore_non_alpha_num=False)[source]¶
Search for a source.
>>> from ccdc.search import TextNumericSearch
>>> searcher = TextNumericSearch()
>>> searcher.add_source('toad')
>>> hits = searcher.search(max_hit_structures=5)
>>> for h in hits:
...     print('%-8s: %s' % (h.identifier, h.entry.source))
...
CUXYAV  : Ch'an Su (dried venom of Chinese toad)
EWAWUW  : isolated from the eggs of toad Bufo bufo gargarizans
EWAXAD  : isolated from the eggs of toad Bufo bufo gargarizans
FIFDUT  : dried venom of Chinese toad Ch'an Su
FIFFAB  : dried venom of Chinese toad Ch'an Su
- add_spacegroup_symbol(spacegroup_symbol, mode='anywhere', ignore_non_alpha_num=False)[source]¶
Search for a spacegroup symbol or any alias of that symbol.
- static from_xml_file(file_name)[source]¶
Create a TextNumericSearch from an XML file.
- Parameters
file_name – path to XML file
- Raises
IOError when the file does not exist
- is_journal_valid(journal)[source]¶
Check the validity of a specified journal name in the CSD.
This requires the CSD to be present.
- Parameters
journal – str, journal name
- property journals¶
A dictionary of journal name : ccdc code number for journals in the CSD.
This requires the CSD to be present.
- property queries¶
The current set of queries for this search.
>>> tns = TextNumericSearch()
>>> tns.add_all_text('ibuprofen')
>>> tns.add_author('Haisa')
>>> print('; '.join(str(q).strip() for q in tns.queries))
All text ibuprofen anywhere; Author Haisa anywhere
- class ccdc.search.SubstructureSearch(settings=None)[source]¶
Query crystal structures for interactions.
- class HitProcessor[source]¶
Override this class to provide your own add_hit() method.
This class allows a search to process hits as they are found by the search class, rather than waiting until all hits are found before allowing access to them, a procedure which may well run out of memory for very general searches.
- search(searcher, database=None)[source]¶
Searches the database with the substructure search.
- Parameters
searcher – a ccdc.search.SubstructureSearch instance.
database – a ccdc.io.EntryReader instance. If not specified the CSD will be searched.
For each hit found, ccdc.search.SubstructureSearch.HitProcessor.add_hit() will be called with a ccdc.search.SubstructureSearch.SubstructureHit instance.
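The idea of processing hits as they arrive, rather than collecting them all first, can be sketched without the CSD Python API. Everything in the block below is a hypothetical stand-in: CountingProcessor plays the role of a HitProcessor subclass that overrides add_hit(), FakeHit exposes only an identifier attribute, and feed() imitates a search reporting hits one at a time. The identifiers are placeholders.

```python
class CountingProcessor:
    """Hypothetical stand-in for a HitProcessor subclass; it only defines add_hit()."""

    def __init__(self):
        self.counts = {}

    def add_hit(self, hit):
        # Keep a running tally per identifier instead of storing the hit itself,
        # so memory use does not grow with the number of hits per structure.
        self.counts[hit.identifier] = self.counts.get(hit.identifier, 0) + 1


class FakeHit:
    """Minimal placeholder exposing only the identifier attribute."""

    def __init__(self, identifier):
        self.identifier = identifier


def feed(processor, identifiers):
    """Simulate a search that reports hits one at a time."""
    for identifier in identifiers:
        processor.add_hit(FakeHit(identifier))


processor = CountingProcessor()
feed(processor, ["REF001", "REF001", "REF002"])
print(processor.counts)  # {'REF001': 2, 'REF002': 1}
```

With the real classes, the processor would presumably derive from SubstructureSearch.HitProcessor and be driven through its search(searcher, database) method as documented above.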
- class Settings(max_hit_structures=None, max_hits_per_structure=None)[source]¶
Settings appropriate to a substructure search.
- property match_enantiomers¶
Enantiomer matching behavior
The value will be one of ‘NEVER’ meaning enantiomers are never checked, ‘SPACEGROUP_DEPENDENT’ meaning enantiomers are checked if the crystal’s spacegroup implies the presence of enantiomers, or ‘ALWAYS’ meaning enantiomers are always checked.
- property max_hits_per_structure¶
Maximum number of hits per structure.
- class SubstructureHit(identifier, match=None, search_structure=None, query=None, _database=None, _entry=None, _crystal=None, _molecule=None, _binary_database=None)[source]¶
A hit from a substructure search.
- centroid_objects(name)[source]¶
The geometric object names and atoms from which the centroid was defined.
- constraint_atoms(name)[source]¶
The atoms from which the constraint was defined.
- Parameters
name – the name of the constraint.
- Returns
a tuple of ccdc.molecule.Atom instances.
The atoms will be returned in an arbitrary order. All atoms involved in defining the constraint will be returned.
- constraint_objects(constraint)[source]¶
A tuple of object names and atoms from which the constraint was defined.
- dummy_point_objects(name)[source]¶
The geometric object names and atoms from which the dummy point was defined.
- measurement_atoms(name)[source]¶
The atoms involved in a measurement.
- Parameters
name – the name of the measurement.
- Returns
a tuple of ccdc.molecule.Atom instances.
The atoms will be returned in an arbitrary order. All atoms involved in the measurement will be present, so for example a centroid-centroid distance measurement will produce the atoms of both centroids.
- class SubstructureHitList(iterable=(), /)[source]¶
List of hits from a
ccdc.search.SubstructureSearch
- add_angle_constraint(name, *args)[source]¶
Add an angle constraint.
- Parameters
name – by which the constraint will be accessed.
*args – three points, each given either as a pair (substructure_index, atom_index) or as the name of a geometric object.
range – as for
ccdc.search.SubstructureSearch.add_distance_constraint()
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_centroid('CENT1', (0, 0), (0, 1), (0, 2))
>>> query.add_centroid('CENT2', (1, 0), (1, 1), (1, 2))
>>> query.add_angle_constraint('ANG1', (0, 0), (1, 1), (1, 0), ('>=', 120))
- add_angle_measurement(name, *args)[source]¶
Add an angle measurement.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_centroid('CENT1', (0, 0), (0, 1), (0, 2))
>>> query.add_centroid('CENT2', (1, 0), (1, 1), (1, 2))
>>> query.add_angle_measurement('ANG1', (0, 0), (1, 1), (1, 0))
- add_atom_property_constraint(name, *args, **kw)[source]¶
Add an atom property constraint.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('[*H1]'))
>>> query.add_atom_property_constraint('ATOM1', (0, 0), ('in', [7, 8]), which='AtomicNumber')
- add_atom_property_measurement(name, *args, **kw)[source]¶
Add an atom property measurement.
- Parameters
name – the name by which this measurement will be accessed.
*args – a pair, (substructure_index, atom_index) specifying the atom to measure.
which – one of TotalCoordinationNumber, AtomicNumber, VdwRadius, CovalentRadius
>>> query = SubstructureSearch()
>>> substructure = QuerySubstructure()
>>> _ = substructure.add_atom(['C', 'N'])
>>> _ = query.add_substructure(substructure)
>>> query.add_atom_property_measurement('ATOM1', (0, 0), which='AtomicNumber')
- add_binary_transform_constraint(name, which, *args)[source]¶
Add a binary arithmetical calculation constraint.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_vector('VEC1', (0, 1), (1, 2))
>>> query.add_vector('VEC2', (0, 2), (1, 1))
>>> query.add_vector_angle_measurement('ANG1', 'VEC1', 'VEC2')
>>> query.add_constant_value_measurement('D2R', 3.14159/180)
>>> query.add_binary_transform_constraint('IN_RADIANS', 'MUL', 'ANG1', 'D2R', (-1, 1))
- add_binary_transform_measurement(name, which, arg1, arg2)[source]¶
Add a binary mathematical operation.
- Parameters
name – the name by which this value will be accessed.
which – one of ‘MAX’, ‘MIN’, ‘ADD’, ‘SUBTRACT’, ‘MULTIPLY’, ‘DIVIDE’, ‘POW’, ‘RSIN’, ‘RCOS’.
arg1, arg2 – the names of the measurements to be used as arguments to the operator.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_vector('VEC1', (0, 1), (1, 2))
>>> query.add_vector('VEC2', (0, 2), (1, 1))
>>> query.add_vector_angle_measurement('ANG1', 'VEC1', 'VEC2')
>>> query.add_constant_value_measurement('D2R', 3.14159/180)
>>> query.add_binary_transform_measurement('IN_RADIANS', 'MUL', 'ANG1', 'D2R')
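As a rough illustration of what a binary transform computes, the sketch below maps the documented operation names onto ordinary Python callables and applies one to two already-measured values. The mapping is an assumption made for illustration, not the CSD implementation; 'RSIN' and 'RCOS' are left out because their definitions are not given here.

```python
import math
import operator

# Assumed mapping from the documented operation names to Python callables.
# 'RSIN' and 'RCOS' are omitted: their meaning is not defined in this section.
BINARY_OPERATIONS = {
    "MAX": max,
    "MIN": min,
    "ADD": operator.add,
    "SUBTRACT": operator.sub,
    "MULTIPLY": operator.mul,
    "DIVIDE": operator.truediv,
    "POW": math.pow,
}


def binary_transform(which, arg1, arg2):
    """Apply the named binary operation to two numeric values."""
    return BINARY_OPERATIONS[which](arg1, arg2)


# Converting an angle from degrees to radians means multiplying by pi / 180.
angle_in_degrees = 90.0
degrees_to_radians = math.pi / 180.0
angle_in_radians = binary_transform("MULTIPLY", angle_in_degrees, degrees_to_radians)
print(angle_in_radians)
```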
- add_centroid(name, *args)[source]¶
Adds a centroid to the substructure search.
- Parameters
name – the name by which the centroid will be accessed.
*args – the points or geometric objects from which to define the centroid.
Each arg may be either a pair (substructure_index, atom_index) or the name of a geometric object. There must be at least two such arguments.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_centroid('CENT1', (0, 0), (0, 1), (0, 2))
>>> query.add_centroid('CENT2', (1, 0), (1, 1), (1, 2))
>>> query.add_centroid('CENT3', 'CENT1', 'CENT2')
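Geometrically, a centroid is the mean position of its defining points. The sketch below is plain Python on made-up coordinates and does not call the CSD Python API. It assumes that a centroid passed as an argument counts as a single point, which this section does not state.

```python
def centroid(points):
    """Arithmetic mean of a sequence of (x, y, z) points."""
    if len(points) < 2:
        raise ValueError("a centroid needs at least two points")
    count = len(points)
    return tuple(sum(point[axis] for point in points) / count for axis in range(3))


# Made-up coordinates standing in for the matched atoms of two substructures.
cent1 = centroid([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 3.0, 0.0)])
cent2 = centroid([(0.0, 0.0, 4.0), (2.0, 0.0, 4.0), (1.0, 3.0, 4.0)])

# Assumption: a previously defined centroid is treated as one point.
cent3 = centroid([cent1, cent2])
print(cent3)
```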
- add_constant_value_measurement(name, value)[source]¶
Add a constant value.
- Parameters
name – the name by which this constant will be accessed.
value – a float.
>>> query = SubstructureSearch()
>>> substructure = QuerySubstructure()
>>> _ = substructure.add_atom(['C', 'N'])
>>> _ = query.add_substructure(substructure)
>>> query.add_constant_value_measurement('PI', 3.14159)
- add_distance_constraint(name, *args, **kw)[source]¶
Add a distance constraint.
- Parameters
name – the name of this constraint.
*args – specifications of points either as pairs (substructure_index, atom_index) or as names of geometric measurements.
range – a condition, either as a pair of floats or a pair (operator, value) where operator may be ‘==’, ‘>’, ‘<’, ‘>=’, ‘<=’, ‘!=’ or a pair (‘in’, list(values)).
intermolecular – whether or not the distance should be within a unit cell molecule or between a unit cell molecule and a packing shell molecule.
vdw_corrected – whether the distance range should be relative to the Van der Waals radii of the atoms involved.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_distance_constraint('DIST1', (0, 1), (1, 1), (-5, 0), vdw_corrected=True, type='any')
>>> query.add_distance_constraint('DIST2', (0, 2), (1, 2), ('<=', 3.0), vdw_corrected=True, type='any')
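The three forms a range condition can take (a pair of floats, an (operator, value) pair, or an ('in', values) pair) can be made concrete with a small evaluator. This is a minimal sketch in plain Python: `satisfies` is a hypothetical helper, and treating a pair of floats as inclusive bounds is an assumption rather than documented behaviour.

```python
import operator

_COMPARISONS = {
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "!=": operator.ne,
}


def satisfies(value, condition):
    """Return True if value meets a condition given in any of the three forms."""
    first, second = condition
    if first == "in":
        # ('in', list_of_values): value must be one of the listed values.
        return value in second
    if first in _COMPARISONS:
        # (operator, value): apply the named comparison.
        return _COMPARISONS[first](value, second)
    # (minimum, maximum): inclusive bounds are an assumption of this sketch.
    return first <= value <= second


print(satisfies(-1.2, (-5, 0)))
print(satisfies(2.8, ("<=", 3.0)))
print(satisfies(7, ("in", [7, 8])))
```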
- add_distance_measurement(name, *args)[source]¶
Add a distance measurement.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_centroid('CENT1', (0, 0), (0, 1), (0, 2))
>>> query.add_centroid('CENT2', (1, 0), (1, 1), (1, 2))
>>> query.add_distance_measurement('DIST1', (0, 0), 'CENT2')
- add_dummy_point(name, distance, *args)[source]¶
Creates a dummy point along a vector.
- Parameters
name – the name by which this point will be accessed.
distance – the distance along the vector subtended by the two points.
*args – two points specified as (substructure_index, atom_index) or the name of another geometric object.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_centroid('CENT1', (0, 0), (0, 1), (0, 2))
>>> query.add_dummy_point('DUM1', 2.0, 'CENT1', (1, 1))
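One plausible reading of a dummy point is a point placed at the given distance from the first point, in the direction of the second. The plain-Python sketch below implements that reading on made-up coordinates. The choice of origin and direction is an assumption, and `dummy_point` is a hypothetical helper, not a CSD Python API function.

```python
import math


def dummy_point(distance, start, end):
    """Point at `distance` from `start`, along the direction from `start` to `end`."""
    direction = tuple(e - s for s, e in zip(start, end))
    length = math.sqrt(sum(component * component for component in direction))
    if length == 0.0:
        raise ValueError("the two points coincide, so no direction is defined")
    return tuple(s + distance * component / length for s, component in zip(start, direction))


# Assumption: the distance is measured from the first point towards the second.
point = dummy_point(2.0, (0.0, 0.0, 0.0), (0.0, 0.0, 5.0))
print(point)
```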
- add_group(name, *args)[source]¶
Creates a group of matched atoms.
- Parameters
name – the name by which this group will be accessed.
*args – pairs, (substructure_index, atom_index) defining the atoms of the group.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_group('GP1', (0, 0), (0, 1), (0, 2))
- add_plane(name, *args)[source]¶
Add a plane.
- Parameters
name – the name by which the plane will be accessed.
*args – at least two point specifications in the form (substructure_index, atom_index) or the name of another geometric object.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_plane('PLANE1', (0, 0), (0, 1), (0, 2))
>>> query.add_plane('PLANE2', (1, 0), (1, 1), (1, 2))
- add_plane_angle_constraint(name, *args)[source]¶
Add a plane angle constraint.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_plane('PLANE1', (0, 0), (0, 1), (0, 2))
>>> query.add_plane('PLANE2', (1, 0), (1, 1), (1, 2))
>>> query.add_plane_angle_constraint('PA1', 'PLANE1', 'PLANE2', (-10, 10))
- add_plane_angle_measurement(name, *args)[source]¶
Add a plane angle measurement.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_plane('PLANE1', (0, 0), (0, 1), (0, 2))
>>> query.add_plane('PLANE2', (1, 0), (1, 1), (1, 2))
>>> query.add_plane_angle_measurement('PA1', 'PLANE1', 'PLANE2')
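For intuition, the angle between two planes can be taken as the angle between their normals. The sketch below handles only planes defined by exactly three points and uses made-up coordinates. The sign and range conventions of the CSD measurement are not given here, so the 0 to 180 degree result is an assumption of this sketch.

```python
import math


def _subtract(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def plane_normal(p0, p1, p2):
    """Normal of the plane through three non-collinear points."""
    return _cross(_subtract(p1, p0), _subtract(p2, p0))


def angle_between(u, v):
    """Angle in degrees (0 to 180) between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / norms))  # clamp against rounding error
    return math.degrees(math.acos(cosine))


xy_plane = plane_normal((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
xz_plane = plane_normal((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
plane_angle = angle_between(xy_plane, xz_plane)
print(plane_angle)
```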
- add_point_plane_distance_constraint(name, *args)[source]¶
Add a point plane distance constraint.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_centroid('CENT1', (0, 0), (0, 1), (0, 2))
>>> query.add_plane('PLANE2', (1, 0), (1, 1), (1, 2))
>>> query.add_point_plane_distance_constraint('PP1', 'CENT1', 'PLANE2', ('<', 5))
- add_point_plane_distance_measurement(name, *args)[source]¶
Add a point plane distance measurement.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_centroid('CENT1', (0, 0), (0, 1), (0, 2))
>>> query.add_plane('PLANE2', (1, 0), (1, 1), (1, 2))
>>> query.add_point_plane_distance_measurement('PP1', 'CENT1', 'PLANE2')
- add_substructure(substructure)[source]¶
Add a substructure.
Disconnected substructures may be accepted if the first substructure is contiguous at the start. Multiple substructures may be added as a result.
- Parameters
substructure – a
ccdc.search.QuerySubstructure
instance.
- Returns
the index of the first substructure added.
- add_torsion_angle_constraint(name, *args)[source]¶
Add a torsion angle constraint.
- Parameters
name – the name by which this constraint is accessed.
*args – as for
ccdc.search.SubstructureSearch.add_distance_constraint()
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_centroid('CENT1', (0, 0), (0, 1), (0, 2))
>>> query.add_centroid('CENT2', (1, 0), (1, 1), (1, 2))
>>> query.add_torsion_angle_constraint('ANG1', (0, 0), (0, 1), (1, 1), (1, 0), (120, 180))
- add_torsion_angle_measurement(name, *args)[source]¶
Add a torsion angle measurement.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_centroid('CENT1', (0, 0), (0, 1), (0, 2))
>>> query.add_centroid('CENT2', (1, 0), (1, 1), (1, 2))
>>> query.add_torsion_angle_measurement('ANG1', (0, 0), (0, 1), (1, 1), (1, 0))
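A torsion angle is defined by four points. The sketch below uses the standard atan2 formulation of the dihedral angle on made-up coordinates. It is a textbook calculation rather than the CSD implementation, and the sign convention used by the CSD may differ.

```python
import math


def _subtract(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def torsion_angle(p0, p1, p2, p3):
    """Dihedral angle in degrees, in the range -180 to 180, about the p1-p2 axis."""
    b1 = _subtract(p1, p0)
    b2 = _subtract(p2, p1)
    b3 = _subtract(p3, p2)
    n1 = _cross(b1, b2)
    n2 = _cross(b2, b3)
    b2_length = math.sqrt(_dot(b2, b2))
    y = b2_length * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Four made-up points: the outer points on the same side (cis), then opposite sides (trans).
cis = torsion_angle((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
trans = torsion_angle((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, -1.0, 0.0))
print(cis, trans)
```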
- add_unary_transform_constraint(name, *args)[source]¶
Add an arithmetical calculation constraint.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_vector('VEC1', (0, 1), (1, 2))
>>> query.add_vector('VEC2', (0, 2), (1, 1))
>>> query.add_vector_angle_measurement('ANG1', 'VEC1', 'VEC2')
>>> query.add_unary_transform_constraint('ABS_ANGLE', 'ABS', 'ANG1', (0, 10))
- add_unary_transform_measurement(name, which, arg)[source]¶
Add a mathematical operation.
- Parameters
name – name by which the result will be accessed.
which – one of ‘ABS’, ‘LOG’, ‘LOG10’, ‘EXP’, ‘COS’, ‘SIN’, ‘TAN’, ‘ACOS’, ‘ASIN’, ‘ATAN’, ‘FLOOR’, ‘ROUND’, ‘SQRT’, ‘NEG’.
arg – the name of the measurement or constraint to which to apply the function.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_vector('VEC1', (0, 1), (1, 2))
>>> query.add_vector('VEC2', (0, 2), (1, 1))
>>> query.add_vector_angle_measurement('ANG1', 'VEC1', 'VEC2')
>>> query.add_unary_transform_measurement('ABS_ANGLE', 'ABS', 'ANG1')
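The unary operation names listed above correspond closely to functions in Python's standard library. The mapping below is an assumed illustration, not the CSD implementation. In particular, this section does not say whether the trigonometric operations work in degrees or radians, so the sketch simply follows Python's `math` module, which uses radians.

```python
import math
import operator

# Assumed mapping from the documented operation names to Python callables.
UNARY_OPERATIONS = {
    "ABS": abs,
    "LOG": math.log,
    "LOG10": math.log10,
    "EXP": math.exp,
    "COS": math.cos,
    "SIN": math.sin,
    "TAN": math.tan,
    "ACOS": math.acos,
    "ASIN": math.asin,
    "ATAN": math.atan,
    "FLOOR": math.floor,
    "ROUND": round,
    "SQRT": math.sqrt,
    "NEG": operator.neg,
}


def unary_transform(which, value):
    """Apply the named unary operation to a numeric value."""
    return UNARY_OPERATIONS[which](value)


# A signed angle of -12.5 becomes 12.5 once 'ABS' is applied.
print(unary_transform("ABS", -12.5))
```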
- add_vector(name, *args)[source]¶
Add a vector.
- Parameters
name – the name by which the vector will be accessed.
*args – two point specifications as (substructure_index, atom_index) or the name of another geometric object.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_centroid('CENT1', (0, 0), (0, 1), (0, 2))
>>> query.add_vector('VEC1', 'CENT1', (1, 2))
- add_vector_angle_constraint(name, *args)[source]¶
Add a vector angle constraint.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_vector('VEC1', (0, 1), (1, 2))
>>> query.add_vector('VEC2', (0, 2), (1, 1))
>>> query.add_vector_angle_constraint('ANG1', 'VEC1', 'VEC2', (0, 60))
- add_vector_angle_measurement(name, *args)[source]¶
Add a vector angle measurement.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_vector('VEC1', (0, 1), (1, 2))
>>> query.add_vector('VEC2', (0, 2), (1, 1))
>>> query.add_vector_angle_measurement('ANG1', 'VEC1', 'VEC2')
- add_vector_plane_angle_constraint(name, *args)[source]¶
Add a vector plane angle constraint.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_vector('VEC1', (0, 1), (1, 2))
>>> query.add_plane('PLANE2', (1, 0), (1, 1), (1, 2))
>>> query.add_vector_plane_angle_constraint('ANG1', 'VEC1', 'PLANE2', ('>', 90))
- add_vector_plane_angle_measurement(name, *args)[source]¶
Add a vector plane angle measurement.
>>> query = SubstructureSearch()
>>> _ = query.add_substructure(SMARTSSubstructure('C(=O)O'))
>>> _ = query.add_substructure(SMARTSSubstructure('N(-H)H'))
>>> query.add_vector('VEC1', (0, 1), (1, 2))
>>> query.add_plane('PLANE2', (1, 0), (1, 1), (1, 2))
>>> query.add_vector_plane_angle_measurement('ANG1', 'VEC1', 'PLANE2')
- static from_xml(xml)[source]¶
Create a substructure search from XML. Deprecated.
- Parameters
xml – XML string
- class ccdc.search.ReducedCellSearch(query=None, settings=None)[source]¶
Provide reduced cell searches.
- class Settings(_settings=None)[source]¶
Settings appropriate to a reduced cell search.
- property absolute_angle_tolerance¶
The absolute angle tolerance.
- property is_normalised¶
Whether the input cell is normalised.
- property percent_length_tolerance¶
The cell length tolerance as a percentage of the longest cell dimension.
- compare_cells(r0, r1)[source]¶
Compare two reduced cells.
- Parameters
r0 – the first reduced cell, an instance of
ccdc.crystal.Crystal.ReducedCell
r1 – the second reduced cell, also an instance of
ccdc.crystal.Crystal.ReducedCell
- Returns
boolean
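To show how the two tolerance settings might interact, the sketch below compares two cells given as plain (a, b, c, alpha, beta, gamma) tuples. It is an illustration under stated assumptions: `cells_match` is a hypothetical helper, the default tolerance values are invented, and the real comparison works on ReducedCell objects with criteria that are not spelled out in this section.

```python
def cells_match(cell0, cell1, absolute_angle_tolerance=1.0, percent_length_tolerance=1.0):
    """Compare two cells given as (a, b, c, alpha, beta, gamma) tuples.

    The default tolerances are hypothetical values chosen for this sketch.
    """
    lengths0, angles0 = cell0[:3], cell0[3:]
    lengths1, angles1 = cell1[:3], cell1[3:]
    # Length tolerance as a percentage of the longest dimension of the first cell.
    length_tolerance = max(lengths0) * percent_length_tolerance / 100.0
    lengths_ok = all(abs(x - y) <= length_tolerance for x, y in zip(lengths0, lengths1))
    angles_ok = all(abs(x - y) <= absolute_angle_tolerance for x, y in zip(angles0, angles1))
    return lengths_ok and angles_ok


query_cell = (10.0, 12.0, 15.0, 90.0, 100.0, 90.0)
close_cell = (10.1, 12.0, 15.05, 90.0, 100.5, 90.0)
distant_cell = (10.5, 12.0, 15.0, 90.0, 100.0, 90.0)
print(cells_match(query_cell, close_cell))
print(cells_match(query_cell, distant_cell))
```

With a longest dimension of 15.0 and a 1% tolerance, lengths may differ by up to 0.15, so the first comparison passes and the second fails.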
- static from_xml(xml)[source]¶
Construct a reduced cell search from an XML representation.
- Parameters
xml – XML string
- static from_xml_file(file_name)[source]¶
Construct a reduced cell search from an XML file.
- Parameters
file_name – path to XML file
- Raises
IOError when the file does not exist
- class ccdc.search.CombinedSearch(expression, settings=None)[source]¶
Boolean combinations of other searches.
TextNumericSearch, SubstructureSearch, SimilaritySearch and ReducedCellSearch can be combined using the operators & (and), | (or) and - (not) to provide a combined search.
>>> from ccdc import io
>>> from ccdc.search import TextNumericSearch, SubstructureSearch, SMARTSSubstructure, ReducedCellSearch, CombinedSearch
>>> csd = io.EntryReader('csd')
>>> tns = TextNumericSearch()
>>> tns.add_compound_name('Aspirin')
>>> sub_search = SubstructureSearch()
>>> _ = sub_search.add_substructure(SMARTSSubstructure('C(=O)OH'))
>>> rcs = ReducedCellSearch(ReducedCellSearch.CrystalQuery(csd.crystal('ACSALA')))
>>> combi_search = CombinedSearch(tns & (-rcs | -sub_search))
>>> hits = combi_search.search()
>>> print(len(hits))
89
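The boolean expression in the example relies on Python operator overloading. The following self-contained sketch shows how `&`, `|` and unary `-` can build a composite query that is evaluated on sets of identifiers. `Query` and `matching` are hypothetical names for illustration only and say nothing about how CombinedSearch is implemented internally.

```python
class Query:
    """Hypothetical search node that evaluates to a set of identifiers."""

    def __init__(self, evaluate):
        self._evaluate = evaluate

    def search(self, universe):
        return self._evaluate(universe)

    def __and__(self, other):
        # Logical and: identifiers found by both queries.
        return Query(lambda universe: self.search(universe) & other.search(universe))

    def __or__(self, other):
        # Logical or: identifiers found by either query.
        return Query(lambda universe: self.search(universe) | other.search(universe))

    def __neg__(self):
        # Logical not: identifiers in the universe that this query does not find.
        return Query(lambda universe: universe - self.search(universe))


def matching(identifiers):
    """Leaf query that matches a fixed set of identifiers."""
    wanted = set(identifiers)
    return Query(lambda universe: universe & wanted)


universe = {"A", "B", "C", "D"}
text = matching({"A", "B", "C"})
cell = matching({"A"})
substructure = matching({"A", "B"})

combined = text & (-cell | -substructure)
result = sorted(combined.search(universe))
print(result)
```

Here the text query matches A, B and C, and the negated branches exclude A (which matches both the cell and the substructure query), leaving B and C.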