# Hydrogen Bond Statistics¶

This module enables Hydrogen Bond geometries assessment based on CSD statistics. The model encapsulates information regarding the number of hydrogen bond interactions present in the crystal structures and based on statistical information from CSD. It provides information about hydrogen bond geometries, including assessment of whether the bond distance and angles are usual or not unusual. To run it we will import the appropriate module:

```
>>> from ccdc import io
>>> from ccdc.interaction import HydrogenBondStatisticsSearch
```

## Basic operation¶

First, let us create an `ccdc.interaction.HydrogenBondStatisticsSearch`

instance:

```
>>> hbs = HydrogenBondStatisticsSearch()
```

The first thing to do is to set a target crystal; here we have chosen ADRENL which is the refcode of the skeleton molecule in Adrenaline, a hormone and medication which is involved in regulating visceral functions.

```
>>> csd = io.EntryReader('CSD')
>>> myentry = csd.entry('ADRENL')
```

## Hydrogen Bond interactions search¶

Having set a target, the `ccdc.interaction.HydrogenBondStatisticsSearch`

now has a search attribute
`ccdc.interaction.HydrogenBondStatisticsSearch.search`

from which the number of hydrogen bond interactions can be called.
Let’s now print the amount of hydrogen bond interactions found during the HBS search.

```
>>> stats = hbs.search(myentry)
>>> print('Number of Hydrogen bonds identified:', len(stats))
Number of Hydrogen bonds identified: 4
```

The statistical information for hydrogen interactions can be inspected using the various attributes under search function. For example we can retrieve the list of hydrogen bonds and the functional groups used in the search.

```
>>> for hb in stats:
... print(f'Donor: {hb.donor_atom} Acceptor: {hb.acceptor_atom} Acceptor functional group: {hb.acceptor_functional_group_name} Donor functional group: {hb.donor_functional_group_name}')
Donor: Atom(N1) Acceptor: Atom(O1) Acceptor functional group: ar_CsingleO Donor functional group: al_sec_ammonium
Donor: Atom(N1) Acceptor: Atom(O1) Acceptor functional group: ar_CsingleO Donor functional group: al_sec_ammonium
Donor: Atom(O3) Acceptor: Atom(O1) Acceptor functional group: ar_CsingleO Donor functional group: al_hydroxy_2
Donor: Atom(O2) Acceptor: Atom(O3) Acceptor functional group: al_hydroxy_2 Donor functional group: ar_oh
```

Other attributes that can be called are the distance between the donor and acceptor groups, the angle between D-H…A as well as an assessment for both distance and angle observations using distance_unusual and angle_unusual.

```
>>> for hb in stats:
... print(f'Donor: {hb.donor_atom} Acceptor: {hb.acceptor_atom} Distance D-A: {round(hb.distance, 2)} Angle D-H…A: {round(hb.angle, 2)} Distance unusual: {hb.distance_unusual} Angle unusual: {hb.angle_unusual}')
Donor: Atom(N1) Acceptor: Atom(O1) Distance D-A: 2.73 Angle D-H…A: 164.95 Distance unusual: False Angle unusual: False
Donor: Atom(N1) Acceptor: Atom(O1) Distance D-A: 2.7 Angle D-H…A: 162.55 Distance unusual: False Angle unusual: False
Donor: Atom(O3) Acceptor: Atom(O1) Distance D-A: 2.66 Angle D-H…A: 159.22 Distance unusual: False Angle unusual: False
Donor: Atom(O2) Acceptor: Atom(O3) Distance D-A: 2.81 Angle D-H…A: 157.63 Distance unusual: False Angle unusual: False
```

Statistical information can also be called for a specific interaction in the list. Here is an example of how information for the third hydrogen bond interaction for ADRENL can be called.

```
>>> hb = stats[3]
>>> print('Acceptor:', hb.acceptor_atom, 'Donor:', hb.donor_atom)
Acceptor: Atom(O3) Donor: Atom(O2)
>>> print('Acceptor functional group:', hb.acceptor_functional_group_name, 'Donor functional group:', hb.donor_functional_group_name)
Acceptor functional group: al_hydroxy_2 Donor functional group: ar_oh
>>> print('Distance D-A:', round(hb.distance, 2), 'Angle D-H…A:', round(hb.angle, 2))
Distance D-A: 2.81 Angle D-H…A: 157.63
>>> print('Distance unusual:', hb.distance_unusual, 'Angle unusual:', hb.distance_unusual)
Distance unusual: False Angle unusual: False
>>> print('Number of observed hits:', len(hb.observations))
Number of observed hits: 132
```

The entries used for each interaction can be called using the observations attribute, here we call the first five observed hits for the third interaction.

```
>>> for ob in hb.observations[:5]:
... print(ob.identifier)
ADRENL
ADRTAR
ADRTAR
ANUVEM
ASOTIM
```

## Plotting HBS distance and angle data as Histograms¶

To plot histograms, you first need to import the third party libraries matplotlib and numpy

```
>>> import matplotlib.pyplot as plt
>>> import numpy as np
```

The data used to plot the histogram data is already binned for users. The axis properties for the histograms is also precalculated in order to match as closely as possible to what is plotted in Mercury. For example, the x-axis properties for the distance histogram of the fourth bond in ADRENL can be obtained by running:

```
>>> hb = stats[3]
>>> hb.distance_histogram_x_axis_properties
(2.2999999999999994, 3.3000000000000003, 44.0)
```

The first two values of the tuple returned by this function `ccdc.interaction.HydrogenBondStatisticsSearch.HydrogenBondStatistics.distance_histogram_x_axis_properties()`

are the minimum and maximum values of the x-axis, and the third number is the number of bins the data will be binned into.

The next step is to generate the various binedges for matplotlib using numpy. We do this by running this command:

```
>>> distance_axis_properties = hb.distance_histogram_x_axis_properties
>>> step = (distance_axis_properties[1] - distance_axis_properties[0])/distance_axis_properties[2]
>>> binedges = np.arange(distance_axis_properties[0], distance_axis_properties[1], step)
```

Sometimes the array of values generated by numpy for the binedges fall slightly short because the maximum value is not added. When such a case happens, we have to manually append the maximum value to the binedges array using:

```
>>> binedges = np.append(binedges, distance_axis_properties[1])
```

We can then plot the distance histogram for the fourth bond by running:

```
>>> plt.hist(binedges[:-1], binedges, weights=hb.distance_histogram_frequencies)
```

This gives us the raw distance histogram without the radial and cone correction applied. Radial and Cone corrections are applied as weights to the frequency counts of the histograms. To apply these corrections with the weights we first have to multiply the weights to the frequency counts elementwise to get the effective densities:

```
>>> distance_frequencies_radial_corrected = np.multiply(hb.distance_histogram_weights, hb.distance_histogram_frequencies)
```

We can then plot the corrected distance histogram for the fourth bond by running:

```
>>> plt.hist(binedges[:-1], binedges, weights=distance_frequencies_radial_corrected)
```

Alternatively we can use the precomputed effective densities to plot the radially corrected distances by running:

```
>>> plt.hist(binedges[:-1], binedges, weights=hb.distance_histogram_effective_densities)
```