File format specifications

FDAT format specification

The FDAT file is the numeric data file used by the program RPluto.

The file consists of formatted 80-byte records containing the information fields of the DATA category of CSD.

Overall Entry Structure in FDAT

Each entry is composed of up to 7 possible record types.

Record type 1 is mandatory but all other record types are optional.

This is because the CSD files contain a number of entries for which some or all of the numerical data was not reported in the relevant publication.

Record types are:

  1. Directory information
  2. Unit cell parameters
  3. Textual information
  4. Symmetry positions
  5. Radius values
  6. Atomic coordinates
  7. Crystallographic connection table

Record types 1, 2 and 5 are single-line records. Other types may need more than one line to encompass the information.

Record type 4 is normally present only if the space group and record type 6 are both present.

Record type 5 is normally present only when type 6 is present.

Record type 7 is present only when record type 6 is present and the entry is error-free.

Record Type 1: Directory Information

The format of the record is: (1H#,A8,2I1,I6,6X,11I3,22I1,I2)

The contents of the record are:

Cols.	1-9	#REFCOD	Start-of-entry marker # and CSD reference code.
	10	SYS	Crystal system:	

				0	=	unknown
				1	=	anorthic
				2	=	monoclinic
				3	=	orthorhombic
				4	=	tetragonal
				5	=	hexagonal or trigonal with P-lattice type
				6	=	cubic
				7	=	trigonal with R-lattice symbol

	11	CAT	Structural category, always 3 at present.
	12-17	ADAT	Accession date, e.g. 890401 = 1st April 1989;
			all entries prior to 31 Dec 1971 have accession date 711231.
	18-23	IW	Not used.
	24-26	NCARDS 	No. of records in entry, including directory.
	27-29	NRFAC	No. of characters in R-factor field (record 3)
	30-32	NREM	No. of characters in text REMARK field (record 3)
	33-35	NDIS	No. of characters in text DISORDER description (record 3)
	36-38	NERR	No. of characters in text ERROR description (record 3)
	39-41	NOPR	No. of symmetry positions (record 4)
	42-44	NRAD	No. of element types in RADIUS record (record 5)
	45-47	NAT	No. of ATOMs (record 6)
	48-50	NSAT	No. of SATOMs (symmetry-generated atoms) (record 6)
	51-53	NBND	Not used.
	54-56	NCON	No. of crystallographic connectivity integers (record 7)
	57	CELL	Absence (0) or presence (1) of unit cell record (record 2)
	58	INTF	Intensity data measurement flag
							
				0	=	unknown
				1	=	visual
				2	=	densitometer
				3	=	diffractometer

	59	ATFOR	Atom coordinate format type

				0	=	no atoms present
				2	=	atoms present, with coordinates multiplied by 10^5

	60	CENT	Centre of symmetry at origin	
						
				0	=	unknown (record 4 absent)
				1	=	centre at origin
				2	=	no centre at origin

	61	ERR	Error status flag

				0	=	no errors
				1	=	error(s) in entry still unresolved (about 1.8% of CSD)

	62	RPA	Refer problem to author

				0	=	no referral
				1	=	problem referred back to author

	63	TD	Set to 1 if entry is disordered to extent where all coordinates 
			have been removed.
	64	PD	Set to 1 if entry is disordered, and a few coordinates may have
			been removed or suppressed.
	65	NU	Not used.
	66	CBL	Set to 1 for presence in paper of bond lengths corrected for 
			thermal motion.
	67	AS	Average esd. of C-C bonds,	

				0	=	unknown
				1	=	0.001-0.005Å
				2	=	0.006-0.010Å
				3	=	0.011-0.030Å
				4	=	0.031Å+

	68	POL	Set to 1 for polymeric structure.
	69-78		The remaining 10(I1) fields are either unused (=0), or relate to CSD 
			processing. They are of little interest to users.
	79-80	YEAR	Year of publication (final two digits of 19[nn]).
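
The column layout above can be read with plain string slicing. The following Python sketch is illustrative only: the function name parse_directory and the dictionary it returns are assumptions made for this example, not part of RPluto or the CSD software. The sample line is assembled field by field from the AABHTZ values shown in Example 1.

```python
def _int(field):
    """Convert a fixed-width integer field, treating blanks as 0."""
    return int(field) if field.strip() else 0


def parse_directory(line):
    """Slice an 80-column FDAT record type 1 into a dict (hypothetical helper)."""
    line = line.rstrip("\n").ljust(80)
    if line[0] != "#":
        raise ValueError("record type 1 must start with '#'")
    rec = {
        "REFCOD": line[1:9].strip(),  # cols 2-9
        "SYS": _int(line[9]),         # col 10
        "CAT": _int(line[10]),        # col 11
        "ADAT": line[11:17],          # cols 12-17, kept as text (yymmdd)
    }
    # cols 24-56: eleven I3 counters
    counters = ["NCARDS", "NRFAC", "NREM", "NDIS", "NERR", "NOPR",
                "NRAD", "NAT", "NSAT", "NBND", "NCON"]
    for i, name in enumerate(counters):
        rec[name] = _int(line[23 + 3 * i: 26 + 3 * i])
    # cols 57-68: single-digit flags
    flags = ["CELL", "INTF", "ATFOR", "CENT", "ERR", "RPA",
             "TD", "PD", "NU", "CBL", "AS", "POL"]
    for i, name in enumerate(flags):
        rec[name] = _int(line[56 + i])
    rec["YEAR"] = _int(line[78:80])   # cols 79-80
    return rec


# Directory record of AABHTZ, built field by field so the columns are explicit.
sample = ("#AABHTZ  " + "1" + "3" + "770506" + " " * 6
          + "".join(f"{n:3d}" for n in (18, 9, 0, 0, 0, 1, 5, 35, 0, 0, 37))
          + "1321" + "0" * 18 + "76")
rec = parse_directory(sample)
print(rec["REFCOD"], rec["NAT"], rec["NCON"], rec["YEAR"])
```

The processing flags in columns 69-78 are ignored here, since the specification describes them as being of little interest to users.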

Record Type 2: Unit Cell Parameters

The format of the record is: (6I6,6I1,6I2,2I3,I3,A8,I3,I2,4X)

The following items are recorded:

Cols.	1-36	CELLD	(6I6)	Cell parameters a, b, c, alpha, beta, gamma, each multiplied
				by 10^Pn, where the Pn are the precision digits for
				a ... gamma, n = 1 ... 6.

	37-42	PREC	(6I1)	Precision digits P1 (for a) .... P6 (for gamma).

	43-54	CESD	(6I2)	Cell parameter esd's multiplied by 10^Pn, as above.

	55-60	DENS	(2I3)	100 x Dm, 100 x Dx.

	61-63	NSPG	(I3)	Space group number or aspect number.

	64-71	SPG	(A8)	Space group symbol or aspect symbol.

	72-74	Z	(I3)	Z-value, i.e. no. of formula units per cell.

	75-76	ITOL	(I2)	100 x tolerance value (usually 0.4).
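
The scaling by the precision digits can be undone as in the short Python sketch below. The function name decode_cell is an assumption made for this illustration; the input values are those of the AABHTZ entry in Example 1.

```python
def decode_cell(celld, prec, cesd):
    """Return (value, esd) pairs from the scaled integers of record type 2."""
    decoded = []
    for raw, p, raw_esd in zip(celld, prec, cesd):
        scale = 10 ** p            # each parameter has its own precision digit
        decoded.append((raw / scale, raw_esd / scale))
    return decoded


cell = decode_cell([11372, 10272, 7359, 10875, 7107, 9616],  # CELLD
                   [3, 3, 3, 2, 2, 2],                       # PREC
                   [9, 5, 9, 6, 4, 8])                       # CESD
for value, esd in cell:
    print(value, esd)
```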

Record Type 3: Text Information

The textual information is divided into 4 fields. The continuous text is split into these fields using the directory elements NRFAC, NREM, NDIS and NERR in record 1.
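
A minimal Python sketch of this division follows. It assumes that the four fields appear in the same order as their counters in record 1 (R-factor, remark, disorder, error) and that the caller has already joined the text lines into one string; neither point is stated explicitly in this specification. The name split_text is hypothetical.

```python
def split_text(text, nrfac, nrem, ndis, nerr):
    """Cut the continuous text of record type 3 into its four fields."""
    fields = {}
    pos = 0
    # Assumed order: same as the counters NRFAC, NREM, NDIS, NERR in record 1.
    for name, length in (("RFAC", nrfac), ("REM", nrem),
                         ("DIS", ndis), ("ERR", nerr)):
        fields[name] = text[pos:pos + length].rstrip()
        pos += length
    return fields


# AABHTZ (Example 1): NRFAC = 9, all other counters are 0.
fields = split_text("R=0.0410 ", 9, 0, 0, 0)
print(fields)
```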

Record Type 4: Symmetry Positions

The format of the record is: (5(3I1,I2,3I1,I2,3I1,I2),5X)

This record carries the general equivalent positions of the space group, except those related by a centre of symmetry. Lattice centring operations are included.

The record is present only if atomic coordinates and space group are recorded.

The position x(p), y(p), z(p) is generated from x, y, z by the matrix transformation:

[X(p)] = [Rij] . [X] + [T]

Here Rij is a 3x3 rotation matrix, and T is a 3x1 translation vector. For each position the order of information (repeated for subsequent positions) is:

R11 R12 R13 T1 R21 R22 R23 T2 R31 R32 R33 T3
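
The Python sketch below decodes one 15-character position. The encoding it uses is inferred from the worked examples in this document rather than stated in the format description: each matrix element appears to be stored as its value plus 1 (0 = -1, 1 = 0, 2 = +1), and each translation as a number of twelfths of a cell edge (6 = 1/2). Both conventions, and the name decode_position, should be treated as assumptions.

```python
def decode_position(code):
    """Decode one symmetry position (3I1,I2 repeated three times) into (R, T)."""
    if len(code) != 15:
        raise ValueError("a symmetry position occupies exactly 15 columns")
    rotation, translation = [], []
    for row in range(3):
        chunk = code[5 * row: 5 * row + 5]
        # Assumed convention: stored digit = matrix element + 1.
        rotation.append([int(ch) - 1 for ch in chunk[:3]])
        # Assumed convention: translation stored in twelfths.
        translation.append(int(chunk[3:5]) / 12)
    return rotation, translation


# Second position of AACFAZ10 (Example 2), listed there as 1/2-x, 1/2+y, z.
R, T = decode_position("011 6121 6112 0")
print(R, T)
```

Up to five such positions are packed into each 80-column line, so a reader would step through the line in 15-column slices until NOPR positions have been decoded.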

Record Type 5: Radius Values

The format of the record is: (16(A2,I3))

The element symbol (A2) and 100 x R(el) are recorded.

These are the radii actually used to prepare the CSD entry. In general these will correspond to standard values, unless bonding situations have been encountered which require a change in the standard value.
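
As an illustration, the radius record can be unpacked in 5-column slices. The helper name parse_radii is an assumption for this sketch; the sample line is the radius record of Example 1.

```python
def parse_radii(line, nrad):
    """Read NRAD (A2,I3) pairs and return {element symbol: radius in Angstrom}."""
    radii = {}
    for i in range(nrad):
        chunk = line[5 * i: 5 * i + 5]
        symbol = chunk[:2].strip()          # A2, left-adjusted element symbol
        radii[symbol] = int(chunk[2:5]) / 100   # I3 holds 100 x radius
    return radii


radii = parse_radii("C  68H  23O  68N  68CL 99", 5)
print(radii)
```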

Record Type 6: Atomic Coordinates

The format of the records is: (A5,3I7,1X,A5,3I7,1X,A5,3I7)

Each atom has a left-adjusted atom label [A5]. This always begins with a valid element symbol, which is then followed by at least one digit. Succeeding characters may be numbers and/or prime characters (').

For symmetry-related atoms, a letter is added to the symbol for the basic atom [C11 would generate C11A etc.]. The symbol A refers to the second symmetry position, B the third, etc.

The coordinates x, y, z then follow. They are multiplied by 10^5.

Any symmetry-related atoms (SATOM) are encoded in records of type 6, and follow on continuously from the asymmetric unit atom coordinates (ATOM). The number of lines of atomic coordinates to be expected can be deduced from NAT + NSAT in the directory record (record 1).
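
The following Python sketch reads one coordinate line and works out how many lines to expect. The function names are assumptions made for this example. Fixed-column slicing is used rather than whitespace splitting, because a 5-character label can be followed directly by a 7-digit coordinate field. The sample line reproduces the first coordinate line of Example 1.

```python
import math


def parse_atom_line(line):
    """Return up to three (label, (x, y, z)) tuples from one record type 6 line."""
    atoms = []
    for start in (0, 27, 54):               # three (A5,3I7) groups separated by 1X
        chunk = line[start:start + 26]
        label = chunk[:5].strip()
        if not label:                       # last line of an entry may be short
            break
        xyz = tuple(int(chunk[5 + 7 * k: 12 + 7 * k]) / 1e5 for k in range(3))
        atoms.append((label, xyz))
    return atoms


def n_atom_lines(nat, nsat):
    """Number of coordinate lines implied by NAT + NSAT (three atoms per line)."""
    return math.ceil((nat + nsat) / 3)


sample = " ".join(f"{lab:<5}{x:7d}{y:7d}{z:7d}" for lab, x, y, z in [
    ("CL1", -33550, 9980, 10610),
    ("CL2", -64070, -30840, 32700),
    ("C1", -47750, 3880, 23070),
])
atoms = parse_atom_line(sample)
print(atoms)
print(n_atom_lines(35, 0), n_atom_lines(29, 29))
```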

Record Type 7: Crystallographic Connection Table

The format may take one of the following forms:

In each case the record contains a sequence of integers. Their ordering and meaning are illustrated by the following example:

Consider the 12-atom structure:

The atom numbers are the sequence numbers in the total list of atoms and symmetry-related atoms (record 6).

The connectivity matrix for the above structure could be represented as follows:

	FROM	TO

	1	10
	2	3 4 8 9
	3	2
	4	2 5 12
	5	4 6
	6	5 7 9
	7	6 8
	8	2 7
	9	2 6 10 11
	10	9 1
	11	9
	12	4

This is a redundant table since each connection occurs twice.

We can code the connections in the first column of the TO list by implying the FROM atom. Thus:

	FROM	  1  2  3  4  5  6  7  8  9 10 11 12
	TO	 10  3  0  2  4  5  6  2  2  9  9  4

The bond 3-0 is a "dummy" bond; atom 3 has only one connection and it has already been specified, viz. 2-3.

Two bonds have not been accounted for viz. 6-9 and 7-8. We code these by simply adding these 2 pairs of integers to the above sequence. Thus the full sequence of integers is: 10 3 0 2 4 5 6 2 2 9 9 4 6 9 7 8

In general, for an N-atom structure (atoms and symmetry-related atoms) interpret the first N integers as having implied FROM atoms and treat any further integers as FROM-TO pairs.
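
The decoding rule can be sketched in Python as follows. The function name decode_connections is an assumption for this illustration; the input is the 16-integer sequence derived above for the 12-atom structure.

```python
def decode_connections(ints, n_atoms):
    """Expand a record type 7 integer sequence into a list of (from, to) bonds."""
    bonds = []
    # First N integers: the FROM atom is implied by the position in the list.
    for frm, to in enumerate(ints[:n_atoms], start=1):
        if to != 0:                         # 0 marks a "dummy" bond
            bonds.append((frm, to))
    # Remaining integers: explicit FROM-TO pairs.
    extra = ints[n_atoms:]
    for i in range(0, len(extra) - 1, 2):
        bonds.append((extra[i], extra[i + 1]))
    return bonds


bonds = decode_connections([10, 3, 0, 2, 4, 5, 6, 2, 2, 9, 9, 4, 6, 9, 7, 8], 12)
print(len(bonds), bonds)
```

For the example this yields 13 bonds, which matches the 26 entries of the redundant table, where each connection occurs twice.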

Example 1. Reference code: AABHTZ

#AABHTZ  13770506       18  9  0  0  0  1  5 35  0  0 37132100000000000000000076
 11372 10272  7359 10875  7107  9616333222 9 5 9 6 4 8  0  0  2P-1       240
R=0.0410
211 0121 0112 0
C  68H  23O  68N  68CL 99
CL1   -33550   9980  10610 CL2   -64070 -30840  32700 C1    -47750   3880  23070
C2    -57270  13370  34240 C3    -68450   9160  44830 C4    -70210  -4450  44470
C5    -60690 -13870  32950 C6    -49040 -10070  21810 C7    -38440 -19380   8860
N8    -36680 -29770  13510 N9    -26290 -38000    990 C10   -18570 -36530 -17520
N11   -21960 -38590 -33470 N12   -11480 -36040 -48430 C13    -2540 -32560 -40600
N14    -6420 -32680 -21040 N15      230 -28370  -7410 C16    -3050 -16060   7440
C17     4050 -12440  22150 O18   -11020  -8830   7930 C19   -23940 -48600   7180
C20   -33070 -50930  25480 O21   -14540 -55420  -2960 H2    -55800  23200  34700
H3    -75200  15700  53100 H4    -78400  -7800  53000 H7    -32600 -17500  -3200
H13     5700 -29600 -46900 H15     4600 -34000  -5700 H171    8100  -3600  21700
H172   10500 -19800  18900 H173    -600 -10700  33300 H201  -31300 -59800  27500
H202  -32900 -45100  37400 H203  -41300 -52500  24300
 3 7 4 5 6 7 8 3 8 910111213141216171818112121 4 5 6 915171919192222221516

Notes:

Record type 1 (Directory Information) corresponds to the following values of items:

SYS	1	CAT	3	ADAT	770506	NCARDS 	18	NRFAC	9
NREM	0	NDIS	0	NERR	0	NOPR	1	NRAD	5
NAT	35	NSAT	0	NBND	0	NCON	37	CELL	1
INTF	3	ATFOR	2	CENT	1	ERR	0	RPA	0
TD	0	PD	0	CBL	0	AS	0	POL	0
YEAR	76

Record type 2 (Unit Cell)

CELLD	11372	10272	7359	10875	7107	9616
PREC	333222
CESD	9	5	9	6	4	8
DENS	0  0	
NSPG	2	
SPG	P-1	
Z	2	
ITOL	40

Thus cell parameters are:

11.372(9)	10.272(5)	7.359(9)	108.75(6)	71.07(4)	96.16(8)

Record type 3 (Text Information)

R=0.0410

Note that in record 1 NRFAC = 9, i.e. the 8 characters of the R-factor text plus one blank.

Record type 4 (Symmetry Positions)

One symmetry position: 211 0121 0112 0 (the identity x, y, z)

Record type 5 (Radius Values)

C  68H  23O  68N  68CL 99

This indicates radii (in Å) for 5 elements: C 0.68 H 0.23 O 0.68 N 0.68 CL 0.99

Record type 6 (Atomic Coordinates)

Coordinates for 35 atoms are recorded. Each fractional coordinate has been multiplied by 10^5.

Record type 7 (Crystallographic Connection Table)

37 crystallographic connectivity integers are recorded. Since NAT + NSAT < 100 these are in format (40I2).

Example 2. Reference code: AACFAZ10

#AACFAZ1033831006       27  9  0  0  0  4  5 29 29  0 64132100000000000000000083
 20132  6162 19954    90    90    90333000 4 1 3 0 0 0141142 60Pbcn      440
R=0.0720
211 0121 0112 0011 6121 6112 0211 0101 0112 6011 6101 6112 6
C  68H  23O  68N  68CL 99
CL1    47060 102730  33580 C1     41150  82580  34520 C2     37970  79280  40600
C3     33310  63010  41020 C4     31860  49900  35580 C5     35250  53030  29640
C6     39960  69530  29050 C7     39690  93120  46620 C8     39710  81720  53280
C9     44000  63090  54410 N10    48000  58940  49490 C11    43510  50330  60740
C12    35500  92050  57370 C13    32330 110060  53700 C14    29810  99400  67690
O1     34430  87220  63850 O2     28360 123460  55520 O3     34650 110000  47370
H3     30700  60000  45500 H4     28200  38400  35800 H5     34300  43000  25700
H6     42800  72500  24500 H7     44100 100900  45800 H111   45500  35900  60400
H112   44100  57700  65100 H113   39400  50000  62500 H141   29700  94100  72500
H142   30900 115100  68400 H143   25300 100900  65300 N10D   52000  41060  50510
C9D    56000  36910  45590 C8D    60290  18280  46720 C11D   56490  49670  39260
C7D    60310   6880  53380 C12D   64500   7950  42630 H111D  54500  64100  39600
H112D  55900  42300  34900 H113D  60600  50000  37500 C2D    62030  20720  59400
O3D    65350 -10000  52630 H7D    55900   -900  54200 C13D   67670 -10060  46300
O1D    65570  12780  36150 C1D    58850  17420  65480 C3D    66690  36990  58980
O2D    71640 -23460  44480 C14D   70190    600  32310 CL1D   52940  -2730  66420
C6D    60040  30470  70950 C4D    68140  50100  64420 H3D    69300  40000  54500
H141D  70300   5900  27500 H142D  69100 -15100  31600 H143D  74700   -900  34700
C5D    64750  46970  70360 H6D    57200  27500  75500 H4D    71800  61600  64200
H5D    65700  57000  74300
 2 3 4 5 6 7 2 3 8 91010 913161314 8 4 5 6 7 81212121515151130313132323333333434
343535393942434444454547474749495055141840425055

Notes:

Record type 4 (Symmetry Positions)

Orthorhombic cell with space group Pbcn. A centre of symmetry at the origin is present (CENT = 1) so only 4 general positions are given (NOPR = 4):

	211 0121 0112 0	x,	y,	z
	011 6121 6112 0	1/2-x,	1/2+y,	z
	211 0101 0112 6	x,	-y,	1/2+z
	011 6101 6112 6	1/2-x,	1/2-y,	1/2+z

The remaining 4 operators are generated by applying the centre of symmetry operator to this set of 4, giving a total of 8 operators.

Record type 6 (Atomic Coordinates)

The total number of atoms is 58, consisting of NAT (=29) reference atoms plus NSAT (=29) symmetry-related atoms. The latter begin with label N10D and were generated with symmetry operator 5 in the expanded list, i.e. symmetry operator -x, -y, -z.

COOR format specification

Output format as generated by the OUTPUT COORD instruction in QUEST and GSTAT. Two formats are possible, one for output of fractional coordinates and the second for output of orthogonal Cartesian coordinates referred to any origin or axial system.

RPluto outputs these formats via the Coor menu button and reads them as free-format input.

(a) Fractional coordinates (4 record types)

HEADER:	FORMAT (A8,'**FRAG**',I8)
CELL:	FORMAT ('CELL',4X,6F8.3)
SYMM:	FORMAT ('SYMM',4X,3(3F4.0,1X,F7.5,1X))
ATOM:	FORMAT (A6,4X,3F10.5)

Example:

CORAMA  **FRAG**       1
CELL      11.858  13.928   5.572  90.000  90.000  90.000
SYMM      1.  0.  0. 0.00000   0.  1.  0. 0.00000   0.  0.  1. 0.00000
SYMM      1.  0.  0. 0.50000   0. -1.  0. 0.50000   0.  0. -1. 0.00000
SYMM     -1.  0.  0. 0.00000   0.  1.  0. 0.50000   0.  0. -1. 0.50000
SYMM     -1.  0.  0. 0.50000   0. -1.  0. 0.00000   0.  0.  1. 0.50000
C1          -0.07640   0.09050   0.11980
C2          -0.04630  -0.01420   0.17590
C3          -0.13990   0.00880   0.00680
C6           0.00140   0.14710  -0.03890
O1           0.02150   0.12540  -0.24600
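
A fractional-coordinate block of this kind can be read with a few lines of Python. This is a sketch under stated assumptions: the function name parse_coor and the returned dictionary are hypothetical, only a single entry is handled, and everything except the header is split on whitespace. The header is sliced by column because an 8-character reference code leaves no blank before **FRAG**.

```python
def parse_coor(text):
    """Parse one COOR entry (fractional form) into a dict (hypothetical helper)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0]
    entry = {
        "refcode": header[:8].strip(),      # A8
        "fragment": int(header[16:]),       # I8 after the '**FRAG**' marker
        "cell": None,
        "symm": [],
        "atoms": [],
    }
    for ln in lines[1:]:
        parts = ln.split()
        if parts[0] == "CELL":
            entry["cell"] = [float(v) for v in parts[1:7]]
        elif parts[0] == "SYMM":
            # 12 numbers: R11 R12 R13 T1 R21 R22 R23 T2 R31 R32 R33 T3
            entry["symm"].append([float(v) for v in parts[1:13]])
        else:
            entry["atoms"].append((parts[0], tuple(float(v) for v in parts[1:4])))
    return entry


sample = """CORAMA  **FRAG**       1
CELL      11.858  13.928   5.572  90.000  90.000  90.000
SYMM      1.  0.  0. 0.00000   0.  1.  0. 0.00000   0.  0.  1. 0.00000
C1          -0.07640   0.09050   0.11980
C2          -0.04630  -0.01420   0.17590
"""
entry = parse_coor(sample)
print(entry["refcode"], entry["fragment"], len(entry["atoms"]))
```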

(b) Orthogonal coordinates (2 record types)

HEADER:	FORMAT (A8,'**FRAG**',I8)
ATOM:	FORMAT (A6,4X,3F10.5)

Example:

CORAMA  **FRAG**       1
C1          -0.90595   1.26048   0.66753
C2          -0.54903  -0.19778   0.98011
C3          -1.65893   0.12257   0.03789
C6           0.01660   2.04881  -0.21675
O1           0.25495   1.74657  -1.37071
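
The two CORAMA examples are related by the usual fractional-to-Cartesian transformation, sketched below in Python. The axial convention used here (a along x, b in the xy plane) is an assumption chosen for the illustration; the specification allows any origin or axial system. For the orthorhombic CORAMA cell the conversion reduces to multiplying each fractional coordinate by the corresponding cell length.

```python
import math


def frac_to_cart(cell, frac):
    """Convert fractional to Cartesian coordinates (a along x, b in xy plane)."""
    a, b, c, alpha, beta, gamma = cell
    ca, cb, cg = (math.cos(math.radians(v)) for v in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    # Unit cell volume divided by a*b*c.
    vol = math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)
    u, v, w = frac
    x = a * u + b * cg * v + c * cb * w
    y = b * sg * v + c * (ca - cb * cg) / sg * w
    z = c * vol / sg * w
    return (x, y, z)


cell = (11.858, 13.928, 5.572, 90.0, 90.0, 90.0)
c1 = frac_to_cart(cell, (-0.07640, 0.09050, 0.11980))
print(tuple(round(v, 5) for v in c1))
```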

When coordinates are output for complete molecules, i.e. when there is no FRAG packet in the instruction set, the fragment number in the header record is always zero. This applies to all Coor files output from RPluto.

FREE-format specification

RPluto has retained its original feature of accepting data typed in a free-format, for crystal structures not in the CSD. A file of data entries may be typed in using a text editor and items are selected by entry number as for an FDAT file. Each data entry begins with a TITLE line and finishes with an END line:

	TITLE  New steroid complex 
	CELL   12.341  7.501  25.23  90  105.3  90
	SYMM   x,y,z   *   1/2-x, 1/2+y, 1/2-z
	C1  0.1234  0.5678  -.6543
	. . .
	END

This option also covers non-crystallographic data, where there is no unit cell to enter and the coordinates are expected to be orthogonal, in Å.

Free-format keywords

ATOM

Function: Input an atom site coordinate.

Format: [ATOM] atlab x y z

atlab : an atom label, e.g. C12, Br1

x/y/z : x/y/z-coordinate (fractional or orthogonal)

The ATOM keyword is optional and can therefore be omitted. The program recognises atom cards by checking that the first item is a possible atom label (see below for syntax) and exactly 3 numbers follow. There are no restrictions on the spacing between the numbers, but decimal points should be given.

The atom label must be given as an element symbol followed by a number. The maximum number of characters allowed in a label is 6. Some typical examples are C1, Br20, H121'.

Examples:

C1 .1234 -.4567 1.2341

Fe1 0.0 0.0 0.0
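
The recognition rule for atom cards can be sketched in Python as follows. This is an illustration only and not RPluto's own code: the function name parse_atom_card is hypothetical, and the regular expression merely checks for one or two letters followed by a digit, without validating the letters against a table of element symbols.

```python
import re

# One or two letters, then a digit, then further digits, letters or primes.
LABEL = re.compile(r"^[A-Za-z]{1,2}\d[\dA-Za-z']*$")


def parse_atom_card(line):
    """Return (label, (x, y, z)) if the line looks like an atom card, else None."""
    parts = line.split()
    if parts and parts[0].upper() == "ATOM":    # the keyword is optional
        parts = parts[1:]
    if len(parts) != 4:                         # a label plus exactly 3 numbers
        return None
    label, *numbers = parts
    if len(label) > 6 or not LABEL.match(label):
        return None
    try:
        xyz = tuple(float(n) for n in numbers)
    except ValueError:
        return None
    return label, xyz


print(parse_atom_card("C1 .1234 -.4567 1.2341"))
print(parse_atom_card("ATOM Fe1 0.0 0.0 0.0"))
print(parse_atom_card("CELL 12.123 5.345 17.23 90 106.7 90"))
```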

CELL

Function: Enter cell data.

Format: CELL a b c alpha beta gamma

a : cell axial length a in Å

b : cell axial length b in Å

c : cell axial length c in Å

alpha : cell angle alpha in degrees

beta : cell angle beta in degrees

gamma : cell angle gamma in degrees

The program expects the command CELL to be followed by exactly 6 numbers, with no restrictions on the spacing.

Example:

CELL 12.123 5.345 17.23 90 106.7 90

Note that orthogonal coordinates may be input by giving a cell with dimensions 1 1 1 90 90 90. If no CELL command is given in a free-format data entry the program will assume that orthogonal coordinates follow.

SYMM

Function: Input a symmetry operator as in International Tables for Crystallography.

Format:

SYMM oper1 [*] [oper2] ...

oper1 : first operator coded in style x, y, z

oper2 : optional second operator

Alternative format:

SYMM r11 r12 r13 t1 r21 r22 r23 t2 r31 r32 r33 t3

The complete set of general position operators should be given for the space group. It is important to ensure that the first operator is the identity operator x, y, z.

The operators may be given one at a time on SYMM command lines, or combined on to a single line with the * character as a separator.

Example: for the space group P 21/c

SYMM x,y,z * x,1/2-y,1/2+z * -x,-y,-z * -x,1/2+y,1/2-z
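
To show how operators in this style map on to rotation and translation parts, here is a small Python sketch. The function names are assumptions for this example and the parser is deliberately simple: it handles signed x, y, z terms and fractional or decimal translations, which covers the operators shown in this section.

```python
import re
from fractions import Fraction


def parse_operator(op):
    """Convert an operator such as '1/2-x,1/2+y,z' into (rotation rows, translations)."""
    rows, trans = [], []
    for component in op.replace(" ", "").lower().split(","):
        row, shift = [0, 0, 0], Fraction(0)
        # Split the component into signed terms, e.g. '1/2-y' -> ['1/2', '-y'].
        for term in re.findall(r"[+-]?[^+-]+", component):
            sign = -1 if term[0] == "-" else 1
            body = term.lstrip("+-")
            if body in ("x", "y", "z"):
                row["xyz".index(body)] = sign
            else:
                shift += sign * Fraction(body)
        rows.append(row)
        trans.append(shift)
    return rows, trans


def parse_symm_line(line):
    """Split a SYMM line on '*' and parse each operator."""
    body = line.split(None, 1)[1]
    return [parse_operator(op) for op in body.split("*")]


ops = parse_symm_line("SYMM x,y,z * x,1/2-y,1/2+z * -x,-y,-z * -x,1/2+y,1/2-z")
for rotation, translation in ops:
    print(rotation, [str(t) for t in translation])
```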

In the alternative coding method, symmetry operators are given in the form of rotation and translation matrices which perform the transformations, as in the COOR file:

x' = r11 x + r12 y + r13 z + t1
y' = r21 x + r22 y + r23 z + t2
z' = r31 x + r32 y + r33 z + t3

Example: the following line represents the operator x, -y-1/2, z-1/2 :

SYMM 1. 0. 0. 0.00000 0. -1. 0. -.50000 0. 0. 1. -.50000

i.e. the operator X' = RX + t where R is a rotation matrix:

1.  0.  0. 
0. -1.  0.
0.  0.  1. 

and t is a translation vector:

(0, -0.5, -0.5)

Example: for the space group P212121

SYMM      1.  0.  0. 0.00000   0.  1.  0. 0.00000   0.  0.  1. 0.00000
SYMM      1.  0.  0. 0.50000   0. -1.  0. 0.50000   0.  0. -1. 0.00000
SYMM     -1.  0.  0. 0.00000   0.  1.  0. 0.50000   0.  0. -1. 0.50000
SYMM     -1.  0.  0. 0.50000   0. -1.  0. 0.00000   0.  0.  1. 0.50000

If no SYMM command is given, then the program will only use the identity operator and will be unable to perform a proper calculation of intermolecular distances, unless the SPAC keyword is present. Intramolecular distances will be calculated correctly, provided that atoms are already positioned in the same asymmetric crystallographic unit.

SPACe group

Function: Input standard space group symbol

Format:

SPAC symbol

symbol : space group text symbol

Example: SPAC P21/c

The space group symbol is not used in RPluto unless there are no SYMM lines. In this case, the standard operators for the first setting of the space group stored in the RPluto symmetry table file pluto.sym are used. The setting may be changed within RPluto using the SWI command.

JOIN

Function: Define explicit bonds between two atoms, and prevent connectivity from being built automatically.

Format:

JOIN at1 at2 at3 ... [* at10 at11 at12]

JOIN NONE

an: atom label: these labels define connected chains of atoms

[*]: denotes break between connected chains.

NONE: define no connections between atoms.

The default action on reading free-format input is to calculate the intramolecular connectivity (Calc) using standard radii. The JOIN command overrides this action, and allows bonds to be defined explicitly.
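
The sketch below shows one way to expand a JOIN line into explicit atom pairs. It assumes that consecutive labels within a chain are bonded to each other, which is how "connected chains" is read here; the function name parse_join and the atom labels in the example are hypothetical.

```python
def parse_join(line):
    """Expand 'JOIN a b c * d e' into bonded pairs; 'JOIN NONE' gives no bonds."""
    tokens = line.split()[1:]               # drop the JOIN keyword
    if [t.upper() for t in tokens] == ["NONE"]:
        return []
    bonds, chain = [], []
    for token in tokens + ["*"]:            # trailing '*' flushes the last chain
        if token == "*":
            # Assumed meaning: each atom is bonded to the next one in its chain.
            bonds.extend(zip(chain, chain[1:]))
            chain = []
        else:
            chain.append(token)
    return bonds


print(parse_join("JOIN C1 C2 C3 * O1 C4"))
print(parse_join("JOIN NONE"))
```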

TITLE

Function: Provide a plot title for display on the diagram.

Format: TITLE [text]

[text] : optional string of text, up to 75 characters.

If a TITLE command is given with no text following, the title will be set to blank on the plot. In the input of free-format data, the TITLE command acts as a signal for the start of each entry and is mandatory.

Example: TITLE Steroid number 123

END

Function: Signal the end of a free-format entry.

Format: END

The END command is used to separate multiple entries in free-format input files and is mandatory.
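
Because every entry is bracketed by TITLE and END, a free-format file can be divided into entries with a short loop. The Python sketch below is illustrative; the function name split_entries is an assumption and the sample data are invented for the example.

```python
def split_entries(text):
    """Group the lines of a free-format file into entries (TITLE ... END)."""
    entries, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        keyword = line.split()[0].upper()
        if keyword == "TITLE":              # start of a new entry
            current = [line]
        elif keyword == "END":              # end of the current entry
            if current is not None:
                entries.append(current)
            current = None
        elif current is not None:
            current.append(line)
    return entries


sample = """TITLE First entry
CELL 12.123 5.345 17.23 90 106.7 90
C1 .1234 -.4567 1.2341
END
TITLE Second entry
Fe1 0.0 0.0 0.0
END
"""
entries = split_entries(sample)
print(len(entries), [e[0] for e in entries])
```

Entries can then be selected by their position in the returned list, in the same way that items are selected by entry number.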