Powder Diffraction Indexing Benchmarks

Bethanechol chloride benchmarks

The two ICDD PDF entries 43-1748 and 46-1964 (see UPPW11 and UPPW11-2) are used for 8 tests of these benchmarks,
in all cases using either the first 20 lines or the first 20 lines with I > 5% (I/Imax) :

A- indexing the raw data. A(1) for ICDD PDF entry 43-1748 and A(2) for the 46-1964.
Searching for the zeropoint is not permitted, except through a program's own internal system.
B- indexing the data with I >5% (I/Imax). B(1) and B(2) as above.
Most experienced powder diffractionists try to index on data obtained after removing the small intensity peaks.
C- indexing the data corrected from zeropoint (-0.10°(2-theta)). C(1) and C(2) as above.
D- indexing the data corrected from the zeropoint and on selected peaks with I  > 5% (I/Imax). D(1) and D(2) as above.
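
The relation between the A, B, C and D variants can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used to build the benchmark files: the function names are hypothetical, the cutoff is assumed to be inclusive (I >= 5% of Imax) because the tabulated B(2) list keeps a line with I = 5, and the correction is assumed to be a constant -0.10° added to every 2-theta value, as the tabulated C and D columns suggest.

```python
def select_strong(lines, rel_cutoff=0.05):
    """Keep the (2-theta, I) pairs with I >= rel_cutoff * Imax (assumed inclusive)."""
    imax = max(intensity for _, intensity in lines)
    return [(tt, i) for tt, i in lines if i >= rel_cutoff * imax]


def correct_zeropoint(lines, shift=-0.10):
    """Add a constant shift (in degrees 2-theta) to every peak position."""
    return [(round(tt + shift, 3), i) for tt, i in lines]


# First 12 lines of A(1), copied from the benchmark table (Imax = 100 is among them).
a1_head = [
    (6.238, 2), (6.712, 10), (9.403, 3), (13.171, 10),
    (13.584, 16), (14.483, 4), (14.882, 10), (15.498, 40),
    (16.528, 37), (17.430, 34), (18.534, 4), (18.928, 100),
]

b1_head = select_strong(a1_head)        # test B: weak lines removed
c1_head = correct_zeropoint(a1_head)    # test C: zeropoint corrected
d1_head = correct_zeropoint(b1_head)    # test D: both operations
```

Applied to these 12 lines, the selection keeps the 8 lines that open the B(1) column. In the real benchmark the 20 lines are taken after filtering the whole PDF entry, so Imax refers to the full pattern.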

Two more tests, much easier, are added to these :

E- indexing new high quality conventional X-ray data
F- indexing synchrotron data (X3B1 beamline - thanks to Peter Stephens).

The indexing programs should be applied to the following series of 10 datasets in two modes :
- automated (using default values), and
- manual.

The total number of tests is thus equal to 20.

Indexing benchmarks data (pairs of 2-theta and intensity values) :

    A(1)        A(2)        B(1)        B(2)        C(1)        C(2)        D(1)       D(2)         E(3)       F(4)
2-thet   I  2-thet   I  2-thet   I  2-thet   I  2-thet   I  2-thet   I  2-thet   I  2-thet   I  2-thet   I  2-thet   I
6.238    2   6.690   2   6.712  10  13.601   5   6.138   2   6.590   2   6.612  10  13.501   5  10.765  58   4.887   6
6.712   10   9.417   1  13.171  10  14.757   6   6.612  10   9.317   1  13.071  10  14.657   6  13.522  27   6.139  33
9.403    3  10.849   1  13.584  16  15.492  29   9.303   3  10.749   1  13.484  16  15.392  29  14.690 122   6.664  26
13.171  10  13.135   2  14.882  10  16.463  24  13.071  10  13.035   2  14.782  10  16.363  24  15.398 101   6.989 101
13.584  16  13.601   5  15.498  40  17.419  20  13.484  16  13.501   5  15.398  40  17.319  20  16.336  65   7.403  56
14.483   4  14.757   6  16.528  37  18.925  58  14.383   4  14.657   6  16.428  37  18.825  58  16.453  91   7.460  79
14.882  10  15.492  29  17.430  34  19.730  11  14.782  10  15.392  29  17.330  34  19.630  11  17.312 106   7.850  82
15.498  40  16.463  24  18.928 100  20.131  63  15.398  40  16.363  24  18.828 100  20.031  63  18.828 405   8.531 225
16.528  37  17.419  20  19.808  12  20.841 100  16.428  37  17.319  20  19.708  12  20.741 100  19.699  60   8.921  30
17.430  34  18.925  58  20.148  60  22.508  23  17.330  34  18.825  58  20.048  60  22.408  23  20.031 500   9.062  47
18.534   4  19.730  11  20.852  90  23.224   6  18.434   4  19.630  11  20.752  90  23.124   6  20.752 500   9.251 180
18.928 100  20.131  63  22.539  22  23.614  13  18.828 100  20.031  63  22.439  22  23.514  13  21.641 141   9.386 304
19.808  12  20.841 100  23.680  13  23.979  20  19.708  12  20.741 100  23.580  13  23.879  20  22.414 173   9.781  11
20.148  60  21.722   3  23.982  18  25.085  56  20.048  60  21.622   3  23.882  18  24.985  56  22.788  34  10.136  93
20.852  90  22.508  23  25.088  56  25.650  12  20.752  90  22.408  23  24.988  56  25.550  12  23.152  40  10.300   8
21.852   2  23.224   6  25.844  20  26.796  15  21.752   2  23.124   6  25.744  20  26.696  15  23.528  72  10.468  21
22.331   3  23.614  13  26.556  12  27.407  75  22.231   3  23.514  13  26.456  12  27.307  75  23.872 354  10.634  28
22.539  22  23.979  20  27.412  78  28.262  19  22.439  22  23.879  20  27.312  78  28.162  19  24.970 327  10.783  62
23.252   4  25.085  56  28.272  24  29.724  19  23.152   4  24.985  56  28.172  24  29.624  19  25.365  34  11.280 160
23.680  13  25.650  12  29.786  14  31.220  20  23.580  13  25.550  12  29.686  14  31.120  20  25.549  47  11.450  16
(1) ICDD PDF entry 43-1748, use wavelength = 1.5418 Å
(2) ICDD PDF entry 46-1964, use wavelength = 1.5418 Å
(3) conventional X-ray data, use wavelength = 1.54056 Å
(4) synchrotron data, use wavelength = 0.6995 Å (thanks to Peter Stephens)
You may either download individual test files (click on the A(1), A(2) etc above and 
you will obtain a .txt ASCII file), or download the ten tests in benchmarks.zip.
Do not forget to apply these ten tests in both default and manual modes - if possible...
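
Because the datasets were recorded at different wavelengths, peak positions are only comparable once converted to d-spacings. The sketch below applies Bragg's law, d = lambda / (2 sin(theta)), with the wavelengths quoted in the footnotes; the function and dictionary names are hypothetical.

```python
import math

# Wavelengths (in Å) quoted in the table footnotes.
WAVELENGTHS = {"A-D": 1.5418, "E": 1.54056, "F": 0.6995}


def d_spacing(two_theta_deg, wavelength):
    """Bragg's law: d = lambda / (2 sin(theta)), theta being half of 2-theta."""
    theta = math.radians(two_theta_deg / 2.0)
    return wavelength / (2.0 * math.sin(theta))


# First line of the E and F datasets, taken from the table.
d_first_e = d_spacing(10.765, WAVELENGTHS["E"])
d_first_f = d_spacing(4.887, WAVELENGTHS["F"])
print(f"E: d = {d_first_e:.3f} Å, F: d = {d_first_f:.3f} Å")
```

Both first lines come out close to 8.2 Å, which suggests that they are the same reflection measured at two wavelengths.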
Comments :

Why 20 lines ? Because "There are few exceptions from the rule that, if all of the first 20 lines are indexed and M20 >10, the indexing is physically reliable."
See: Werner, P.-E.: Autoindexing. In: Structure Determination from Powder Diffraction Data (Eds. W.I.F. David, K. Shankland, L.B. McCusker, Ch. Baerlocher), p. 118-135. Oxford Science Publications 2002.
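
For reference, the de Wolff figure of merit is usually written M20 = Q20 / (2 <|dQ|> N20), where Q = 1/d2, <|dQ|> is the mean absolute discrepancy between observed and calculated Q values for the indexed lines, and N20 is the number of distinct calculated lines up to Q20. A minimal sketch under that definition follows; the function name, argument names and input numbers are made up for illustration and are not benchmark data.

```python
def de_wolff_m(q_obs, q_calc, n_possible):
    """de Wolff figure of merit M_N = Q_N / (2 * mean|Q_obs - Q_calc| * N_possible).

    q_obs and q_calc are matched lists of observed and calculated Q values
    for the N indexed lines; n_possible is the number of distinct calculated
    lines up to the Q of the last observed line.
    """
    if len(q_obs) != len(q_calc) or not q_obs:
        raise ValueError("q_obs and q_calc must be non-empty and of equal length")
    mean_dq = sum(abs(o - c) for o, c in zip(q_obs, q_calc)) / len(q_obs)
    return max(q_obs) / (2.0 * mean_dq * n_possible)


# Illustrative values only (three lines instead of twenty).
m = de_wolff_m([0.01, 0.02, 0.04], [0.0101, 0.0199, 0.0401], n_possible=4)
```

A small mean discrepancy and few unobserved calculated lines both raise M, which is why a value above 10 with all 20 lines indexed is taken as a sign of a reliable cell.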

Some indexing programs expect 20 lines to be the minimum.

There are 8 impurity lines among the first 26 lines in the PDF entry 43-1748, and 3 impurity lines among the first 35 lines in the PDF entry 46-1964. Moreover, both patterns have a surprisingly large zeropoint error, close to -0.10° (2-theta). The difficulty level thus decreases from test A to test F. The application with default values should preferably be done in all crystal symmetries, or at least with maximum cell parameters of 20 Å and Vmax = 2000 Å3 in monoclinic symmetry (these conditions probably correspond to more than 50% of the crystal structures stored in the ICSD and CSD databases). The application in manual mode should be restricted to a monoclinic search in the 800-1200 Å3 volume range, with 5-20 Å cell parameters.

If a program is expected to be used successfully by inexperienced users, it clearly should offer an automated/default/black-box mode.

The best FoM was reported from the use of the synchrotron data : M(20) = 197,  F20 = 1080 (0.0006, 32), the cell being monoclinic with a = 8.875 (Å), b = 16.408 (Å), c = 7.137 (Å), and beta = 93.84 (°), V = 1036.9 (Å3), space group P21/n.
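
The numbers quoted above can be cross-checked with a short script. The sketch below uses hypothetical function names and the standard monoclinic formulas. It recomputes the cell volume from a, b, c and beta, predicts the 2-theta position of the 020 reflection at the synchrotron wavelength, and evaluates the Smith-Snyder figure of merit F_N = N / (<|delta 2-theta|> * N_poss) from the rounded values given in parentheses.

```python
import math

A, B, C, BETA = 8.875, 16.408, 7.137, 93.84   # Å, Å, Å, degrees (from the text)


def monoclinic_volume(a, b, c, beta_deg):
    """V = a * b * c * sin(beta) for a monoclinic cell."""
    return a * b * c * math.sin(math.radians(beta_deg))


def monoclinic_d(h, k, l, a, b, c, beta_deg):
    """d-spacing of reflection hkl in a monoclinic cell (b unique axis)."""
    beta = math.radians(beta_deg)
    inv_d2 = ((h / a) ** 2 + (l / c) ** 2
              - 2.0 * h * l * math.cos(beta) / (a * c)) / math.sin(beta) ** 2
    inv_d2 += (k / b) ** 2
    return 1.0 / math.sqrt(inv_d2)


def two_theta(d, wavelength):
    """Inverse Bragg's law, result in degrees."""
    return 2.0 * math.degrees(math.asin(wavelength / (2.0 * d)))


def smith_snyder_f(n_obs, mean_abs_delta_2theta, n_possible):
    """Smith-Snyder F_N = (1 / <|delta 2-theta|>) * (N / N_poss)."""
    return n_obs / (mean_abs_delta_2theta * n_possible)


volume = monoclinic_volume(A, B, C, BETA)
tt_020 = two_theta(monoclinic_d(0, 2, 0, A, B, C, BETA), 0.6995)
f20 = smith_snyder_f(20, 0.0006, 32)
print(f"V = {volume:.1f} Å3, 2-theta(020) = {tt_020:.3f}°, F20 = {f20:.0f}")
```

The volume agrees with the quoted 1036.9 Å3 to within 0.1 Å3, and the calculated 020 position is very close to the first line of the F dataset (4.887°), assuming that line is indeed 020. F20 recomputed from the rounded 0.0006 gives about 1042 rather than 1080, presumably because the mean discrepancy is quoted to one significant figure only.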

Notation system

Answers to these benchmarks are to be given in a simple way. Send the input and output files. Notation is either -1 or 0 or 1 :

The 1 point note for the A, B, C, D, E or F tests means that the correct cell was found in first position, ranked by FoM, among the proposals. Such a 1 point note means that the program can sometimes produce good results in inexperienced hands. A 1V note means that the correct solution is found in first position in a list of cells sorted by increasing volume.

The zero point note means that the correct cell is mixed with incorrect ones and is not at the head of the list. In that case, an expert may still have a chance to locate the correct cell among the garbage if it is listed among the first ten, but this requires much more work. And sometimes, the correct cell is only listed among the first 100 or 1000...! The position of the true solution in the list is given as a subscript : 06 means that it was the sixth cell proposal. The position has to be within the first ten; otherwise a -1 point note is given.

The -1 point note means that the correct cell was not found at all, or at a position larger than 10 in the lists.
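
The three notes can be summarized as a small function of the position of the correct cell in the output list. This is a sketch with hypothetical names; it assumes that positions 2 to 10 inclusive earn a zero, and it ignores the distinction between a FoM-sorted and a volume-sorted list (the 1V case).

```python
def note_for_rank(rank, max_rank=10):
    """Return +1, 0 or -1 from the 1-based position of the correct cell.

    rank is None when the correct cell is not proposed at all.
    """
    if rank is None or rank > max_rank:
        return -1
    return 1 if rank == 1 else 0


def note_label(rank, max_rank=10):
    """Label as used in the results table: '+1', a two-digit position, or '-1'."""
    note = note_for_rank(rank, max_rank)
    if note == 1:
        return "+1"
    if note == 0:
        return f"{rank:02d}"
    return "-1"


for position in (1, 6, 37, None):
    print(position, note_for_rank(position), note_label(position))
```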

The results are reported below. A problem however is that programs using the raw powder diffraction pattern (like EFLECH/INDEX, GAIN) cannot be tested against these benchmarks.

Though there is no flat-cell problem in these selected benchmarks, some programs may also encounter difficulties because the bethanechol chloride b parameter is considerably larger than a and c.

The global note is obtained by adding the 20 results (1, 0 or -1), so that it will be in the range -20 to +20. Having a global note > 0 is desirable.
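
As an illustration of the summation, the sketch below turns one row of the results table into its default, manual and global notes. The helper names are hypothetical; the two rows are copied from the McMaille and DICVOL04a lines of the table, with +1V counted as +1 and a two-digit position counted as 0.

```python
def note_value(token):
    """Numeric value of a table entry: '+1'/'+1V' -> 1, '-1' -> -1, position -> 0."""
    if token.startswith("+1"):
        return 1
    if token == "-1":
        return -1
    return 0  # a two-digit position such as '06'


def global_note(tokens):
    """Return (default, manual, global) from 20 entries ordered def, man, def, man..."""
    values = [note_value(t) for t in tokens]
    default = sum(values[0::2])
    manual = sum(values[1::2])
    return default, manual, default + manual


mcmaille = "-1 06 +1V +1 +1V +1 -1 +1 -1 -1 -1 +1 -1 +1 -1 +1 +1 +1 +1 +1".split()
dicvol04a = "-1 -1 -1 -1 -1 02 -1 +1 -1 -1 -1 +1 -1 -1 -1 +1 +1 +1 +1 +1".split()

print(global_note(mcmaille))    # (-2, 7, 5)
print(global_note(dicvol04a))   # (-6, 1, -5)
```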

Note that if you can obtain a better result with a given indexing program, please send the data and output files to alb@cristal.org so that these results can be updated - thanks !

You may also propose other benchmarks.

Hyperlinks are given (click on the note 0 or +1 or +1V, no link for -1) to text files containing 
the operator name, the input data file and the output file.
def = default run (or automated mode, or black-box mode, whatever); 
man = manual run (the user selects more appropriate conditions than the default ones).
             A(1)    A(2)    B(1)    B(2)    C(1)    C(2)    D(1)    D(2)    E(3)    F(4)   Note    Global  Time for
Program    def-man def-man def-man def-man def-man def-man def-man def-man def-man def-man def-man   note   one test
ITO13      -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  +1  +1  +1  -8  -6    -14    VF
DICVOL91   -1  -1  -1  -1  -1  -1  -1  +1  -1  -1  -1  -1  -1  -1  -1  +1  +1  +1  +1  +1  -6  -2     -8     F
DICVOL04a  -1  -1  -1  -1  -1   02 -1  +1  -1  -1  -1  +1  -1  -1  -1  +1  +1  +1  +1  +1  -6  +1     -5     F
TREOR90    -1  -1  -1  -1  -1  -1  +1  +1  -1  -1  -1  -1  -1  -1  +1  +1  +1  +1  +1  +1  -2  -2     -4     F
DICVOL04b  -1  +1  -1  +1  -1  +1  -1  +1  -1   07 -1  +1  -1  +1  -1  +1  +1  +1  +1  +1  -6  +9     +3     F
McMaille   -1   06 +1V +1  +1V +1  -1  +1  -1  -1  -1  +1  -1  +1  -1  +1  +1  +1  +1  +1  -2  +7     +5    VS
Combine$   -1  +1  +1V +1  +1V +1  +1  +1  -1  07  -1  +1  -1  +1  +1  +1  +1  +1  +1  +1  +2  +9     +11
Combine£   -1  +1  +1V +1  +1V +1  +1  +1  -1  +1  -1  +1  -1  +1  +1  +1  +1  +1  +1  +1  +2 +10     +12
CRYSFIRE*  -1  +1  -1  +1  -1  +1  -1  +1  -1  +1  -1  +1  -1  +1  +1  +1  +1  +1  +1  +1  -4 +10     +6 VF to VS
(1) ICDD PDF entry 43-1748
(2) ICDD PDF entry 46-1964
A raw data
B weak intensity lines removed
C zeropoint corrected
D weak intensity lines removed and zeropoint corrected
E(3) high resolution new conventional X-ray data
F(4) Synchrotron data
DICVOL04a : fast default and manual tests made by A. Le Bail - without enough experience...
DICVOL04b : manual tests made by D. Louër
($) Combining the individual best results of ITO13, DICVOL91, DICVOL04, TREOR90 and McMaille
(£) Combining all best results
(*) for CRYSFIRE, this is the summary of the best results. See Zkbm04.html for details;
    links to the details of each test are at the bottom-left of that page.
    Note that the benchmark rule (not modifying the data) was sometimes broken: for
    instance, TAUP appears to be able to solve the C1 case in manual mode, but this was
    obtained from a selection of the 10 most intense lines instead of the 20 in the full
    set. So the global note for CRYSFIRE would be different if only the results obtained
    without modification of the benchmark data were considered (see the discussion on the
    SDPD Mailing list).
Speed : VF = Very Fast (< second) 
        F  = Fast (< few seconds)
        S  = Slow (< minutes or few minutes) 
        VS = Very Slow (< hour or few hours)
Special conditions of the benchmark runs with the various programs :

TREOR90
Default mode: typical conditions:
Manual mode: typical conditions:
 CEM=20.0,D2=0.0006,D1=0.0003,     (or D2=0.0002,D1=0.0001 for the synchrotron data)
Comment: local TREOR90 version, the last line is a zeropoint.

ITO13
Default mode: typical conditions:
all default values (blank line except for the wavelength and zeropoint when known):
Manual mode: typical conditions:
9009 1 1 1             0.6995                                         3.00       (TOLG tried with values 1, 2, 3, 4, 5, and 6)
unindexed tolerated lines : 10
Comment: the successful test E(2) in manual mode is obtained only with TOLG=1.00, not with 2,3,4,5,6 !

DICVOL91
Default mode: typical first 5 lines in the entry data :
ICDD 43-1748 Benchmark A-1
20 2 1 1 1 1 1 0
0. 0. 0. 0. 0. 0. 0.
1.5418 0. 0. 0.
0. 0. 0.
Manual mode: typical first 5 lines in the entry data :
ICDD 43-1748 Benchmark A-1
20 2 0 0 0 0 1 0
20. 20. 20. 800. 1200. 90. 120.
1.5418 0. 0. 0.
0.05 0. 0.
Comment: The peak position error tolerance is increased to 0.05 instead of the default value, exploration is made only in monoclinic in a restricted cell volume range. That version has no tolerance at all for impurity lines, this explains the negative global note.

DICVOL04
Default mode: typical first 5 lines in the entry data :
ICDD 43-1748 Benchmark A-1
20 2 1 1 1 1 1 0
0. 0. 0. 0. 0. 0. 0.
1.5418 0. 0. 0.
0. 0. 0 0 1
Manual mode: typical first 5 lines in the entry data (2 tests made with different line 5 : with or without impurity line tolerance and zero search):
ICDD 43-1748 Benchmark A-1
20 2 0 0 0 0 1 0
20. 20. 20. 800. 1200. 90. 120.
1.5418 0. 0. 0.
0.05 0. 0 0 1     or     0.05 0. -8 1 1   or    0.035 0. 4 0 1   or 0.010 0. 5 0 1 etc
Comment: The peak position error tolerance is increased to 0.035 to 0.050 instead of the default value (or reduced to 0.010), exploration is made only in monoclinic in a restricted cell volume range. That version has tolerance for impurity lines, and may search for a zeropoint.

McMaille
Default mode: at most 3 unindexed lines are tolerated (yes, it was only 2 in the previous version of McMaille, but benchmarks are made for improving software, don't you think so ?). All symmetries are examined (in fact the run was stopped before triclinic, which was not examined with TREOR and DICVOL either). Maximum cell parameters and volumes are respectively 20 Å, 2000 Å3 (monoclinic) and 1000 Å3 (triclinic). There is no internal system able to find the zeropoint. The data preceding the list of 2-theta and I values are reduced to the 2 following lines (first line : title; second line : the wavelength, the zeroshift and the code=3 corresponding to the automated black-box mode) :
ICDD 46-1964  Indexing Benchmark A-2
1.5418 0.000 3
Manual mode: 8 unindexed lines tolerated, successive runs, the solution being found in a monoclinic run with volume range 800-1200 Å3, and 5-20 Å cell parameters, W = either 0.3 or 0.5 (line Width). Typical conditions for a run in manual mode :
ICDD 43-1748 Benchmark A-1     Manual mode
! Wavelength, zeropoint, Ngrid
 1.541800 0.0000 0
! Codes for symmetry
 0 0 0 0 1 0
! W, Nind
0.300  8
!Pmin, Pmax, Vmin, Vmax, Rmin, Rmax, Rmaxref
  5. 20. 800. 1200. 0.02 0.25 0.50
! Ntests, Nruns
 2000000 20
!  2-theta   Intensity
   6.238000       2.000000
   6.712000       10.00000
Comment: why not use these manual mode conditions as the default mode ? Because the calculation time is really much longer ! A problem with Monte Carlo programs is that different runs may not give the results in the same order, so you may have to make several runs to obtain similar results... As soon as computer speed has increased by a factor 1000 (will that take the next 15 years ??), this McMaille program will be more comfortable ;-)
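
Preparing the same manual-mode input for each of the ten datasets is easily scripted. The sketch below only reproduces the layout of the example listed above: the function name and its arguments are hypothetical, the symmetry-code line and the Rmin, Rmax, Rmaxref values are copied verbatim from that example, and the column widths have not been checked against the program's own reader.

```python
def mcmaille_manual_input(title, wavelength, zeropoint, peaks,
                          w=0.3, nind=8, pmin=5.0, pmax=20.0,
                          vmin=800.0, vmax=1200.0,
                          ntests=2000000, nruns=20):
    """Return input text laid out like the manual-mode example in this page."""
    lines = [
        title,
        "! Wavelength, zeropoint, Ngrid",
        f" {wavelength:.6f} {zeropoint:.4f} 0",
        "! Codes for symmetry",
        " 0 0 0 0 1 0",                       # copied verbatim from the example
        "! W, Nind",
        f"{w:.3f}  {nind}",
        "!Pmin, Pmax, Vmin, Vmax, Rmin, Rmax, Rmaxref",
        f"  {pmin:.1f} {pmax:.1f} {vmin:.1f} {vmax:.1f} 0.02 0.25 0.50",
        "! Ntests, Nruns",
        f" {ntests} {nruns}",
        "!  2-theta   Intensity",
    ]
    # One line per peak: 2-theta then intensity.
    lines += [f"{tt:11.6f}{inten:15.6f}" for tt, inten in peaks]
    return "\n".join(lines) + "\n"


text = mcmaille_manual_input(
    "ICDD 43-1748 Benchmark A-1     Manual mode",
    1.5418, 0.0, [(6.238, 2), (6.712, 10)],
)
print(text)
```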

Crysfire
Default mode : Crysfire's "all default" route provides defaults for the various indexing programs as described in its user guide. Typically Vmax is 6000 Å3 and, where applicable, the number of unindexed lines is set to zero.
Manual mode: Interventions were limited to those available from Crysfire's interactive commands and menus. EDit and Strip were used to exclude the 10 and 4 weakest lines from A1 and A2 respectively. Where applicable, Vmax =1200 and unindexed line = 1. D2Theta was set to 0.05 or 0.06, if not already the default. The basis sets for LOSH, Mmap and Hmap used in datasets D2, E and F (already indexed by default runs of KOHL, DICVOL, etc) were taken from previous runs of LZON or KOHL.


As expected, with extremely good data (high accuracy, no impurity line, benchmarks E and F) most programs produce the solution. But when the data present some problems, the indexing programs are clearly not equal.

What remains to be done is to harmonize the default and manual conditions under which the benchmarks are applied with the individual programs ITO, TREOR, DICVOL91 and McMaille (giving a score of +9 if combined) and with the same programs used inside Crysfire (leading to a score of +6). The Crysfire defaults for these programs supersede their own defaults.

Created : February 2004.
Updated : October 2004.