SDPD Internet Course


Week 8

Structure solution by Patterson, direct or molecule location methods, part 3 :
Molecule location.


Lectures

Chemists often know which molecule is present in a crystal, for instance from NMR of a sample in solution, or because they know what they added to an already known molecule during a new synthesis. Then, knowing the cell parameters and space group, the structure determination may consist in locating the molecule, or a known fragment of it, inside the cell. There may also be additional heavy atoms (Cl or S) or small molecules (such as water) which have to be located simultaneously if the known fragment does not represent a large enough fraction of the cell content (80% may not lead to R factors below 30%). Many programs have been written that rotate and translate the model in the cell until its correct position is found. When dealing with powder data, the problem is complicated by reflection overlap. Besides brute force using a systematic grid-search approach, more elegant and efficient methods apply Monte Carlo/simulated annealing or genetic algorithms in order to find the best molecule position. The most elaborate programs can cope with several fragments simultaneously, and can also explore torsion angles, which may differ from those of the starting model.
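The Monte Carlo/simulated annealing idea can be illustrated with a minimal sketch. It is not the algorithm of any program named in this course: the function names (anneal, toy_cost), the step sizes and the cooling schedule are all assumptions chosen for illustration, and the cost used here is a toy function with a hidden target position, standing in for a real agreement factor computed from diffraction data.

```python
import math
import random

def anneal(cost, n_steps=20000, t_start=1.0, t_end=1e-3, seed=0):
    """Metropolis search over 3 rotation angles (radians) and 3 fractional shifts."""
    rng = random.Random(seed)
    state = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(3)]
    state += [rng.random() for _ in range(3)]
    current = cost(state)
    best_state, best = list(state), current
    for step in range(n_steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (step / max(1, n_steps - 1))
        trial = list(state)
        k = rng.randrange(6)  # perturb one parameter at a time
        if k < 3:
            trial[k] = (trial[k] + rng.gauss(0.0, 0.3)) % (2.0 * math.pi)
        else:
            trial[k] = (trial[k] + rng.gauss(0.0, 0.05)) % 1.0
        c = cost(trial)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if c <= current or rng.random() < math.exp((current - c) / t):
            state, current = trial, c
            if c < best:
                best_state, best = list(trial), c
    return best_state, best

# Hypothetical "true" orientation and position, used only by the toy cost.
TARGET = [1.0, 2.0, 3.0, 0.2, 0.5, 0.8]

def toy_cost(state):
    """Periodic misfit to TARGET; a real program would return an R factor here."""
    rot = sum(1.0 - math.cos(s - t) for s, t in zip(state[:3], TARGET[:3]))
    tra = sum(1.0 - math.cos(2.0 * math.pi * (s - t))
              for s, t in zip(state[3:], TARGET[3:]))
    return rot + tra

best_state, best_cost = anneal(toy_cost)
print("best cost:", best_cost)
```

In a real search, the cost would be evaluated from the diffraction data for each trial position, and the temperature schedule and step sizes would need tuning to the problem.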

Many lectures were given on that subject at IUCr XVIII, Glasgow, 1999. However, none is online. By courtesy of Yuri Andreev, you may have access to his very recent PowerPoint material corresponding to his EPDIC8 (Uppsala, Sweden, 2002) lecture (.html (CD), or the zipped PPT .zip, a big file > 6 MB (CD)). You will find few recent publications on this fast-evolving subject. Read some of the review papers on that topic (see the SDPD-Database at the review papers page (CD)).

Several chapters deal with these methods in the recent book : 
B7- Structure Determination from Powder Diffraction Data 
        W.I.F. David, K. Shankland, L.B. McCusker, and Ch Baerlocher, 
        IUCr Monographs on Crystallography, Vol 13, 
        Oxford Science Publications, 2002. 

You may find some references also by using the SDPD database search options :

Search among experimental cases for a reference

Use keywords like : Monte, Carlo, simulated, annealing, model building, MB, genetic, algorithm, as well as some program names like: patsee, dirdif, rotsearch, octopus, dash, druid, powdersolve, etc.

You should distinguish programs which were built especially for dealing with powder patterns (able to take the problem of overlapping reflections into account) from those programs built for single crystal data (DIRDIF, PATSEE for small molecules). There are essentially 3 ways in which programs really deal with reflection overlap in powder data :

  • The powder pattern itself (or at least a part of it) is calculated for each tested position of the model inside the cell, and compared to the raw data. This is the method used in the OCTOPUS program, for instance. There is no need to extract the structure factors, but recalculating the raw pattern may be costly in computer time. Note however that a "|Fobs|" extraction is necessarily done at the beginning in order to determine the best profile and cell parameters, which are then fixed and used at the powder pattern calculation stage.
  • The extracted "|Fobs|" are used, but a pseudo pattern is regenerated from them and compared to a pseudo pattern generated from the |Fcalc|. This saves time since no background, Lorentz-polarisation, asymmetry (etc) has to be calculated. Up to now, this method is used in the ESPOIR program exclusively.
  • A fitness function is defined from the extracted "|Fobs|" and the calculated structure factors. This function is used to decide which molecule will "survive". This is the method built into the DRUID/DASH program sponsored by the CCDC (now released), and also into PSSP.
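To see why a regenerated pseudo pattern tolerates overlap, here is a small sketch. It is only an illustration under stated assumptions, not the ESPOIR implementation: the Gaussian peak shape, the single fixed width, the least-squares scale factor and the names pseudo_pattern and profile_r are all choices made for this example, and the peak positions and intensities are invented.

```python
import math

def pseudo_pattern(reflections, fwhm, grid):
    """Sum one Gaussian per reflection; reflections are (two_theta, intensity) pairs."""
    reflections = list(reflections)
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return [
        sum(i * math.exp(-0.5 * ((x - t) / sigma) ** 2) for t, i in reflections)
        for x in grid
    ]

def profile_r(y_obs, y_calc):
    """R = sum|y_obs - k*y_calc| / sum(y_obs), with a least-squares scale k."""
    denom = sum(c * c for c in y_calc)
    k = sum(o * c for o, c in zip(y_obs, y_calc)) / denom if denom else 0.0
    return sum(abs(o - k * c) for o, c in zip(y_obs, y_calc)) / sum(y_obs)

# Hypothetical data: two nearly coincident reflections and one isolated reflection.
grid = [9.5 + 0.01 * n for n in range(301)]
positions = (10.000, 10.004, 12.000)
extracted = (50.0, 50.0, 30.0)   # overlapped pair equipartitioned by the extraction
model = (90.0, 10.0, 30.0)       # intensities of a (supposedly correct) model

r_direct = profile_r(extracted, model)
r_pseudo = profile_r(
    pseudo_pattern(zip(positions, extracted), 0.1, grid),
    pseudo_pattern(zip(positions, model), 0.1, grid),
)
print("R on individual intensities:", round(r_direct, 3))
print("R on pseudo patterns:       ", round(r_pseudo, 3))
```

The model is penalised heavily when the individual intensities of the overlapped pair are compared, although only their sum is really known from the powder pattern. The comparison of pseudo patterns is mostly sensitive to that sum, so the same model gives a much lower R.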


This approach to determining crystal structures is also used for large molecules, from single crystal diffraction data. Have a look at the "Molecular Replacement" methods which are used for determining protein structures (for instance the AMORE, MOLREP, MRX, MODELLER ... programs - see some CCP4 Web pages). These programs carry out rotation and translation searches for locating a molecule in a cell, and also perform rigid body refinement. Think also about "Isomorphous Replacement" (MIR program).


Software to download

By now, you should already have Espoir, EXPO and Dirdif or WinDirdif (for single crystals), which were mentioned in previous weeks. They also have options for molecule location.

Download as well, if you wish :

PSSP by P. Stephens (CD).

FOX by Vincent Favre-Nicolin, open source (CD).

PATSEE by E. Egert and G.M. Sheldrick, open source (CD : PDF paper).

Unfortunately, the list of commercial or unavailable programs is long :

OCTOPUS by M. Tremayne et al.,

DASH by W.I.F David and K. Shankland (commercial, distributed by CCDC),

PowderSolve (commercial, by Accelrys),

ROTSEARCH by Jordi Rius,

Etc, see the list of programs (CD) in the SDPD database.

Also available for structure prediction, working by packing optimization :

PROMET by Angelo Gavezzotti,

HARDPACK by Rainer Rudert.

etc.

We are now not far from molecular modelling methods for the optimization of models by semi-empirical or ab initio approaches. In fact, when studying large molecules, the ratio of the number of reflections to the number of parameters to be refined may be so small that restraints and constraints have to be used. Molecular modelling may help to verify that all is correct, and may help to place hydrogen atoms, but this is another story (see the CCL Mailing list).


Exercise

A new pyrene (C16H10) derivative is synthesized, with the formula C17H12O. A synchrotron pattern is recorded (wavelength 0.79764 Å). The pattern quality is not very good, due to a very small sample quantity. Propose a cell and a space group, extract the "|Fobs|", and finally locate the pyrene molecule. For convenience, the data were re-formatted as .uxd and .raw Bruker files. Careful calibration ensures that the absolute value of the zeropoint is less than 0.002° (2-theta). A pyrene fragment that may help you to solve the problem is given here, built in Cartesian coordinates (pyrene.txt). Try to find what is linked to the pyrene molecule in that structure by Fourier difference syntheses. Do not carry out any structure refinement by the Rietveld method at this stage.
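When preparing peak positions for indexing, it can help to convert them to d-spacings with Bragg's law, d = lambda / (2 sin(theta)), using the wavelength quoted in the exercise. The sketch below is only a convenience: the function name d_spacing and the peak positions in the usage lines are invented for illustration and are not taken from the exercise data.

```python
import math

WAVELENGTH = 0.79764  # angstrom, as quoted in the exercise statement

def d_spacing(two_theta_deg, wavelength=WAVELENGTH, zero_shift_deg=0.0):
    """Bragg's law: d = lambda / (2 sin(theta)), after removing a zeropoint shift."""
    theta = math.radians((two_theta_deg - zero_shift_deg) / 2.0)
    return wavelength / (2.0 * math.sin(theta))

# Hypothetical peak positions in degrees 2-theta (not the exercise data).
for two_theta in (5.0, 10.0, 20.0):
    d = d_spacing(two_theta)
    # Change in d caused by a zeropoint error at the quoted limit of 0.002 degrees.
    delta = abs(d_spacing(two_theta, zero_shift_deg=0.002) - d)
    print(f"{two_theta:6.2f}  d = {d:.4f}  zeropoint effect = {delta:.5f}")
```

The last column shows how much a zeropoint error at the stated limit would change each d-spacing, which gives an idea of the tolerance to allow during indexing.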

Give the beginning of the .smh CRYSFIRE file, the space group, your final atomic coordinates and the reliability factors.
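The pyrene fragment is supplied in Cartesian coordinates, whereas atomic coordinates in the cell are fractional. The sketch below shows one way a trial position could be generated from six parameters (three rotation angles and three fractional translations). Everything in it is an assumption made for illustration: the function names, the Rz.Ry.Rx angle convention, the axis convention (a along x, b in the xy plane), the rotation about the fragment centroid, and the three-atom fragment and cell used in the usage lines, which are not the pyrene data. It is not the procedure of ESPOIR or of any other program.

```python
import math

def rotation_matrix(phi, theta, psi):
    """R = Rz(phi) Ry(theta) Rx(psi), angles in radians."""
    cz, sz = math.cos(phi), math.sin(phi)
    cy, sy = math.cos(theta), math.sin(theta)
    cx, sx = math.cos(psi), math.sin(psi)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]

def cart_to_frac(xyz, cell):
    """Cartesian (angstrom) to fractional; cell = (a, b, c, alpha, beta, gamma), degrees."""
    a, b, c, alpha, beta, gamma = cell
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    # Upper-triangular orthogonalisation matrix (a along x, b in the xy plane).
    m12, m13 = b * cg, c * cb
    m22, m23 = b * sg, c * (ca - cb * cg) / sg
    m33 = c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg) / sg
    x, y, z = xyz
    # Back-substitution instead of an explicit matrix inverse.
    w = z / m33
    v = (y - m23 * w) / m22
    u = (x - m12 * v - m13 * w) / a
    return (u, v, w)

def place_fragment(cart_atoms, angles, shift_frac, cell):
    """Rotate about the centroid, convert to fractional, put the centroid at shift_frac."""
    n = len(cart_atoms)
    centroid = [sum(p[i] for p in cart_atoms) / n for i in range(3)]
    rot = rotation_matrix(*angles)
    placed = []
    for p in cart_atoms:
        d = [p[i] - centroid[i] for i in range(3)]
        r = [sum(rot[i][j] * d[j] for j in range(3)) for i in range(3)]
        frac = cart_to_frac(r, cell)
        # Wrap each coordinate back into the range 0..1.
        placed.append(tuple((f + s) % 1.0 for f, s in zip(frac, shift_frac)))
    return placed

# Hypothetical three-atom fragment and cell, for illustration only.
fragment = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0), (2.1, 1.2, 0.0)]
cell = (8.0, 9.0, 10.0, 90.0, 105.0, 90.0)
for atom in place_fragment(fragment, (0.3, 1.1, 2.0), (0.25, 0.40, 0.10), cell):
    print("%.4f %.4f %.4f" % atom)
```

A search program would generate many such trial positions and keep the one giving the best agreement with the diffraction data.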


Main software selected for the correction

POWDERX for estimating the reflection positions. CRYSFIRE for indexing. FULLPROF for extracting "|Fobs|"; OVERLAP for building a reduced data set; SHELXS-97 for trying direct methods and ESPOIR for locating the pyrene molecule. ORTEP-3 for Windows for drawing and viewing. SHELXL-97 for refining and Fourier difference syntheses, etc.


The next week will be reserved for the structure completion and refinement by the Rietveld method, part 1.
Good luck !