Recommended procedures: Difference between revisions

Revision as of 17:26, 25 January 2013

The following is what we typically do for each new data set for which we have a decent initial model.

(If you don't have an initial model: perform RCT, tomography plus sub-tomogram averaging, or, if you really need to, common-lines procedures in a different program.)

Getting organised

Save all your micrographs in one or more subdirectories of the project directory (from where you'll launch the RELION GUI). We like to call these directories "Micrographs/" if all micrographs are in one directory, or "Micrographs_15jan13/" and "Micrographs_23jan13/" if they are in different directories (e.g. because they were collected on different dates). If for some reason you do not want to place your micrographs inside the RELION project directory, you can instead make a symbolic link inside the project directory that points to the directory where your micrographs are stored.
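The symbolic link mentioned above is normally made from the shell with ln -s. As a minimal sketch of the same step in Python: the function name link_micrographs and any paths passed to it are hypothetical and only serve as an illustration.

```python
from pathlib import Path


def link_micrographs(project_dir, micrograph_dir, link_name="Micrographs"):
    """Create project_dir/link_name as a symbolic link to micrograph_dir."""
    link = Path(project_dir) / link_name
    target = Path(micrograph_dir).resolve()
    # Refuse to overwrite anything that is already there (file, dir or link).
    if link.is_symlink() or link.exists():
        raise FileExistsError(f"{link} already exists")
    link.symlink_to(target, target_is_directory=True)
    return link
```

For micrographs collected on different dates, one could call this once per session with a different link_name, for example "Micrographs_15jan13".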

Particle selection & preprocessing

Our favourites are Ximdisp and e2boxer.py. Be careful at this stage: you are probably better at getting rid of bad/junk particles than any of the classification procedures below! So spend a decent amount of time on selecting good particles, be it manually or (semi-)automatically.

After picking the particles, we use the preprocessing run-type on the GUI to extract, normalize and invert contrast (if necessary to get white particles). The radius around the particles for normalization should be chosen to be slightly larger than the actual particles to account for suboptimal centering at this stage.

2D class averaging

We like to use 2D class averaging to get rid of bad/junk particles in the data set. Apart from choosing a suitable particle diameter (make sure you don't cut off any real signal, but try to minimise the noise around your particle as well), the most important parameters are the number of classes (K) and the regularization parameter T. For cryo-EM we typically have at least 150-250 particles per class, so with 3,000 particles we would not use more than K=20 classes. Also, to limit computational costs, we rarely use more than, say, 150 classes even for large data sets. For negative stain, one can use fewer particles per class, say at least 50-100. For cryo-EM, we typically use T=2, while for negative stain we use values of 1-2. We typically do not touch the default sampling parameters.
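The rule of thumb for K can be written down as a small calculation. This is only a sketch: the function name max_2d_classes is hypothetical, and the defaults assume the lower end of the 150-250 particles-per-class range and the cap of 150 classes quoted above.

```python
def max_2d_classes(n_particles, min_per_class=150, class_cap=150):
    """Upper bound on K: keep at least min_per_class particles per class,
    and never exceed class_cap classes (to limit computational cost)."""
    if n_particles < 0 or min_per_class <= 0 or class_cap <= 0:
        raise ValueError("counts must be positive")
    return max(1, min(n_particles // min_per_class, class_cap))


print(max_2d_classes(3000))                     # cryo-EM example from the text
print(max_2d_classes(3000, min_per_class=50))   # negative stain, lower end of 50-100
```

With 3,000 particles the cryo-EM defaults reproduce the K=20 limit mentioned in the text. The result is an upper bound, not a recommendation to always use that many classes.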

Most 2D class averaging runs yield some classes that are highly populated (look for the data_model_classes table in the model.star files for class occupancies), and these classes typically show nice, relatively high-resolution views of your complex in different orientations. Besides these good classes, there are often also many bad classes: these typically contain bad/junk particles. Because junk particles do not average well together, there are often few particles in each bad class, and the resolution of the corresponding class average is thus very low. These classes will look very ugly! We then use awk (see the [[FAQs#How_can_I_select_images_from_a_STAR_file.3F | FAQs page]]) to make a smaller STAR file, from which all the bad classes are excluded. The reasoning behind this is that if particles do not average well with the others in 2D class averaging, they will also cause trouble in 3D refinement.
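For those who prefer not to use awk, the selection step can also be sketched in Python. This is not the command from the FAQs page. The function name drop_classes is hypothetical, and the sketch assumes a STAR file with a single loop_ table, whitespace-separated data rows, and a _rlnClassNumber column.

```python
def drop_classes(star_text, bad_classes, label="_rlnClassNumber"):
    """Return STAR text without the data rows whose class is in bad_classes.

    Assumes one loop_ table; header, label and blank lines are copied as-is.
    Raises ValueError if the class-number label is not among the columns.
    """
    bad = {int(c) for c in bad_classes}
    labels = []
    kept = []
    for line in star_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("_"):
            # Column label line, e.g. "_rlnClassNumber #3"
            labels.append(fields[0])
        elif labels and len(fields) == len(labels):
            # Data row: skip it if its class number is on the bad list
            if int(fields[labels.index(label)]) in bad:
                continue
        kept.append(line)
    return "\n".join(kept) + "\n"
```

One would read the data.star file of the 2D run into a string, call drop_classes with the class numbers judged to be bad, and write the result to a new STAR file for the next run.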

Depending on how clean our data is, we sometimes repeat this process 2 or 3 times. Be patient, as 2D class averaging is remarkably slow in RELION. However, having a clean data set is an important factor in getting good 3D classification results.

3D classification

Once we're happy with our data cleaning in 2D, we almost always perform 3D classification. Remember: ALL data sets are heterogeneous! It is therefore always worth checking to what extent this is the case in your data set. At this stage we use our initial model for the first time. Remember, if it is not reconstructed from the same data set in RELION or XMIPP, it is probably NOT on the correct grey scale. Also, if it is not reconstructed with CTF correction in RELION or it is not made from a PDB file, then one should probably also set "Has reference been CTF corrected?" to No. We prefer to start from relatively harsh initial low-pass filters (often 40-60 Angstrom), and typically perform 25 iterations with a regularization factor of T=4 for cryo-EM and T=2-4 for negative stain. (But remember: classifying stain is often a pain due to variations in staining.) For cryo-EM, we prefer to have, on average, at least 5,000-10,000 particles per class. For negative stain, fewer particles per class may be used. We typically do not touch the default sampling parameters, except perhaps for icosahedral viruses, where we use finer angular samplings.

After classification, we use the same awk command as above to generate separate STAR files for each structural state of interest. Similar-looking classes may be considered as one structural state at this point. Difference maps (after alignment of the maps in, for example, Chimera) are a useful tool to decide whether two maps are similar or not. In some cases, most often with large data sets, one may choose to further classify separate classes in an additional classification run.

3D refinement

The 3D classes of interest are each refined separately using the 3D-auto-refine procedure. We often use the refined map of the corresponding class as the initial model (or sometimes the original initial model) and we start refinement again from a rather harsh initial low-pass filter, often 40-60 Angstroms. We typically do not touch the default sampling parameters, except for icosahedral viruses where we may start from 3.7 degrees angular sampling and we perform local searches from 0.9 degrees onwards. After 3D refinement, we sharpen the map based on the unfiltered maps that are written out from release 1.2 onwards, as explained on the Analyse results page.
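The angular sampling values quoted above (3.7 and 0.9 degrees) are consistent with the approximate pixel spacing of a HEALPix grid at successive orders. The sketch below assumes that correspondence; the function name healpix_step_degrees is hypothetical and is not part of RELION.

```python
import math


def healpix_step_degrees(order):
    """Approximate angular spacing of a HEALPix grid of the given order.

    Uses the square root of the solid angle per pixel, with
    N_side = 2**order and N_pix = 12 * N_side**2.
    """
    n_side = 2 ** order
    n_pix = 12 * n_side ** 2
    return math.degrees(math.sqrt(4 * math.pi / n_pix))


for order in (4, 5, 6):
    print(order, round(healpix_step_degrees(order), 1))
```

Each increment in order halves the angular step, which is why the sampling options come in steps such as 3.7, 1.8 and 0.9 degrees.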

Afterwards

If this is useful for you, please cite RELION (either Scheres (2012) JMB or Scheres (2012) JSB). The relevance of the 0.143 criterion for the gold-standard FSCs used in RELION is described in Scheres & Chen (2012) Nat. Meth.

@@ Line 7: / Line 7: @@
 Save all your micrographs in one or more subdirectories of the project directory (from where you'll launch the RELION GUI). We like to call these directories "Micrographs/" if all micrographs are in one directory, or "Micrographs_15jan13/" and "Micrographs_23jan13/" if they are in different directories (e.g. because they were collected on different dates). If you for some reason do not want to place your micrographs inside the RELIOn project directory, then inside the project directory you can also make a symbolic link to the directory where your micrographs are stored.
-== Particle selection ==
+== Particle selection & preprocessing==
-Our favourites are Ximdisp and e2boxer.py. Be careful at this stage: you are probably better at getting rid of bad/junk particles than any of the classification procedures below! So spend a decent amount of time on selecting good particles, be it manually or (semi-)automatically.
+Our favourites are [http://www2.mrc-lmb.cam.ac.uk/research/locally-developed-software/image-processing-software/ Ximdisp] and [http://blake.bcm.edu/emanwiki/EMAN2/Programs/e2boxer e2boxer.py]. Be careful at this stage: you are probably better at getting rid of bad/junk particles than any of the classification procedures below! So spend a decent amount of time on selecting good particles, be it manually or (semi-)automatically.
+After picking the particles, we use the preprocessing run-type on the GUI to extract, normalize and invert contrast (if necessary to get white particles). The radius around the particles for normalization should be chosen to be slightly larger than the actual particles to account for suboptimal centering at this stage.
 == 2D class averaging ==