Prepare input files

From Relion
Revision as of 11:03, 25 November 2011

Experimental images

RELION reads the following image file formats:

  • MRC individual images (with extension .mrc)
  • MRC stacks (with extension .mrcs) (recommended)
  • SPIDER individual images (with extension .spi)
  • SPIDER stacks (with extension .spi)

Preparation of the images is explained on the Preprocess images page. Further note that images should be square (i.e. xdim=ydim).
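
The squareness requirement can be checked before starting a run. The following is a minimal sketch, not a RELION utility: it relies on the MRC header beginning with the image dimensions NX and NY stored as 32-bit integers, and it assumes that the byte order of the file matches that of the host. The file name fake.mrc is hypothetical and holds only a stand-in header, so that the example runs without real data.

```shell
# Stand-in file: only the first 8 header bytes are written
# (NX=64, NY=64 as little-endian 32-bit integers). This is NOT a valid image.
printf '\100\000\000\000\100\000\000\000' > fake.mrc

# Read NX and NY from the start of the header.
# Assumption: the file's byte order matches the host's.
set -- $(od -A n -t d4 -N 8 fake.mrc)
nx=$1
ny=$2

if [ "$nx" -eq "$ny" ]; then
  result="square"
else
  result="not square"
fi
echo "fake.mrc: ${nx} x ${ny} (${result})"
```

For real data, fake.mrc would be replaced by the name of an actual .mrc or .mrcs file and the printf line dropped.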

If no CTF correction is to be performed inside RELION, a stack of images may be used directly as input (command line option --i). In that case, it is recommended that the images are CTF phase-flipped before refinement. If CTF correction is to be performed inside RELION (recommended for cryo-data), then metadata describing the CTFs must be provided alongside the images. In that case, the input is given to RELION as a STAR file (see below).

Metadata STAR files

The STAR file format is explained on the Conventions page. STAR files are easily readable plain-text files, for which shell utilities like awk are very convenient. However, because not all users are equally proficient in shell scripting, RELION includes several shell scripts that provide basic operations on STAR files. See the STAR file utilities page for a description of these utilities, of which relion_star_loopheader, relion_star_datablock_stack and relion_star_datablock_singlefiles are used below.
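
As an illustration of how awk can be applied to a STAR file, the sketch below counts the images per micrograph. The file example.star is written by hand for this purpose; its layout (a data_ line, a loop_ line, one label per line, then whitespace-separated rows) and the image naming are assumptions made for the example, so the Conventions page remains the reference for the actual format.

```shell
# Hand-written example file (assumed layout, hypothetical values).
cat > example.star <<'EOF'
data_
loop_
_rlnImageName
_rlnMicrographName
_rlnDefocusU
000001@mic1.mrcs mic1.mrcs 10000
000002@mic1.mrcs mic1.mrcs 10000
000001@mic2.mrcs mic2.mrcs 21000
EOF

# Skip header lines (single-field lines and lines starting with "_"),
# then count rows per value of column 2 (rlnMicrographName in this file).
awk 'NF > 1 && $1 !~ /^_/ { n[$2]++ }
     END { for (m in n) print m, n[m] }' example.star | sort > counts.txt

cat counts.txt
```

The column number is hard-coded here; in a real file it should be taken from the position of the label in the loop header.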

If the input images are in a separate stack for each micrograph, then one could use the following commands to generate the input STAR file:

relion_star_loopheader rlnImageName rlnMicrographName rlnDefocusU rlnDefocusV rlnDefocusAngle rlnVoltage rlnSphericalAberration rlnAmplitudeContrast > my_images.star
relion_star_datablock_stack 4 mic1.mrcs mic1.mrcs 10000 10500 30 200 2 0.1  >> my_images.star
relion_star_datablock_stack 3 mic2.mrcs mic2.mrcs 21000 20500 25 200 2 0.1  >> my_images.star
relion_star_datablock_stack 2 mic3.mrcs mic3.mrcs 16000 15000 35 200 2 0.1  >> my_images.star

Here the three stacks contain 4, 3 and 2 images, respectively. The result is a STAR file that can be used directly as input to RELION. Note the rlnMicrographName label and the repetition of the micrograph names on the datablock lines, which lead to the inclusion of a unique rlnMicrographName for each micrograph. As a result, distinct noise spectra will be estimated for each micrograph.
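
For data sets with many micrographs, typing one line per stack becomes tedious. One possible approach, sketched here under assumptions, is to keep the per-micrograph values in a small table and generate the commands from it. The table mics.txt, its column order and the output script make_star.sh are hypothetical; the sketch only writes the commands to a script for inspection and does not run any RELION utility itself.

```shell
# Hypothetical table, one micrograph per line:
# stack  number_of_images  defocusU  defocusV  defocus_angle
cat > mics.txt <<'EOF'
mic1.mrcs 4 10000 10500 30
mic2.mrcs 3 21000 20500 25
mic3.mrcs 2 16000 15000 35
EOF

# Values shared by all micrographs (taken from the example commands):
# voltage, spherical aberration, amplitude contrast.
kv=200
cs=2
ac=0.1

# Write the header command once, then one datablock command per table row.
{
  echo "relion_star_loopheader rlnImageName rlnMicrographName rlnDefocusU rlnDefocusV rlnDefocusAngle rlnVoltage rlnSphericalAberration rlnAmplitudeContrast > my_images.star"
  while read -r stack n du dv ang; do
    echo "relion_star_datablock_stack $n $stack $stack $du $dv $ang $kv $cs $ac >> my_images.star"
  done < mics.txt
} > make_star.sh

cat make_star.sh
```

After checking make_star.sh, it could be executed with sh make_star.sh on a system where the RELION utilities are in the path.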

If the input images are in single-file format in distinct directories for each micrograph, then the commands would be:

relion_star_loopheader rlnImageName rlnMicrographName rlnDefocusU rlnDefocusV rlnDefocusAngle rlnVoltage rlnSphericalAberration rlnAmplitudeContrast > my_images.star
relion_star_datablock_singlefiles "mic1/*.spi" mic1 16000 15000 35 200 2 0.1  >> my_images.star
relion_star_datablock_singlefiles "mic2/*.spi" mic2 16000 15000 35 200 2 0.1  >> my_images.star
relion_star_datablock_singlefiles "mic3/*.spi" mic3 16000 15000 35 200 2 0.1  >> my_images.star

The result would be an equivalent STAR file.

To generate a STAR file from an XMIPP-style ctfdat file, one could use:

relion_star_loopheader rlnImageName rlnMicrographName rlnDefocusU rlnDefocusV rlnDefocusAngle rlnVoltage rlnSphericalAberration rlnAmplitudeContrast > all_images.star
relion_star_datablock_ctfdat all_images.ctfdat >> all_images.star

Reference images

2D class averaging is typically performed in an unsupervised manner, i.e. without user-provided references. 3D reconstruction does require a (single) 3D reference structure. This map should be provided in MRC or SPIDER format, and it should have the same dimensions as the input images. Take care that the pixel size (in Angstroms) matches that of the experimental images, since an internal magnification correction is currently not implemented. Because the Gaussian model used to calculate probabilities is based on the squared differences between the experimental images and projections of the reference, the absolute intensity scale (or greyscale) of the reference map is relevant. However, RELION may correct for the greyscale internally at relatively small computational cost.

To limit model bias it is generally recommended to strongly low-pass filter your initial reference model. The Optimisation tab in the GUI has an entry to set an initial low-pass filter.

Important note for 3D classification

Although for 3D classification an external reference may work OK (as illustrated in the fully guided 3D classification example at http://www2.mrc-lmb.cam.ac.uk/relion/index.php/Classification_example), one often gets better results by starting classification from a consensus model that was generated from the structurally heterogeneous data set itself. To that purpose, one may refine (for multiple iterations) the external reference (as a single class) against the entire data set. The resulting model may then be used to generate the random seeds and classify the data using multiple classes.

Note that the option "Ref. map is on absolute greyscale?" in the GUI should then probably be set to "No" for the initial single-reference refinement, and then to "Yes" for the actual classification run.