Analyse results


Output files

For every iteration, RELION will output the following files:

  • rootname_it???_class???.mrc (or rootname_it???_class???.mrcs for 2D refinements) with the images of the refined 2D/3D structures.
  • rootname_it???_optimiser.star with general information about the optimisation process.
  • rootname_it???_model.star with information about the refined model parameters apart from the images (e.g. the noise spectra, the spherical average of the signal-to-noise ratios in the reconstructed structures, the distribution of the images over the classes, the angular distributions, etc.).
  • rootname_it???_data.star with per-particle information: CTF, optimal orientation, translation and class assignment, normalisation correction, height of the (normalised) probability distribution, etc.
  • rootname_it???_sampling.star with information about the angular and translational sampling.


Note that for the 3D auto-refine procedure a gold-standard FSC procedure is employed, where two models are refined independently. This will lead to the following model and class files:

  • rootname_it???_half1_class001.mrc and rootname_it???_half2_class001.mrc
  • rootname_it???_half1_model.star and rootname_it???_half2_model.star

Only upon convergence of the 3D auto-refine procedure is a single reconstruction made from all particles. This reconstruction and the corresponding model.star file are called:

  • rootname_class001.mrc
  • rootname_model.star

After this final reconstruction, no optimiser.star file is written out, because this reconstruction may no longer be used in further refinement. If your 3D auto-refine run has not produced the final model files (i.e. without the "it???" statement), then it has not finished yet. Check your stderr and stdout to see what has happened and re-start the refinement from the last performed iteration using the "Continue old run" option from the top of the GUI. The correct resolution of your final map is as indicated in the rootname_model.star file, NOT as in the rootname_it???_half?_model.star files!
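Whether a 3D auto-refine run has finished can be checked from the file names alone. The following Python sketch assumes the naming scheme listed above. The function name summarise_run and the rootname run1 are hypothetical and used only for illustration.

```python
import re

def summarise_run(filenames, rootname):
    """Return (last_iteration, finished) for a list of output file names.

    Assumes the naming scheme described above: per-iteration files carry an
    "_it???" tag, and the final model of a 3D auto-refine run does not.
    """
    it_pattern = re.compile(re.escape(rootname) + r"_it(\d+)_")
    iterations = []
    for name in filenames:
        match = it_pattern.match(name)
        if match:
            iterations.append(int(match.group(1)))
    last_iteration = max(iterations) if iterations else None
    # The final model file has no iteration tag in its name.
    finished = (rootname + "_model.star") in filenames
    return last_iteration, finished

# Hypothetical directory listing of an unfinished run.
files = [
    "run1_it000_optimiser.star",
    "run1_it011_half1_model.star",
    "run1_it012_half1_model.star",
    "run1_it012_optimiser.star",
]
print(summarise_run(files, "run1"))
```

In practice the list of names would come from os.listdir() on the output directory, and the last iteration found is the one to continue from.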

What to look out for

There are many things one may want to monitor. The most common indicators to look out for are given below. For a complete list of all metadata labels, on the command-line type: relion_refine --print_metadata_labels.

Monitor resolution

The highest resolution for which at least one of the models has SSNR^MAP>1 is stored as _rlnCurrentResolution in the _model.star files. Therefore, progress in terms of resolution may be monitored using:

grep rlnCurrentResolution *model.star
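If the values are needed as numbers, for example to plot resolution against iteration, the same label can be pulled out in Python. This is a minimal sketch, assuming the label appears as a "label value" pair on a single line. The function name current_resolution is hypothetical, and the excerpt contains made-up numbers.

```python
import re

def current_resolution(model_star_text):
    """Return the _rlnCurrentResolution value found in the text of a
    _model.star file, or None if the label is absent."""
    match = re.search(
        r"^[ \t]*_rlnCurrentResolution\s+(\S+)", model_star_text, re.MULTILINE
    )
    return float(match.group(1)) if match else None

# Toy excerpt with made-up numbers, standing in for a real _model.star file.
example = """
data_model_general

_rlnReferenceDimensionality    3
_rlnCurrentResolution          8.25
_rlnCurrentImageSize           64
"""
print(current_resolution(example))
```

Applying the function to each rootname_it???_model.star file in sorted order gives the resolution per iteration.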


The fall-off of SSNR^MAP with resolution for each model is stored in the tables called data_model_class_N (with N being the number of the corresponding class) in the _model.star files. It is often insightful to have a look at these tables in the files themselves. Alternatively, one may use the STAR file utilities to make a plot (requires gnuplot to be installed), using:


relion_star_plottable rootname_it025_model.star data_model_class_1 rlnSsnrMap rlnResolution


Because the SSNR values at low resolution can be very large, it is often better to rerun gnuplot and redraw the plot with a limited Y-range:

gnuplot
gnuplot> set yrange [0:20]
gnuplot> load "gnuplot.plt"


Alternatively, one may use the relion_star_printtable script to make a text file with any table for visualization in an alternative plotting program. For example, one could visualize the final gold-standard FSC curve from a 3D auto-refine procedure using:


relion_star_printtable rootname_model.star data_model_class_1 rlnResolution rlnGoldStandardFsc  >  goldstandardFSC.dat
xmgrace goldstandardFSC.dat
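The same table can also be read directly in Python. The sketch below is a minimal reader, not a full STAR parser: it assumes whitespace-separated values without quoted strings. The function name read_star_table is hypothetical, and the numbers in the example are made up.

```python
def read_star_table(text, table_name):
    """Read one loop_ table from STAR-formatted text.

    Returns a list of dicts mapping each label (without the leading
    underscore) to the string value in that row.
    """
    labels, rows = [], []
    in_block = in_loop = False
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("data_"):
            if in_block:
                break  # the requested table has ended
            in_block = (line == table_name)
            in_loop = False
        elif not in_block or not line or line.startswith("#"):
            continue
        elif line == "loop_":
            in_loop = True
        elif in_loop and line.startswith("_"):
            labels.append(line.split()[0][1:])  # drop the leading underscore
        elif in_loop and labels:
            rows.append(dict(zip(labels, line.split())))
    return rows

# Toy table with made-up values.
example = """
data_model_class_1

loop_
_rlnSpectralIndex #1
_rlnResolution #2
_rlnGoldStandardFsc #3
0  0.000  1.000
1  0.005  0.999
2  0.010  0.991
"""
for row in read_star_table(example, "data_model_class_1"):
    print(row["rlnResolution"], row["rlnGoldStandardFsc"])
```

The printed columns correspond to what relion_star_printtable writes for the same labels, and can be passed to any plotting library.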

Monitor convergence behaviour

The 3D auto-refine option (but not the 2D or 3D classification) writes out general information about convergence (such as angular sampling used, estimated accuracy of the angular assignments, resolution, etc) in the stdout file. One can get a quick overview using:


grep Auto myrun.out

Monitor changes in particle orientations and class assignments

The changes in optimal (i.e. with the highest posterior probability) class and orientation assignments may be monitored using:


grep rlnChangesOptimalOffsets *model.star
grep rlnChangesOptimalOrientations *model.star
grep rlnChangesOptimalClasses *model.star

Monitor class distribution

Likewise, the distribution of all images over the various classes is stored in the table called data_model_classes in the _model.star files. Again, looking in the file itself may be easiest, or alternatively a plot may be made using the relion_star_plottable script:


relion_star_plottable rootname_it025_model.star data_model_classes rlnClassDistribution

Angular distribution

The angular distribution for each model is stored in the table data_model_pdf_orient_class_N in the _model.star files, with N being the class number. The actual angles (rot and tilt) for these values are stored in the rootname_sampling.star file. A combination of these two files could be used to plot the distribution. As of release 1.3, RELION also writes out BILD files for each 3D model. These BILD files can be visualized in UCSF Chimera to assess the angular distribution.

Per-image indicators

The _data.star files store per-image indicators. Apart from the obvious orientational and class assignments, look out for the following variables:

  • _rlnMaxValueProbDistribution: values close to one indicate that the probability distributions over all orientations and classes have converged to near-delta functions. Values close to zero indicate great uncertainty in the assignments (i.e. near-even distributions). Typically, these values increase over multiple iterations (with constant sampling rates).
  • _rlnLogLikeliContribution: higher values mean better agreement between experimental data and model. This variable is the equivalent of the cross-correlation coefficient or phase residual in other programs. One could make histograms of all values in order to discard particularly bad images.

Linear plots of these values for each image may be made using:

relion_star_plottable rootname_it025_data.star data_images _rlnLogLikeliContribution

Select images based on per-image indicators

Sometimes it is useful to select a subset of images for further refinement based on their per-image indicators. For example, all images belonging to a certain class, or all images with a minimum contribution to the LogLikelihood. See the FAQs page for how to do that.
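As an illustration of such a selection, the following Python sketch filters the rows of a _data.star file while keeping its header. It is a minimal sketch under stated assumptions: a single loop_ table, whitespace-separated values and no quoted strings. The function name select_particles, the image names, the values and the threshold of 10000 are all hypothetical.

```python
def select_particles(star_text, keep):
    """Return STAR text with the header intact and only those data rows
    for which keep(row) is true; row maps label names to string values."""
    labels, out = [], []
    for line in star_text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith(("data_", "loop_", "#")):
            out.append(line)           # blank lines, block names, comments
        elif fields[0].startswith("_"):
            labels.append(fields[0])   # column label, in file order
            out.append(line)
        elif keep(dict(zip(labels, fields))):
            out.append(line)           # data row that passes the filter
    return "\n".join(out) + "\n"

# Toy file with made-up values.
example = """
data_images

loop_
_rlnImageName #1
_rlnClassNumber #2
_rlnLogLikeliContribution #3
000001@stack.mrcs 1 10500.0
000002@stack.mrcs 2 9800.0
000003@stack.mrcs 1 9100.0
"""
# Keep images of class 1 with a log-likelihood contribution of at least 10000.
selected = select_particles(
    example,
    lambda row: row["_rlnClassNumber"] == "1"
    and float(row["_rlnLogLikeliContribution"]) >= 10000.0,
)
print(selected)
```

A sensible threshold depends on the data set, so a histogram of the values is a better guide than any fixed number.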

Getting higher resolution and map sharpening

As of release 1.2, RELION has a program for semi-automated map postprocessing, called relion_postprocess. It may be used after a 3D auto-refine calculation for automated masking, MTF-correction and B-factor sharpening. Because the FSC calculations inside the 3D auto-refine procedure depend on unmasked maps (to avoid potential overfitting), the auto-refine procedure may somewhat under-estimate the true resolution. Because orientational assignments are mostly determined by the medium-low resolution frequencies, the final result is typically not much affected by this under-estimation. On the contrary, for many data sets keeping overfitting at bay is more important in squeezing out the most signal from the data. Therefore, as long as one determines the true resolution of the reconstruction after refinement, and filters the map correspondingly, top-quality reconstructions may be obtained. For this purpose, as of release 1.2 RELION also writes out unfiltered reconstructions that are calculated all the way out to the Nyquist frequency (called *_unfil.mrc) for the two independent half data sets. These maps are read by the relion_postprocess program and used for masking and sharpening. The procedure consists of several consecutive steps:

  • Automatically determine a mask from the reconstructions (using the --auto_mask option, type relion_postprocess without arguments to see a complete list of options) or provide your own mask using --mask.
  • Mask both half-reconstructions, and calculate the Fourier Shell Correlation curve between them.
  • Measure the inflating effect that the mask may have on the FSC curve. For this purpose, we use a method called high-resolution noise substitution (Chen et al (2013) Ultramicroscopy). It works as follows: randomize the phases of the two unfiltered maps before masking; mask these two scrambled maps with the same mask; calculate an FSC curve between them; assess the masking effects on the FSC by analysing the resolution-dependent difference between the unscrambled and scrambled FSC curves as described in the Chen et al. paper; output a masking-effect-corrected FSC curve (called rlnFourierShellCorrelationCorrected).
  • If you know the MTF of your detector you can correct for it prior to B-factor sharpening (if you do not know it, you can only perform B-factor sharpening, thereby implicitly assuming the MTF is Gaussian shaped). This curve should be provided in the format of a STAR file. It should look like (Note that the ... should be filled in with many more values!):
data_
loop_
_rlnResolutionInversePixel
_rlnMtfValue
0.0  1.000
0.0005 0.999939
0.001 0.999755
...
0.499 0.052100
0.4995 0.051970
0.5 0.051841

These are our MTF-curves for the Falcon-II at 300kV, the DE-20 at 300kV and the K2-summit at 300kV.

  • Apply a B-factor sharpening to the map. For maps with resolutions significantly beyond 10 Angstrom one can use the automated procedure as described by Rosenthal and Henderson (2003) (using the --auto_bfac option). Alternatively, one may provide a user-determined (negative!) B-factor using --adhoc_bfac.
  • By default, the program will use the corrected FSC curve to do FSC-based low-pass filtering, but this may be switched off by using the --skip_fsc_weighting option, most likely in combination with a user-determined low-pass filter frequency provided as --low_pass. The soft edge of the latter will be a raised cosine function (with a tunable width of --filter_edge_width resolution shells).
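Two of the steps above come down to short formulas, sketched here in Python. The sketch is illustrative only and is not taken from the relion_postprocess source. It assumes the correction formula of Chen et al. (2013), FSC_true = (FSC_t - FSC_n) / (1 - FSC_n), and the conventional B-factor weight exp(-B/4 * s^2) with s the spatial frequency in 1/Angstrom. The function names and the input values are hypothetical.

```python
import math

def corrected_fsc(fsc_masked, fsc_randomised):
    """Masking-effect correction for one resolution shell.

    fsc_masked is the FSC between the masked half-maps (FSC_t) and
    fsc_randomised the FSC between the phase-randomised, masked
    half-maps (FSC_n).
    """
    return (fsc_masked - fsc_randomised) / (1.0 - fsc_randomised)

def bfactor_weight(bfactor, resolution_angstrom):
    """Amplitude scale factor exp(-B/4 * s^2) with s = 1/resolution.

    A negative B-factor gives a weight above one, i.e. sharpening.
    """
    s = 1.0 / resolution_angstrom
    return math.exp(-(bfactor / 4.0) * s * s)

# Made-up inputs: one shell with FSC_t = 0.5 and FSC_n = 0.2,
# and a B-factor of -100 square Angstrom evaluated at 5 Angstrom.
print(corrected_fsc(0.5, 0.2))
print(bfactor_weight(-100.0, 5.0))
```

The correction is only meaningful for shells beyond the resolution at which the phases were randomized, so the sketch leaves the choice of shells to the caller.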

As of relion-1.3, the postprocessing may be run entirely from the GUI. The resulting post-processed map may typically be used directly for fitting atomic models, displaying in Chimera, etc.

Local resolution estimation

As unresolved structural heterogeneity is still present in many data sets, local resolution estimation is an important aspect of data analysis. As of release 1.3, RELION implements a convenient wrapper to the ResMap program written by Alp Kucukelbir. It takes the unfil.mrc maps above as input. It is also highly recommended to use the automask determined in the postprocessing step above as the input mask for ResMap. The resulting _resmap.mrc file can be used in UCSF Chimera to color the sharpened (or unsharpened) maps according to local resolution (using Volume data -> Surface Color -> according to data value).