Analyse results


Output files

For every iteration, RELION will output the following files:

  • rootname_it???_class???.mrc (or rootname_it???_class???.mrcs for 2D refinements) with the images of the refined 2D/3D structures.
  • rootname_it??? with general information about the optimisation process.
  • rootname_it??? with information about the refined model parameters apart from the images (e.g. the noise spectra, the spherical average of the signal-to-noise ratios in the reconstructed structures, the distribution of the images over the classes, the angular distributions, etc.)
  • rootname_it??? with, for each particle, information about its CTF, optimal orientation, translation and class assignment, normalisation correction, height of the (normalised) probability distribution, etc.
  • rootname_it??? with information about the angular and translational sampling

Note that for the 3D auto-refine procedure a gold-standard FSC procedure is employed, where two models are refined independently. This will lead to the following model and class files:

  • rootname_it???_half1_class001.mrc and rootname_it???_half2_class001.mrc
  • rootname_it??? and rootname_it???

Only upon convergence of the 3D auto-refine procedure is a single reconstruction made from all particles. This reconstruction and the corresponding file are called:

  • rootname_class001.mrc

After this final reconstruction, no file is written out, because this reconstruction may no longer be used in further refinement. If your 3D auto-refine run has not produced the final model files (i.e. without the "it???" statement), then it has not finished yet. Check your stderr and stdout to see what has happened and re-start the refinement from the last performed iteration using the "Continue old run" option from the top of the GUI. The correct resolution of your final map is as indicated in the file, NOT as in the rootname_it???_half? files!

What to look out for

There are many things one may want to monitor. The most common indicators to look out for are given below. For a complete list of all metadata labels, on the command-line type: relion_refine --print_metadata_labels.

Monitor resolution

The highest resolution for which at least one of the models has SSNR^MAP>1 is stored as _rlnCurrentResolution in the files. Therefore, progress in terms of resolution may be monitored using:

grep rlnCurrentResolution *
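The same check can be scripted when the values are needed for further processing. The following is a minimal Python sketch, not part of RELION: the helper name and the sample text are hypothetical, and it assumes that the label and its value sit on one line separated by whitespace.

```python
import re

# Assumption: the label and its value appear on one line, e.g.
# "_rlnCurrentResolution   8.52". The helper name is hypothetical.
_RESOLUTION = re.compile(r"^_rlnCurrentResolution\s+(\S+)", re.MULTILINE)

def current_resolution(star_text):
    """Return the _rlnCurrentResolution value as a float, or None if absent."""
    match = _RESOLUTION.search(star_text)
    return float(match.group(1)) if match else None

# Made-up file contents for two iterations, for illustration only.
iterations = {
    "it001": "_rlnCurrentResolution   24.00\n",
    "it002": "_rlnCurrentResolution   16.50\n",
}
for name in sorted(iterations):
    print(name, current_resolution(iterations[name]))
```

In practice one would read the text of each per-iteration file from disk instead of using the in-memory dictionary.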

The fall-off of SSNR^MAP with resolution for each model is stored in the tables called data_model_class_N (with N being the number of the corresponding class) in the files. It is often insightful to have a look at these tables in the files themselves. Alternatively, one may use the STAR file utilities to make a plot (requires gnuplot to be installed), using:

relion_star_plottable data_model_class_1 rlnSsnrMap rlnResolution

The default plot is often dominated by the very high values at low resolution, so it may be better to rerun gnuplot and redraw the produced plot with a limited Y-range:

gnuplot> set yrange [0:20]
gnuplot> load "gnuplot.plt"

Alternatively, one may use the relion_star_printtable script to make a text file with any table for visualization in an alternative plotting program. For example, one could visualize the final gold-standard FSC curve from a 3D auto-refine procedure using:

relion_star_printtable data_model_class_1 rlnResolution rlnGoldStandardFsc  >  goldstandardFSC.dat
xmgrace goldstandardFSC.dat
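If a number is wanted instead of a plot, the table can also be read directly. The sketch below is a hypothetical stand-alone Python reader, not a RELION utility: it assumes the table is a loop_ block with one label per line followed by whitespace-separated rows, and it reports the first _rlnResolution value at which _rlnGoldStandardFsc drops below a threshold. The value 0.143 is the criterion commonly used for gold-standard FSC curves and is passed in as a parameter; the sample numbers are made up.

```python
def read_star_table(star_text, table_name):
    """Return the rows of one loop_ table as a list of {label: value} dicts."""
    labels, rows = [], []
    in_table = False
    for raw in star_text.splitlines():
        line = raw.strip()
        if line.startswith("data_"):
            if in_table:
                break  # the next table starts, so the requested one is done
            in_table = (line == table_name)
            continue
        if not in_table or not line or line == "loop_" or line.startswith("#"):
            continue
        if line.startswith("_"):
            labels.append(line.split()[0])  # drop the trailing "#N" column index
        else:
            rows.append(dict(zip(labels, line.split())))
    return rows

def first_crossing(rows, x_label, y_label, threshold):
    """Return x at the first row where y drops below threshold, else None."""
    for row in rows:
        if float(row[y_label]) < threshold:
            return float(row[x_label])
    return None

# Made-up table contents, for illustration only.
sample = """
data_model_class_1

loop_
_rlnResolution #1
_rlnGoldStandardFsc #2
0.05 0.99
0.10 0.80
0.15 0.30
0.20 0.10
"""
rows = read_star_table(sample, "data_model_class_1")
print(first_crossing(rows, "_rlnResolution", "_rlnGoldStandardFsc", 0.143))
```

The value returned is whatever is stored in the _rlnResolution column; check its units in your own files before quoting a resolution.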

Monitor convergence behaviour

The 3D auto-refine option (but not the 2D or 3D classification) writes out general information about convergence (such as the angular sampling used, the estimated accuracy of the angular assignments, the resolution, etc.) in the stdout file. One can get a quick overview using:

grep Auto myrun.out

Monitor changes in particle orientations and class assignments

The changes in optimal (i.e. with the highest posterior probability) class and orientation assignments may be monitored using:

grep rlnChangesOptimalOffsets *
grep rlnChangesOptimalOrientations *
grep rlnChangesOptimalClasses *

Monitor class distribution

Likewise, the distribution of all images over the various classes is stored in the table called data_model_classes in the files. Again, looking in the file itself may be easiest, or alternatively a plot may be made using the relion_star_plottable script:

relion_star_plottable data_model_classes rlnClassDistribution

Angular distribution

The angular distribution for each model is stored in the table data_model_pdf_orient_class_N in the files, with N being the class number. The actual angles (rot and tilt) for these values are stored in the file. A combination of these two files could be used to make fancy plots, but no utility for this has been implemented yet.

Per-image indicators

The files store per-image indicators. Apart from the obvious orientational and class assignments, look out for the following variables:

  • _rlnMaxValueProbDistribution: values close to one indicate that the probability distributions over all orientations and classes have converged to near-delta functions. Values close to zero indicate great uncertainty in the assignments (i.e. near-even distributions). Typically, these values increase over multiple iterations (with constant sampling rates).
  • _rlnLogLikeliContribution: higher values mean better agreement between experimental data and model. This variable is the equivalent of the cross-correlation coefficient or phase residual in other programs. One could make histograms of all values in order to discard particularly bad images.

Linear plots of these values for each image may be made using:

relion_star_plottable data_images _rlnLogLikeliContribution

Select images based on per-image indicators

Sometimes it is useful to select a subset of images for further refinement based on their per-image indicators. For example, one may select all images belonging to a certain class, or all images with a minimum contribution to the LogLikelihood. See the FAQs page for how to do that.
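As an illustration of the logic only, the following hypothetical Python sketch filters per-image rows that have already been read into dictionaries. It is not the procedure from the FAQs page. The label _rlnClassNumber is an assumption here (check the output of relion_refine --print_metadata_labels), and the particle rows are made up.

```python
def select_particles(rows, class_number=None, min_loglikeli=None):
    """Keep rows matching the class and/or the log-likelihood floor."""
    kept = []
    for row in rows:
        # Assumed label for the class assignment: _rlnClassNumber.
        if class_number is not None and int(row["_rlnClassNumber"]) != class_number:
            continue
        if (min_loglikeli is not None
                and float(row["_rlnLogLikeliContribution"]) < min_loglikeli):
            continue
        kept.append(row)
    return kept

# Made-up per-image rows, for illustration only.
particles = [
    {"name": "p1", "_rlnClassNumber": "1", "_rlnLogLikeliContribution": "9500.0"},
    {"name": "p2", "_rlnClassNumber": "2", "_rlnLogLikeliContribution": "9700.0"},
    {"name": "p3", "_rlnClassNumber": "1", "_rlnLogLikeliContribution": "8100.0"},
]

good = select_particles(particles, class_number=1, min_loglikeli=9000.0)
print([row["name"] for row in good])
```

Either criterion can be left as None to apply only the other one.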

Getting higher resolution and map sharpening

We have now observed that the 3D auto-refine procedure in RELION (as published in JSB) somewhat under-estimates the true resolution. This is because in this procedure masking of the real-space maps was omitted in order to completely avoid overfitting. However, as orientational assignments are mostly based on the lower-resolution frequencies, it does not actually matter that the resolution is somewhat under-estimated during the refinement. As long as one determines the true resolution of the reconstruction after refinement, and filters the map correspondingly, top-quality reconstructions may be obtained. For this purpose, as of release 1.2 RELION also writes out unfiltered reconstructions (called *_unfil.mrc) for the two independent half data sets. Upon convergence, the maps of the last iteration may be masked (using other programs) and a gold-standard FSC curve may then be calculated on the masked maps. In this case, care should be taken not to use a mask that is too tight or too sharp, as this may introduce artifacts in the FSC curve, in particular at the high-resolution end (e.g. it may go up again). These "masked" FSC curves may then also be used in the sharpening procedure, as described by Rosenthal and Henderson (2003). For example, we have used the following procedure that exploits xmipp v2.4 in our eLife paper on high-resolution 80S maps:

xmipp_convert_spi22ccp4 -i  relion_it029_half1_class001_unfil.mrc -o half1.spi
xmipp_convert_spi22ccp4 -i  relion_it029_half2_class001_unfil.mrc -o half2.spi
xmipp_mask -i half1.spi -o half1_masked.spi -mask raised_cosine -90 -95
xmipp_mask -i half2.spi -o half2_masked.spi -mask raised_cosine -90 -95
xmipp_resolution_fsc -ref half2_masked.spi -i half1_masked.spi -sam 1.77

xmipp_operate -i half1.spi -plus half2.spi -o sumhalves.spi
xmipp_correct_bfactor -i sumhalves.spi -o sharpened.spi -sampling 1.77 -auto -maxres 4 -fsc half1_masked.spi.frc

The parameters of the mask should be chosen such that it does not cut off any signal, but excludes as much noisy background as possible. Also, the mask should be smooth enough so that no artifacts are introduced in the FSC curve (e.g. rising curves after going through a minimum). The maximum resolution to which to sharpen the map (-maxres 4) depends on the observed FSC curve and on the amount of noise in the sharpened map. One can also control the B-factor itself by using for example -adhoc -500 instead of the -auto argument.

The sharpened map can then be used for fitting atomic models, displaying in Chimera, etc.

For maps that do not extend beyond 10 Angstroms one may apply some ad hoc negative B-factors. However, the effect of B-factor sharpening will be much smaller than for the higher-resolution maps.
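To get a feeling for these numbers, the sketch below evaluates the per-shell amplitude scaling that is generally associated with the sharpening approach of Rosenthal and Henderson (2003): a factor exp(-B/(4 d^2)) for a shell at resolution d (in Angstroms), multiplied by an FSC-based weight sqrt(2 FSC/(1 + FSC)). This is an assumption about the general form of the weighting, not a transcription of what xmipp_correct_bfactor does internally, and the function name is hypothetical.

```python
import math

def sharpening_weight(resolution_angstrom, fsc, bfactor):
    """Amplitude scale factor for one resolution shell (assumed general form)."""
    # FSC-based weight; shells with no positive correlation are zeroed.
    c_ref = math.sqrt(2.0 * fsc / (1.0 + fsc)) if fsc > 0.0 else 0.0
    # A negative B-factor boosts the amplitudes, more strongly at small d.
    return c_ref * math.exp(-bfactor / (4.0 * resolution_angstrom ** 2))

# Same ad hoc B-factor of -500 applied at two resolutions, ignoring the FSC weight.
for d in (10.0, 4.0):
    print(d, sharpening_weight(d, 1.0, -500.0))
```

With B = -500 the boost at 10 Angstroms is exp(1.25), a factor of about 3.5, whereas at 4 Angstroms it is exp(7.8125), a factor of well over two thousand. This is why sharpening changes low-resolution maps much less, and why noise at the high-resolution end limits the choice of -maxres.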