Analyse results: Difference between revisions

Line 97: Line 97:
= Getting higher resolution and map sharpening=
= Getting higher resolution and map sharpening=


As of release 1.2. RELION has a program for semi-automated map postprocessing, called \verb{relion_postprocess}. It may be used after a 3D auto-refine calculation for automated masking, MTF-correction and B-factor sharpening. Because the FSC calculations inside the 3D auto-refine procedure depend on unmasked maps (to avoid potential overfitting), the auto-refine procedure may somewhat under-estimate the true resolution. Because orientational assignments are mostly determined by the medium-low resolution frequencies, the final result is typically not much affected by this under-estimation. On the contrary, for many data sets keeping overfitting at bay is more important in squeezing out the most signal from the data. Therefore, as long as one determines the true resolution of the reconstruction after refinement, and filters the map correspondingly, top-quality reconstructions may be obtained. To this purpose, as of release 1.2 RELION also writes out unfiltered reconstructions that are calculated all the way out to the Nyquist frequency (called *_unfil.mrc) for the two independent half data sets. These maps are read by the relion_postprocess program and used for masking and sharpening. The procedure consists of several consecutive steps:
As of release 1.2, RELION has a program for semi-automated map postprocessing, called <code>relion_postprocess</code>. It may be used after a 3D auto-refine calculation for automated masking, MTF-correction and B-factor sharpening. Because the FSC calculations inside the 3D auto-refine procedure depend on unmasked maps (to avoid potential overfitting), the auto-refine procedure may somewhat under-estimate the true resolution. Because orientational assignments are mostly determined by the medium-low resolution frequencies, the final result is typically not much affected by this under-estimation. On the contrary, for many data sets keeping overfitting at bay is more important in squeezing out the most signal from the data. Therefore, as long as one determines the true resolution of the reconstruction after refinement, and filters the map correspondingly, top-quality reconstructions may be obtained. To this purpose, as of release 1.2 RELION also writes out unfiltered reconstructions that are calculated all the way out to the Nyquist frequency (called <code>*_unfil.mrc</code>) for the two independent half data sets. These maps are read by the <code>relion_postprocess</code> program and used for masking and sharpening. The procedure consists of several consecutive steps:


* Automatically determine a mask from the reconstructions (using the --auto_mask option, type relion_postprocess without arguments to see a complete list of options) or provide your own mask (using --mask)
* Automatically determine a mask from the reconstructions (using the <code>--auto_mask</code> option; type <code>relion_postprocess</code> without arguments to see a complete list of options) or provide your own mask (using <code>--mask</code>).


It will re-estimate resolution after masking (because
* Mask both half-reconstructions, and calculate the Fourier Shell Correlation curve between them.
inside RELION refinement the resolution is always estimated based on gold-
standard FSCs BEFORE masking, which slightly under-estimates the true res-
olution) by a procedure described in Chen et al., Ultramicroscopy, in press. To
run it, just type:
relion_postprocess --i Refine3D/run1 --o Refine3D/postprocess_run1
--angpix 3.54 --mtf mtf_falcon_300kV.star --auto_mask --auto_bfac
The output of relion_postprocess are MRC maps for the sharpened and
optimally filtered map (both masked and unmasked), as well as for the mask
itself. The maps may be directly used for visualization and model building in
your favourite 3D viewer. Remember to also look at your map in slices: do you
still see indications for unresolved heterogeneity? If so, see section 6 below for
further classification tricks. The estimated FSC curves are stored in a STAR file,
and the frequency where the rlnFourierShellCorrelationCorrected drops
below 0.143 is defined as the estimated resolution [3].


We have now observed that the 3D auto-refine procedure in RELION (as published in [http://dx.doi.org/10.1016/j.jsb.2012.09.006 JSB]) somewhat '''under-estimates the true resolution'''. This is because in this procedure masking of the real-space maps was omitted in order to completely avoid overfitting. However, as , it actually does not matter than during the refinement the resolution is somewhat under-estimated.To this purpose, as of release 1.2 RELION also writes out unfiltered reconstructions (called *_unfil.mrc) for the two independent half data sets. Upon convergence, the maps of the last iteration may be masked (using other programs) and then a gold-standard FSC curve may be calculated on the masked maps. In this case, care should be taken not to use a too tight mask or a very sharp mask, as this may introduce artifacts in the FSC curve, in particular at the high-resolution end (e.g. it may go up again). These "masked" FSC-curves may then also be used in the sharpening procedure, as describe by [http://dx.doi.org/10.1016/j.jmb.2003.07.013 Rosenthal and Henderson (2003)]. For example, we have used the following procedure that exploits xmipp v2.4 in our eLife paper on high-resolution 80S maps:
* Measure the inflating effect that the mask may have on the FSC curve. To this purpose, we use a method called high-resolution noise substitution (Shaoxia Chen et al., ''Ultramicroscopy'', in press). It works as follows: randomize the phases of the two unfiltered maps before masking; mask these two scrambled maps with the same mask; calculate an FSC curve between them; and output a masking-effect-corrected FSC curve (called <code>rlnFourierShellCorrelationCorrected</code>) by assessing the difference between the unscrambled and scrambled FSC curves, as described in the Chen et al. paper.


<code>
* Correct for the MTF of the detector. This curve should be provided in the format of a STAR file. It should look like:
xmipp_convert_spi22ccp4 -i  relion_it029_half1_class001_unfil.mrc -o half1.spi
xmipp_convert_spi22ccp4 -i  relion_it029_half2_class001_unfil.mrc -o half2.spi
xmipp_mask -i half1.spi -o half1_masked.spi -mask raised_cosine -90 -95
xmipp_mask -i half2.spi -o half2_masked.spi -mask raised_cosine -90 -95
xmipp_resolution_fsc -ref half2_masked.spi -i half1_masked.spi -sam 1.77
xmipp_operate -i half1.spi -plus half2.spi -o sumhalves.spi
xmipp_correct_bfactor -i sumhalves.spi -o sharpened.spi -sampling 1.77 -auto -maxres 4 -fsc half1_masked.spi.frc
</code>


The parameters of the mask should be chosen such that it does not cut off any signal, but excludes as much noisy background as possible. Also, the mask should be smooth enough so that no artifacts are introduced in the FSC curve (e.g. rising curves after going through a minimum). The maximum resolution to which to sharpen the map (<code>-maxres 4</code>) depends on the observed FSC curve and on the amount of noise in the sharpened map. One can also control the B-factor itself by using for example <code>-adhoc -500</code> instead of the <code>-auto</code> argument.
data_
loop_
_rlnResolutionInversePixel
_rlnMtfValue
0.0  1.000
0.0005 0.999939
0.001 0.999755
...
0.499 0.052100
0.4995 0.051970
0.5 0.051841


The sharpened map can then be used for fitting atomic models, displaying in Chimera, etc.
* Apply a B-factor sharpening to the map. For maps with resolutions significantly beyond 10 Angstrom one can use the automated procedure as described by [http://dx.doi.org/10.1016/j.jmb.2003.07.013 Rosenthal and Henderson (2003)] (using the <code>--auto_bfac</code> option). Alternatively, one may provide a user-determined (negative!) B-factor using <code>--adhoc_bfac</code>.


For maps that do not extend beyond 10 Angstroms one may apply some adhoc negative B-factors. However, the effect of B-factor sharpening will be much smaller than for the higher resolution maps.
* By default, the program will use the corrected FSC curve to do FSC-based low-pass filtering, but this may be switched off by using the <code>--skip_fsc_weighting</code> option, most likely in combination with a user-determined low-pass filter frequency provided as <code>--low_pass</code>. The soft edge of the latter will be a raised cosine function (with a tunable width of <code>--filter_edge_width</code> resolution shells).
 
Typical usage of the program would be:
 
relion_postprocess --i Refine3D/run1 --o Refine3D/postprocess_run1 --angpix 1.77 --mtf mtf_falcon_300kV.star --auto_mask --auto_bfac
 
The resulting map can typically be used directly for fitting atomic models, displaying in Chimera, etc.

Revision as of 10:35, 19 June 2013

Output files

For every iteration, RELION will output the following files:

  • rootname_it???_class???.mrc (or rootname_it???_class???.mrcs for 2D refinements) with the images of the refined 2D/3D structures.
  • rootname_it???_optimiser.star with general information about the optimisation process.
  • rootname_it???_model.star with information about the refined model parameters apart from the images (e.g. the noise spectra, the spherical average of the signal-to-noise ratios in the reconstructed structures, the distribution of the images over the classes, the angular distributions, etc.).
  • rootname_it???_data.star with, for each particle, information about its CTF, optimal orientation, translation and class assignment, normalisation correction, height of the (normalised) probability distribution, etc.
  • rootname_it???_sampling.star with information about the angular and translational sampling.


Note that for the 3D auto-refine procedure a gold-standard FSC procedure is employed, where two models are refined independently. This will lead to the following model and class files:

  • rootname_it???_half1_class001.mrc and rootname_it???_half2_class001.mrc
  • rootname_it???_half1_model.star and rootname_it???_half2_model.star

Only upon convergence of the 3D auto-refine procedure is a single reconstruction made from all particles. This reconstruction and the corresponding model.star file are called:

  • rootname_class001.mrc
  • rootname_model.star

After this final reconstruction, no optimiser.star file is written out, because this reconstruction may no longer be used in further refinement. If your 3D auto-refine run has not produced the final model files (i.e. without the "it???" statement), then it has not finished yet. Check your stderr and stdout to see what has happened and re-start the refinement from the last performed iteration using the "Continue old run" option from the top of the GUI. The correct resolution of your final map is as indicated in the rootname_model.star file, NOT as in the rootname_it???_half?_model.star files!

What to look out for

There are many things one may want to monitor. The most common indicators to look out for are given below. For a complete list of all metadata labels, on the command-line type: relion_refine --print_metadata_labels.

Monitor resolution

The highest resolution for which at least one of the models has SSNR^MAP>1 is stored as _rlnCurrentResolution in the _model.star files. Therefore, progress in terms of resolution may be monitored using:

grep rlnCurrentResolution *model.star

The fall-off of SSNR^MAP with resolution for each model is stored in the tables called data_model_class_N (with N being the number of the corresponding class) in the _model.star files. It is often insightful to have a look at these tables in the files themselves. Alternatively, one may use the STAR file utilities to make a plot (requires gnuplot to be installed), using:

relion_star_plottable rootname_it025_model.star data_model_class_1 rlnSsnrMap rlnResolution

It may be better, however, to rerun gnuplot and redraw the resulting plot with a limited Y-range:

gnuplot
gnuplot> set yrange [0:20]
gnuplot> load "gnuplot.plt"

Alternatively, one may use the relion_star_printtable script to make a text file with any table for visualization in an alternative plotting program. For example, one could visualize the final gold-standard FSC curve from a 3D auto-refine procedure using:

relion_star_printtable rootname_model.star data_model_class_1 rlnResolution rlnGoldStandardFsc  >  goldstandardFSC.dat
xmgrace goldstandardFSC.dat
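
As an alternative to reading the crossing point off a plot, the two-column output of relion_star_printtable can be scanned numerically. The following is a minimal sketch, not part of RELION: the function names are hypothetical, and it assumes that the first column holds the spatial frequency (taken here to be in 1/Angstrom) and the second the FSC value. It linearly interpolates the frequency at which the curve first drops below the 0.143 threshold.

```python
def parse_two_columns(text):
    """Parse whitespace-separated (frequency, FSC) pairs, skipping non-numeric lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            rows.append((float(parts[0]), float(parts[1])))
        except ValueError:
            continue  # header or comment line
    return rows


def fsc_crossing(rows, threshold=0.143):
    """Return the interpolated frequency where the FSC first drops below threshold.

    Returns None if the curve never drops below the threshold.
    """
    for (x0, y0), (x1, y1) in zip(rows, rows[1:]):
        if y0 >= threshold > y1:
            frac = (y0 - threshold) / (y0 - y1)
            return x0 + frac * (x1 - x0)
    return None


if __name__ == "__main__":
    # Made-up example values, standing in for the contents of goldstandardFSC.dat.
    sample = "0.05 0.99\n0.10 0.80\n0.15 0.30\n0.20 0.05\n"
    freq = fsc_crossing(parse_two_columns(sample))
    if freq is not None:
        print("FSC=0.143 at %.4f 1/A, i.e. %.2f A" % (freq, 1.0 / freq))
```

To use it on a real curve, read goldstandardFSC.dat into a string and pass it to parse_two_columns.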

Monitor convergence behaviour

The 3D auto-refine option (but not the 2D or 3D classification) writes out general information about convergence (such as angular sampling used, estimated accuracy of the angular assignments, resolution, etc) in the stdout file. One can get a quick overview using:

grep Auto myrun.out

Monitor changes in particle orientations and class assignments

The changes in optimal (i.e. with the highest posterior probability) class and orientation assignments may be monitored using:

grep rlnChangesOptimalOffsets *model.star
grep rlnChangesOptimalOrientations *model.star
grep rlnChangesOptimalClasses *model.star

Monitor class distribution

Likewise, the distribution of all images over the various classes is stored in the table called data_model_classes in the _model.star files. Again, looking in the file itself may be easiest, or alternatively a plot may be made using the relion_star_plottable script:

relion_star_plottable rootname_it025_model.star data_model_classes rlnClassDistribution

Angular distribution

The angular distribution for each model is stored in the table data_model_pdf_orient_class_N in the _model.star files, with N being the class number. The actual angles (rot and tilt) for these values are stored in the rootname_sampling.star file. A combination of these two files could be used to make fancy plots, but no utility for this has been implemented yet.

Per-image indicators

The _data.star files store per-image indicators. Apart from the obvious orientational and class assignments, look out for the following variables:

  • _rlnMaxValueProbDistribution: values close to one indicate that the probability distributions over all orientations and classes have converged to near-delta functions. Values close to zero indicate great uncertainty in the assignments (i.e. near-even distributions). Typically, these values increase during multiple iterations (with constant sampling rates).
  • _rlnLogLikeliContribution: higher values mean better agreement between experimental data and model. This variable is the equivalent of the cross-correlation coefficient or phase residual in other programs. One could make histograms of all values in order to discard particularly bad images.

Linear plots of these values for each image may be made using:

relion_star_plottable rootname_it025_data.star data_images _rlnLogLikeliContribution

Select images based on per-image indicators

Sometimes it is useful to select a subset of images for further refinement based on their per-image indicators. For example, all images belonging to a certain class, or all images with a minimum contribution to the LogLikelihood. See the FAQs page for how to do that.
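
For illustration only, such a selection could also be scripted outside RELION. The sketch below is a hypothetical helper, not the procedure from the FAQs page: it assumes a _data.star file that contains a single loop_ table with whitespace-separated columns, and keeps only the rows whose value for a given label is at least a chosen minimum.

```python
def filter_star_rows(star_text, label, min_value):
    """Keep only the data rows of a single-table STAR file where label >= min_value.

    Assumes one loop_ block whose label lines start with '_' and whose
    data rows are whitespace-separated (an assumption of this sketch).
    """
    header, labels, kept = [], [], []
    in_loop = False
    for line in star_text.splitlines():
        stripped = line.strip()
        if not in_loop:
            header.append(line)
            if stripped == "loop_":
                in_loop = True
        elif stripped.startswith("_"):
            # Label lines may carry a trailing column index such as "#3".
            labels.append(stripped.split()[0])
            header.append(line)
        elif stripped:
            fields = stripped.split()
            if float(fields[labels.index(label)]) >= min_value:
                kept.append(line)
    return "\n".join(header + kept) + "\n"


if __name__ == "__main__":
    # Made-up particle names and values, for demonstration only.
    example = (
        "data_images\n\nloop_\n"
        "_rlnImageName\n_rlnMaxValueProbDistribution\n"
        "000001@stack.mrcs 0.95\n"
        "000002@stack.mrcs 0.10\n"
    )
    print(filter_star_rows(example, "_rlnMaxValueProbDistribution", 0.5))
```

The threshold itself is a judgment call; a histogram of the chosen indicator is a sensible starting point for picking it.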

Getting higher resolution and map sharpening

As of release 1.2, RELION has a program for semi-automated map postprocessing, called relion_postprocess. It may be used after a 3D auto-refine calculation for automated masking, MTF-correction and B-factor sharpening. Because the FSC calculations inside the 3D auto-refine procedure depend on unmasked maps (to avoid potential overfitting), the auto-refine procedure may somewhat under-estimate the true resolution. Because orientational assignments are mostly determined by the medium-low resolution frequencies, the final result is typically not much affected by this under-estimation. On the contrary, for many data sets keeping overfitting at bay is more important in squeezing out the most signal from the data. Therefore, as long as one determines the true resolution of the reconstruction after refinement, and filters the map correspondingly, top-quality reconstructions may be obtained. To this purpose, as of release 1.2 RELION also writes out unfiltered reconstructions that are calculated all the way out to the Nyquist frequency (called *_unfil.mrc) for the two independent half data sets. These maps are read by the relion_postprocess program and used for masking and sharpening. The procedure consists of several consecutive steps:

  • Automatically determine a mask from the reconstructions (using the --auto_mask option; type relion_postprocess without arguments to see a complete list of options) or provide your own mask (using --mask).
  • Mask both half-reconstructions, and calculate the Fourier Shell Correlation curve between them.
  • Measure the inflating effect that the mask may have on the FSC curve. To this purpose, we use a method called high-resolution noise substitution (Shaoxia Chen et al., Ultramicroscopy, in press). It works as follows: randomize the phases of the two unfiltered maps before masking; mask these two scrambled maps with the same mask; calculate an FSC curve between them; and output a masking-effect-corrected FSC curve (called rlnFourierShellCorrelationCorrected) by assessing the difference between the unscrambled and scrambled FSC curves, as described in the Chen et al. paper.
  • Correct for the MTF of the detector. This curve should be provided in the format of a STAR file. It should look like:
data_
loop_
_rlnResolutionInversePixel
_rlnMtfValue
0.0  1.000
0.0005 0.999939
0.001 0.999755
...
0.499 0.052100
0.4995 0.051970
0.5 0.051841
  • Apply a B-factor sharpening to the map. For maps with resolutions significantly beyond 10 Angstrom one can use the automated procedure as described by Rosenthal and Henderson (2003), using the --auto_bfac option. Alternatively, one may provide a user-determined (negative!) B-factor using --adhoc_bfac.
  • By default, the program will use the corrected FSC curve to do FSC-based low-pass filtering, but this may be switched off by using the --skip_fsc_weighting option, most likely in combination with a user-determined low-pass filter frequency provided as --low_pass. The soft edge of the latter will be a raised cosine function (with a tunable width of --filter_edge_width resolution shells).
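
The noise-substitution step in the list above can be sketched per resolution shell. This is an illustration rather than the relion_postprocess implementation: it uses the correction as it is commonly quoted from the Chen et al. paper, FSC_corrected = (FSC_masked - FSC_randomized) / (1 - FSC_randomized), and the first_random_shell parameter (the shell from which phases were randomized) is a hypothetical name introduced here.

```python
def correct_fsc(fsc_masked, fsc_randomized, first_random_shell=0):
    """Correct a masked FSC curve for the inflating effect of the mask.

    fsc_masked:      FSC between the masked half-maps, one value per shell.
    fsc_randomized:  FSC between the masked, phase-randomized half-maps.
    Shells below first_random_shell are returned unchanged, because their
    phases were not randomized.
    """
    corrected = []
    for shell, (t, n) in enumerate(zip(fsc_masked, fsc_randomized)):
        if shell < first_random_shell or n >= 1.0:
            corrected.append(t)  # nothing to correct, or correction undefined
        else:
            corrected.append((t - n) / (1.0 - n))
    return corrected


if __name__ == "__main__":
    # Made-up curves: the mask adds spurious correlation in the last shells.
    masked = [0.99, 0.90, 0.50, 0.20]
    randomized = [0.00, 0.00, 0.20, 0.15]
    print(correct_fsc(masked, randomized, first_random_shell=2))
```

Where the randomized curve stays near zero the mask has little effect and the corrected curve follows the masked one; a large gap between the two suggests the mask is too tight or too sharp.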

Typical usage of the program would be:

relion_postprocess --i Refine3D/run1 --o Refine3D/postprocess_run1 --angpix 1.77 --mtf mtf_falcon_300kV.star --auto_mask --auto_bfac

The resulting map can typically be used directly for fitting atomic models, displaying in Chimera, etc.
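
To give a feel for the MTF-correction and B-factor steps, the sketch below computes the amplitude weight applied at a single spatial frequency. It is a simplified illustration under stated assumptions, not the program's code: the function name is hypothetical, the B-factor weight is taken as exp(-(B/4) s^2) with s in 1/Angstrom (so that a negative B boosts high frequencies), and MTF correction is taken to be division by the MTF value at that frequency. Frequencies in the MTF STAR file are in inverse pixels and would need dividing by the pixel size in Angstrom first.

```python
import math


def sharpening_weight(freq_inv_angstrom, bfactor, mtf_value=1.0):
    """Amplitude weight at one spatial frequency.

    freq_inv_angstrom: spatial frequency s in 1/Angstrom.
    bfactor:           B-factor in Angstrom^2 (negative values sharpen).
    mtf_value:         detector MTF at this frequency (1.0 means no correction).
    """
    return math.exp(-(bfactor / 4.0) * freq_inv_angstrom ** 2) / mtf_value


if __name__ == "__main__":
    # Example: ad hoc B-factor of -100 A^2 at 4 A resolution (s = 0.25 1/A).
    print(sharpening_weight(0.25, -100.0))
```

Because the weight grows quickly with frequency, sharpening also amplifies high-resolution noise, which is why it is combined with the FSC-based low-pass filtering described above.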