Process movies

Get organised

Organise all your movies and the average micrographs (i.e. single-image files that are the average of all frames of your individual movies) inside one or more directories within your RELION project directory (the directory from which you launch the GUI). We like to call these directories "Micrographs/" if all micrographs are in one directory, or "Micrographs/15jan13/" and "Micrographs/23jan13/" if they are in different directories (e.g. because they were collected on different dates). However, you can organise your data as you like.

Name the average micrographs and movies in the following way:


001.mrc
001_movie.mrcs
24.01.13.15.04.mrc
24.01.13.15.04_movie.mrcs
myverybest.mrc
myverybest_movie.mrcs
...


Note that all average micrographs are in MRC format with an .mrc extension, and all movies are MRC stacks with a .mrcs extension. It is important that movies have the same rootname (e.g. 001, or 24.01.13.15.04, or whatever you like) as their corresponding average micrographs, plus a movie-identifier (in this case "movie") that follows the rootname after an additional "_" character.
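
The naming convention above is easy to check before starting a long run. The following is a minimal sketch in Python, not part of RELION: the function names movie_name_for and find_unpaired are hypothetical, and the default movie-identifier "movie" is taken from the example above.

```python
from pathlib import PurePosixPath


def movie_name_for(micrograph, movie_id="movie"):
    """Return the movie file name expected for an average micrograph."""
    path = PurePosixPath(micrograph)
    if path.suffix != ".mrc":
        raise ValueError(f"not an .mrc micrograph: {micrograph}")
    # Strip only the final ".mrc": rootnames such as 24.01.13.15.04 contain dots.
    rootname = path.name[: -len(".mrc")]
    return str(path.with_name(f"{rootname}_{movie_id}.mrcs"))


def find_unpaired(filenames, movie_id="movie"):
    """List the average micrographs whose matching movie is absent."""
    names = set(filenames)
    micrographs = [name for name in names if name.endswith(".mrc")]
    return sorted(m for m in micrographs
                  if movie_name_for(m, movie_id) not in names)


if __name__ == "__main__":
    listing = ["001.mrc", "001_movie.mrcs", "myverybest.mrc"]
    print(find_unpaired(listing))
```

Any micrograph reported by find_unpaired would have no movie that the program could associate with it under this convention.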

What to do first

First, use the average micrographs to extract "normal" particle stacks, and follow the Recommended_procedures for 3D refinement.

Note that in RELION-2.0, the movie-processing procedures have been much simplified. Run as usual with the average micrographs, and then use the "Movie-refinement" job-type on the GUI to extract the movie-particles and perform the continuation of the 3D auto-refine run in one go. This may then be followed by particle polishing as explained below.

Preprocessing movies

Then, extract (box out) the individual movie-frames of all particles from the original micrograph movies. To do this, go back to the Preprocessing run-type in the RELION GUI, and:

  * Keep everything on the I/O tab as it was when you extracted the "normal" particles (also do NOT change the rootname)
  * Set "Run Ctffind3?" to No
  * Set "Extract particles from micrographs?" to Yes
  * Set "Use movies instead of micrographs?" to Yes
  * Set "Rootname of movie files" to the movie-identifier mentioned above (in the example above "movie")
  * Set "Number of frames to average?" to any desired number from 1 to half the number of frames in your movies
  * Keep everything on the operate tab as it was when you extracted the "normal" particles   
  * Let it run... !

(By this time you will probably already be aware of disc quotas, full discs, etc.) For each movie, a separate stack with all movie frames of all particles in that micrograph will be stored alongside the extracted "normal" particle stacks in the Particles directory. All metadata regarding these particle-movies will be stored in a large STAR file in your project directory that is called particles_movie.star (where "particles" is the Particle rootname given on the I/O tab, and "movie" is your movie-identifier).
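
To get a feeling for the disc space involved, one can make a rough estimate before extracting. The sketch below is not a RELION tool and rests on two assumptions that should be checked against your own data: that pixels are stored as 4-byte floats, and that averaging N frames reduces the number of stored frames per particle by a factor of N. The function name movie_particle_bytes is hypothetical, and file headers are ignored.

```python
def movie_particle_bytes(n_particles, n_frames, frames_to_average,
                         box_size, bytes_per_pixel=4):
    """Rough size in bytes of all extracted movie-particle stacks.

    Assumes (not confirmed by the text above) that averaging groups of
    `frames_to_average` frames leaves n_frames // frames_to_average
    images per particle, each box_size x box_size pixels.
    """
    if frames_to_average < 1:
        raise ValueError("frames_to_average must be at least 1")
    frames_kept = n_frames // frames_to_average
    return n_particles * frames_kept * box_size * box_size * bytes_per_pixel


if __name__ == "__main__":
    size = movie_particle_bytes(n_particles=100000, n_frames=16,
                                frames_to_average=1, box_size=200)
    print(f"{size / 1e9:.0f} GB")
```

Under these assumptions, 100,000 particles in 200-pixel boxes with 16 unaveraged frames already come to roughly 256 GB.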

3D refinement with movies

Finally, go to the 3D auto-refine run-type in the GUI and load the settings of the run that you want to perform movie-processing on. Note that this previous 3D auto-refine run was done with (most likely a classified subset of) the average-micrograph particles. As long as the naming conventions outlined above are maintained, the program will work out by itself which movie particles correspond to which average particle in the 3D auto-refine. In the 3D auto-refine GUI:

  * Select "continue old run" from the pull-down menu on the top.
  * Possibly change the "Output rootname" (also see below)
  * Select the last optimiser.star file from the run at the "Continue from here" option
  * Under the Optimisation tab, set "Realign movie frames?" to Yes.
  * Set "Input movie frames" to the particles_movie.star file generated in the section above.
  * Select suitable parameters for "Running avg window" and the standard deviations of the priors on the rotations and translations. 
  * If you plan to use these movies for subsequent particle polishing (see below), then you can skip the rotational searches
  * Keep everything on the CTF and Sampling tabs unchanged
  * Let it run... !

Regarding the parameter choices on the Movie tab: the defaults worked well for our ribosome data, but this will depend strongly on your total dose, how much your particles move etc. One can run this procedure multiple times (if you change the "Output rootname" on the I/O tab, previous results will not be overwritten). You can then select optimal parameter settings based on the gold-standard FSC curve.

Note that in many cases previous 2D and 3D classification has led to a much smaller subset of the particles being present in the auto-refinement from which one continues the movie refinement than the original set in the particles_movie.star. Therefore, one can significantly accelerate the expansion of the movie-frames step in the movie refinement by providing a particles_movie_subset.star file which contains only movie frames of that subset. For this we use the make_movie_subset.csh script. Just execute this as follows from the command line:

./make_movie_subset.csh Refine3D/run1_data.star particles_movie.star particles_movie_subset.star

And then select the much smaller particles_movie_subset.star file instead of the particles_movie.star file in the movie-refinement GUI.
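
The idea behind such a subset is simply to keep only those movie-frame entries whose parent particle survived classification. The sketch below illustrates that idea on plain Python data; it is not the implementation of make_movie_subset.csh, it does not parse STAR files, and the dictionary key "original_particle" is a hypothetical stand-in for whatever column links a movie frame to its average particle.

```python
def select_movie_frames(movie_rows, selected_particles):
    """Keep the movie-frame rows whose parent particle is in the selection.

    movie_rows: list of dicts, each with an "original_particle" entry
                (hypothetical key naming the average particle).
    selected_particles: names of the particles present in the refinement.
    """
    keep = set(selected_particles)  # set lookup keeps this linear in the rows
    return [row for row in movie_rows if row["original_particle"] in keep]


if __name__ == "__main__":
    rows = [
        {"original_particle": "p1", "frame": 1},
        {"original_particle": "p1", "frame": 2},
        {"original_particle": "p2", "frame": 1},
    ]
    print(select_movie_frames(rows, ["p1"]))
```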

Particle polishing

For relatively small particles (e.g. sub-MegaDalton), following beam-induced movements by aligning particles in running averages of a few movie frames may become very noisy. To make this procedure more robust, a so-called particle-polishing procedure was implemented in RELION-1.3. It improves the processing in three ways:

  1. Linear movement tracks are fitted through the (possibly very noisy) movement tracks as determined by the original movie-refinement (as outlined above). To further increase robustness to noisy tracks, multiple neighbouring particles on the same micrograph are considered simultaneously in this fitting. This reduces noise, since neighbouring particles are often observed to move in similar directions. To still be able to model complicated movement patterns, where particles in different areas of the micrograph move in different directions, the user defines a standard deviation of the particle-distance, which is used to calculate a Gaussian weight on the least-squares fit of these multiple particles. Thereby, particles far away from each other contribute little to each other's fitted tracks, whereas particles within 1-2 standard deviations of each other still contribute significantly to each other's fitted tracks.
  2. A resolution- and dose-dependent weighting scheme is devised to model unresolved beam-induced movements and radiation damage. Radiation damage affects the high-resolution components at much lower electron dose than the low-resolution components. Consequently, whereas the high-resolution signal in the later frames may be completely gone, these frames may still contribute positively to the low-resolution signal (and thereby the orientability of the particles). The resolution-dependent decay of each movie frame is modelled by a B-factor, which is estimated from the gold-standard FSC between individual movie-frame reconstructions. These FSCs are converted into relative Guinier plots, which show the logarithm of the amplitudes of the individual-frame reconstructions divided by the amplitudes of the average reconstruction from all frames, plotted against the square of the spatial frequency (1/d^2, with d the resolution in Angstroms). Linear fits through these Guinier plots (which may often be performed for resolutions higher than 20 Angstroms) then yield a slope (which determines the B-factor) and an intercept (a resolution-independent scale-factor), which are used to devise the resolution-dependent weighting scheme. Often, the B-factors are relatively large during the first few electrons per squared Angstrom of dose due to unresolved, large initial beam-induced movements, and then also become relatively large again in the later frames when radiation damage sets in. A similar behaviour is also often seen for the intercepts, although sometimes for those the initial frames are as good as the middle ones.
  3. All movie-frames for each particle are then summed together, taking the fitted movement tracks (of beam-induced translations) as well as the resolution-dependent weights into account. The result is a set of shiny or polished particles, which have increased signal-to-noise ratios compared to the particles that were extracted from the average micrographs of the original movies. Re-classification of these polished particles (in 2D or in 3D) may often work better than with the original, averaged particles, such that the final classified data set may be further improved.
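
The Gaussian-weighted track fitting of step 1 can be illustrated with a small sketch. This is a simplified illustration under stated assumptions, not RELION's actual code: it fits one coordinate (for example the x-shift) at a time, pools the raw per-frame shifts of all particles into a single weighted least-squares line, and uses hypothetical function names throughout.

```python
import math


def gaussian_weight(distance, sigma):
    """Weight of a neighbour at `distance` for a standard deviation `sigma`."""
    return math.exp(-0.5 * (distance / sigma) ** 2)


def weighted_line_fit(points):
    """Weighted least-squares line through (t, value, weight) triples.

    Returns (intercept, slope).
    """
    points = list(points)
    sw = sum(w for _, _, w in points)
    st = sum(w * t for t, _, w in points)
    sy = sum(w * y for _, y, w in points)
    stt = sum(w * t * t for t, _, w in points)
    sty = sum(w * t * y for t, y, w in points)
    denom = sw * stt - st * st
    if denom == 0:
        raise ValueError("need at least two distinct frame numbers")
    slope = (sw * sty - st * sy) / denom
    intercept = (sy - slope * st) / sw
    return intercept, slope


def fit_track(target_xy, tracks, sigma):
    """Fit a linear track for the particle at target_xy.

    tracks: list of ((x, y), shifts), with (x, y) the particle position on
            the micrograph in pixels and shifts one value per movie frame.
    Every particle contributes, weighted by its distance to the target.
    """
    points = []
    for (x, y), shifts in tracks:
        distance = math.hypot(x - target_xy[0], y - target_xy[1])
        weight = gaussian_weight(distance, sigma)
        points.extend((frame, shift, weight)
                      for frame, shift in enumerate(shifts))
    return weighted_line_fit(points)
```

With a standard deviation of 100 pixels, a neighbour 100 pixels away enters with weight exp(-0.5), about 0.61, whereas one 1000 pixels away has essentially no influence on the fit.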

In practice, the procedure is performed on the output _data.star file from the 3D movie-refinement above (e.g. Refine3D/run1_ct22_data.star). Note that when performing the original 3D movie-refinement, the rotational searches may be skipped, since in the polishing procedure beam-induced rotations will be ignored in the summation of the aligned movie frames anyway. If one skips the rotational searches, the 3D movie-refinement will be much quicker, and by default the maximization (i.e. reconstruction) step is also skipped, which further accelerates the procedure. In that case, only the _data.star file is output at the end of the 3D movie-refinement, and it is this file that is to be input in the particle polishing job-type window.

The following parameters are filled in on the GUI:

  • input STAR file: the output _data.star from the original 3D movie-refinement run. (PS: the one WITHOUT _it??? in its name!)
  • Mask for reconstructions: it is often good to provide the same mask that was calculated in the postprocessing of the map prior to movie-refinement. This mask is applied to the half-set individual-frame reconstructions, and the masked, gold-standard FSCs are then converted into the relative Guinier plots. Suitable masks yield gold-standard FSCs that extend to higher resolutions, and thus result in a wider range of resolution for fitting a straight line through the relative Guinier plot.
  • output rootname: Note that for each different output rootname, a copy of the particles stacks with this name will be created in the Particles/ directory. To save disc space, be careful not to copy your data too many times.
  • running average window: just re-provide the same value you gave for the 3D movie-processing refinement
  • stddev on particle distance: small values (e.g. 100 pixels) allow for complicated beam-induced movement patterns to be modelled accurately, although averaging over more particles with larger values (e.g. 300 pixels) may increase robustness to noise in the movement tracks from the original 3D movie-refinement.
  • High-res limit per-frame maps: by default the individual-frame reconstructions will be calculated until Nyquist. This may take considerable amounts of CPU time and RAM, whereas in many cases single-frame reconstructions do not reach Nyquist anyway. To decrease computational costs, one may provide a user-defined, fixed high-resolution limit (e.g. 6 Angstroms) to which the single-frame reconstructions are limited.
  • Low-res limit B-factor estimation: The per-frame B-factors are calculated by fitting straight lines through so-called relative Guinier plots (see above). The way these relative Guinier plots are calculated often results in useful linear regions to resolutions as low as 20 Angstroms. Manual inspection of these curves (which are stored in frame0??_guinier.star) is often useful to confirm that fitting a straight line through the specified range in the relative Guinier plot is indeed useful.
  • Average frames B-factor estimation: In some cases, especially when the resolution of the reconstructions isn't very high, the B-factor plot looks very ugly when B-factors are estimated from reconstructions of individual movie-frames. This can be caused by a mere lack of signal-to-noise in the individual movie frames. In such cases, one may get a much improved B-factor plot by calculating the B-factors from running averages of 3 or more frames (always use an odd value). The downside to this is that rapid changes in B-factors, for example during the first few movie frames where movements are very rapid, can no longer be modelled. Therefore, this option is usually only applied when needed (i.e. after a run with a value of 1 gave sub-optimal B-factors).
  • Additional arguments: Sometimes it is useful to calculate reconstructions as running averages of 3 or 5 movie frames, instead of performing the B-factor estimation on individual-frame reconstructions. This is not implemented on the GUI, but can be done by providing for example --bfactor_running_avg 3 as an additional argument.
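
The straight-line fit through a relative Guinier plot can be sketched as follows. This is an illustration only, not RELION's implementation: it assumes the plot is the natural logarithm of the amplitude ratio against 1/d^2 (d being the resolution in Angstroms), takes the B-factor as 4 times the fitted slope, and uses a hypothetical function name. The low_res and high_res arguments play the role of the resolution limits described in the list above.

```python
import math


def fit_relative_guinier(resolutions, frame_amps, average_amps,
                         low_res=20.0, high_res=None):
    """Fit ln(frame/average amplitude) against 1/d^2 by least squares.

    resolutions: resolution d of each shell in Angstroms.
    Only shells with high_res <= d <= low_res are used.
    Returns (bfactor, intercept), with bfactor = 4 * slope.
    """
    xs, ys = [], []
    for d, a_frame, a_avg in zip(resolutions, frame_amps, average_amps):
        if d > low_res:
            continue  # resolution too low for the linear region
        if high_res is not None and d < high_res:
            continue  # beyond the chosen high-resolution limit
        xs.append(1.0 / (d * d))
        ys.append(math.log(a_frame / a_avg))
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two shells inside the fitting range")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 4.0 * slope, intercept
```

Widening or narrowing the fitting range and re-running such a fit on the stored curves is a quick way to see how sensitive the estimated B-factor is to the chosen low-resolution limit.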

When submitting your job, make sure you have enough computer memory to perform the memory-intensive reconstructions of all movie frames. After the job finishes, you may want to look at the following results (assuming we used shiny as our output rootname, and Refine3D/run1_ct22_data.star as our input STAR file with the aligned movie frames):

  • Refine3D/run1_ct22_data_shiny.star has the fitted linear movement tracks for all particles. We visualise both these fitted tracks and the original movement tracks from the 3D movie-refinement (Refine3D/run1_ct22_data.star) by using this script and reading its output into xmgrace.
  • Refine3D/run1_ct22_data_shiny_frame0??_half?_class001_unfil.mrc are the half-set reconstructions for each of the movie frames (or the centers of the running averages of those). Inspecting these by displaying them in slices (through the Display button on the GUI) or by volume rendering in UCSF Chimera may be useful to confirm that no disasters have taken place. One may also run relion_postprocess on these half-maps.
  • Refine3D/run1_ct22_data_shiny_frame0??_guinier.star have the relative Guinier plots for each movie frame. Visualise these in your favourite 2D plotting program to confirm that fitting a straight line through them is a good thing.
  • Refine3D/run1_ct22_data_shiny_bfactors.star has the fitted B-factors (4 times the slope of the fitted line through the relative Guinier plot) and intercepts for each movie frame (or the centers of the running averages of several movie frames). Confirm that the behaviour is as expected: relatively large B-factors at the beginning and at the end of the movies, and decreasing intercepts throughout the movie. You do not want plots of B-factor or intercept versus movie-frame number to look very noisy. If they do, check the low-resolution limit of the fitting range, or consider using running averages of 3 or 5 movie frames to estimate these B-factors.
  • Refine3D/run1_ct22_data_shiny_relweights.star has a table for each movie-frame with the resolution-dependent relative weights. They may be useful to understand the effect of your fitted B-factors and Guinier-plot intercepts, but are not so useful in assessing whether your job has gone well or not.
  • ./shiny.star in the Project Directory is the STAR file with all the output polished particles. It points to all new MRC stacks that were created in your Particles/ directory. You can display this STAR file from the GUI to confirm that the particles look good, and then use it for subsequent classification. Even if one does not wish to re-classify the data, one should always re-run the 3D auto-refinement with the polished particles. This often leads to a significant improvement in resolution when compared to the reported resolution in the particle-polishing program itself. This improvement comes from the fact that the polished particles may be aligned better than the original ones, AND because in the 3D auto-refinement probability-weighted angular assignments are made, whereas the particle-polishing procedure only calculates reconstructions with each particle in its most likely orientation.
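
To build some intuition for how per-frame B-factors and intercepts translate into resolution-dependent relative weights, consider the sketch below. The functional form is an assumption made for illustration and is not taken from the RELION source: each frame gets a raw weight exp(intercept + B/4 * 1/d^2), and the raw weights are normalised so that they sum to one over all frames at each resolution. The function name relative_weights is hypothetical.

```python
import math


def relative_weights(bfactors, intercepts, resolution):
    """Relative weight of every frame at one resolution (in Angstroms).

    Assumed form: raw weight = exp(intercept + (B / 4) / d^2), normalised
    over frames so that the weights sum to 1.
    """
    s2 = 1.0 / (resolution * resolution)
    raw = [math.exp(c + 0.25 * b * s2)
           for b, c in zip(bfactors, intercepts)]
    total = sum(raw)
    return [r / total for r in raw]


if __name__ == "__main__":
    b = [-50.0, -200.0]   # second frame decays faster with resolution
    c = [0.0, 0.0]
    print(relative_weights(b, c, 20.0))
    print(relative_weights(b, c, 4.0))
```

In this toy example the second frame still carries almost half of the weight at 20 Angstroms, but less than a tenth at 4 Angstroms, which mirrors the statement that late frames may contribute to the low-resolution signal even when their high-resolution signal is gone.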

Finally, the continue old run button on the particle-polish job-type window has a special use. It doesn't actually (de-)activate any of the input options on the GUI itself, but when you select continue old run, the program will look for any of the output files listed above. If they are already present, then that step of the procedure will be skipped. This continues until an output file is found to be missing, and then that step is performed. For example, if all the half-reconstructions have already been made, but the _bfactors.star is missing, then it will re-calculate the B-factors. This may be useful if one has only changed the lowest resolution that should be used in this fitting. If the shiny.star is also deleted, then the new polished particles (with the new B-factors) will be re-calculated, but without re-calculating the (computation-intensive) individual-frame reconstructions. Another useful scenario is when the plots of the B-factors and intercepts against movie-frame number are very noisy. In that case, one could copy the _bfactors.star to _bfactors.star.original, and delete the shiny.star. One could then edit the _bfactors.star file and replace the noisy B-factors and/or intercepts with values that were, for example, obtained by fitting a polynomial through the originally estimated values. Thereby, the entire weighting procedure becomes easily adjustable by the user. If the continue old run button is not active (it says Start new run), then the polishing procedure will restart the calculation, ignoring and overwriting any output files that are already there.
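
As an illustration of the polynomial smoothing mentioned above, the following sketch fits a low-order polynomial through per-frame values (B-factors or intercepts) and returns the smoothed values. It is a minimal pure-Python example under stated assumptions: the function names are hypothetical, frames are numbered from 1, the default degree of 2 is an arbitrary choice, and reading the values from and writing them back into the _bfactors.star file is left to the user.

```python
def polyfit(xs, ys, degree=2):
    """Least-squares polynomial coefficients, lowest order first."""
    n = degree + 1
    # Augmented normal equations: sum(x^(i+j)) * c_j = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)]
         + [sum(y * x ** i for x, y in zip(xs, ys))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        if a[col][col] == 0:
            raise ValueError("not enough distinct points for this degree")
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def smooth_per_frame_values(values, degree=2):
    """Replace noisy per-frame values by a fitted polynomial."""
    frames = list(range(1, len(values) + 1))
    coeffs = polyfit(frames, values, degree)
    return [sum(c * f ** k for k, c in enumerate(coeffs)) for f in frames]


if __name__ == "__main__":
    noisy = [-310.0, -225.0, -200.0, -170.0, -185.0, -140.0, -130.0, -105.0]
    print([round(v, 1) for v in smooth_per_frame_values(noisy)])
```

Keeping the original file as _bfactors.star.original, as suggested above, makes it easy to compare the smoothed values with the estimates they replace.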