FAQs


Getting feedback

If you have a question, please check the FAQs and whether the topic has already been addressed on the CCP-EM mailing list. If not, use the CCP-EM list to send us your message. Please do not send us direct e-mails with general questions about RELION.

FAQs

General

Could I get code that is not yet in beta-testing?

No. We move from alpha-testing (strictly in-house only) to beta-testing (also external) only when the code is deemed stable enough.

How do you use RELION?

Please go through our RELION tutorial. We also provide Recommended procedures, but some of that information may be outdated.

How can I ask a question that is not dealt with in the FAQs?

If you have a question, please first check whether the topic has been addressed already on the CCP-EM mailing list. If not, please use this list (and not a direct e-mail) to send us your message.

Program crashes showing MPI runtime errors

Probably you are using a version of MPI runtime different from the one used during compilation. This is most likely caused by activating an environment for other software packages. Please see the relevant section in the installation guide.

Installation

Installation fails while linking

Probably your MPI compiler and/or CUDA compiler are using a C(++) compiler different from what CMake has picked up. Please see the relevant section in the installation guide.

Installation fails at "make install"

Do you have write access to the destination directory?

Are you trying to install into the build directory itself? In this case, you should not run make install. Look at the relevant section in the installation guide.

Import

What is the right way to import files outside the project directory?

To avoid errors in downstream processing, follow these three rules.

  1. Do not give absolute paths to RELION.
  2. Do not import anything outside the project directory.
    You need a symbolic link to refer to files outside the project directory.
  3. Symbolic links to somewhere outside the project directory must be by an absolute path.

The cases below explain why these rules are important.

Case 1 (The intended use case)
Suppose your project is in /data/project and movies are /data/project/Movies/*.tif. They should be imported by a path relative to the project directory, Movies/*.tif, so that movie.star will contain Movies/001.tif etc. This relative path is used for all subsequent jobs to make files, such as Extract/jobXXX/Movies/001.mrcs.
Case 2 (Why rule 1 matters)
If you import files by absolute paths, movie.star will contain /data/project/Movies/001.tif etc. Subsequent jobs will then try to create Extract/jobXXX//data/project/Movies/001.mrcs and fail. Notice the double "/". This is accepted in some places but not always.
Case 3 (Why rule 2 matters)
Next, suppose your movies are in /storage/data/*.tif. You might be tempted to import them by a relative path, ../../storage/data/*.tif, but this also causes a mess. movie.star will contain ../../storage/data/001.tif and subsequent jobs will create Extract/jobXXX/../../storage/data/001.mrcs, which is equivalent to /data/project/storage/data/001.mrcs. In other words, files are written outside the job directory. If you have multiple Extract jobs, they will overwrite each other!
Case 4
If you make a symbolic link at /data/project/Movies that points to /storage/data by an absolute path, everything works in the same way as case 1.
Case 5 (Why rule 3 matters)
If you make the link by a relative path instead (/data/project/Movies pointing to ../../storage/data), this causes a problem. To resolve aliases (e.g. Extract/256px into Extract/job010), RELION expands symbolic links in some places and you end up in the same situation as case 3. To prevent this, RELION does not expand symbolic links whose target starts with "/".
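The rules can be tried out with plain shell tools before touching real data. The following is a minimal sketch, assuming a throwaway directory tree stands in for /data/project and /storage/data. The job name job001 and the string concatenation that mimics how output names are built are illustrative assumptions, not RELION's actual code.

```shell
# Throwaway sandbox standing in for the real file system (hypothetical layout).
sandbox=$(mktemp -d)
mkdir -p "$sandbox/storage/data" "$sandbox/data/project"
touch "$sandbox/storage/data/001.tif"
cd "$sandbox/data/project"

# Rules 2 and 3: reach the outside directory through a symbolic link
# whose target is an ABSOLUTE path.
ln -s "$sandbox/storage/data" Movies

# Rule 1: refer to the movies by a path relative to the project directory.
ls Movies/*.tif

# Mimic how a job derives its output name from the imported path.
job_dir=Extract/job001
movie_rel=Movies/001.tif
movie_abs=/data/project/Movies/001.tif

good_output="$job_dir/${movie_rel%.tif}.mrcs"
bad_output="$job_dir/${movie_abs%.tif}.mrcs"

echo "$good_output"   # prints Extract/job001/Movies/001.mrcs
echo "$bad_output"    # prints Extract/job001//data/project/Movies/001.mrcs
```

The second echo reproduces the double slash from case 2, while the first gives the well-formed name from case 1.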

AutoPick

AutoPicker picks too much junk along micrograph edges

Try --skip_side N as an additional argument. This tells the program to keep N extra pixels (apart from half the particle_size) away from the edge of the micrograph.

Preprocessing

How can I group micrographs in order to have more particles per group?

You can add a column called rlnGroupName to your particles.star file. Unique rlnGroupName values define unique groups. To make this easier, there is a semi-automated grouping procedure. Note that from version 1.3 onwards, RELION includes an image display program that can also be used to regroup particles more conveniently: select a model.star file from a previous 2D or 3D classification/refinement run, tick the "Regroup selected particles in number of groups: X" option, and provide the desired number of groups (X).

What are groups for anyway?

If you're going to group micrographs together, it is important to understand what groups are used for inside RELION. Although you may be familiar with the concept of defocus groups in other packages, RELION groups are NOT the same. Each particle has its own (possibly astigmatic) CTF model, and this is not affected by groups. Instead, for all particles inside each group RELION will estimate an average power spectrum for the noise (rlnSigma2Noise), as well as an average intensity scale factor. RELION will warn against using small numbers of particles inside a group, because these averages may become unstable. That in turn may lead to crashed runs that report errors like the sum of weights for certain particles being zero, or the scale factor being a strange number. As mentioned above, you may prevent small groups by grouping micrographs together, for example by using a semi-automated grouping procedure.

Can I also use the estimated CTF parameters from XMIPP3?

In principle yes. The conventions are the same as those in RELION. However, there is currently no script to do this automatically. In our experience, (re-)estimating CTF parameters in CTFFIND4 is sufficiently fast and robust.

How can I merge datasets, potentially with different pixel sizes?

Suppose you want to bring polished (shiny) particles from the dataset2 project into the dataset1 project.

  1. Go to the Polish directory within the dataset1 project.
  2. Make a symbolic link to dataset2's Polish/jobXXX by a full path. A link by a relative path sometimes causes unexpected problems in RELION when it tries to expand paths.
  3. Make sure the optics group name in shiny.star from dataset2 is different from that of dataset1.
  4. Join shiny.star from dataset1 and dataset2. You don't have to import; just go above the .Nodes directory to see the file.
  5. Now the file is ready for refinement.

See also pixel size issues when merging datasets with different pixel sizes or box sizes.

If you are unlucky and the job number jobXXX is already present in the Polish directory of the dataset1 project, you have to make a link by another name. In this case, you have to replace the file names in shiny.star.
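The file-name replacement mentioned above can be done with sed. This is a minimal sketch, assuming dataset2's Polish/job005 was linked under the hypothetical name job105 and using a two-line stand-in for the real shiny.star; both job numbers and the image names are made up for illustration.

```shell
# Stand-in for dataset2's shiny.star (hypothetical image names, data rows only).
workdir=$(mktemp -d)
printf '%s\n' \
  '000001@Polish/job005/Movies/001_shiny.mrcs 1' \
  '000002@Polish/job005/Movies/001_shiny.mrcs 1' > "$workdir/shiny.star"

# Point every particle at the new link name; "|" is used as the sed
# delimiter so that the slashes in the paths need no escaping.
sed 's|Polish/job005/|Polish/job105/|g' "$workdir/shiny.star" \
  > "$workdir/shiny_renamed.star"

cat "$workdir/shiny_renamed.star"
```

Writing to a new file rather than editing in place keeps the original available if the pattern turns out to match more than intended.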

You can do similar things for other job types as long as you make all relevant paths consistent. For example, to run Polish in a new project directory, you have to make sure the paths to raw movies, the gain reference and the motion correction job are valid as well as paths to the particles and the references.

Where can I find MTF curves for typical detectors?

MTF curves for the following detectors are available in the data directory of the RELION distribution.

  • Falcon 2 at 300kV
  • Falcon 3 in electron-counting mode at 200kV
  • Falcon 3 in electron-counting mode at 300kV
  • Falcon 4 in electron-counting mode at 200kV
  • Falcon 4 in electron-counting mode at 300kV
  • DE-20 at 300kV
  • K2-summit at 300kV

Gatan provides MTF files for K2 and K3 at 200 and 300 kV.

Note that MTF depends on how electrons are rendered. The Falcon4 MTF files above are for MRC rendering by EPU. For EER, MTF might be different. Fortunately, MTF does not change the resolution of your structure.

Classification

Do you have an example of how to run 3D classification?

Yes, see the Classification example.

3D refinement

How can I make a plot of the orientational distribution of my particles?

RELION will output .bild files that can be loaded into UCSF Chimera directly.

I have run RELION but get a nonsense map as a result, e.g. a spherical blob

Make sure the following things are correct:

  • You have normalized all particles to zero-mean background with a standard deviation in the noise of one.
  • Your STAR file header is correct: each label corresponds to the correct column
  • You have used the correct pixel size
  • There are no particles with very large or very small pixel values.
  • You have indicated the starting map is not on the correct greyscale (if the map does not come from RELION itself or XMIPP)

(The first three items on this list are all taken care of if you use the PreProcessing GUI.)

Upon restarting I get an error: incorrect table model_group_xx

Make sure all groups have at least 20-50 particles in them. The error above is likely due to an empty group for one of the two independent halves of the data. Note you can join multiple micrographs into one group by giving them the same rlnMicrographName in the input STAR file. If you do this, try to join micrographs with similar defocus values and similar apparent signal-to-noise ratios.

The resolution of the RELION output map is lower than I expected

There are two answers to this. Firstly, if your expectations are based on refinement in a different program, and that program does not strictly prevent overfitting, then it might be that your expectations are wrong: perhaps the other program overfitted your data and therefore has given you a false high-resolution estimate. Secondly, we have now observed that RELION slightly underestimates resolution. However, you may still get your high-resolution map as explained on the Analyse results page. Please write to us if you genuinely believe RELION has done a bad job at refining your structure. Perhaps we may learn how to improve RELION from your case.

What should I do if my pixel size turned out to be wrong?

See Pixel_size_issues.

The FSC curve shows oscillations and/or bumps

Typically, this is caused by a wrong CTF model. For example, you have a large beam tilt (try CtfRefine with beam-tilt refinement enabled). Another possibility is that your pixel size is wrong. See Pixel_size_issues for details.

How can I create soft masks for refinement?

  1. Go to the Mask creation job. Set Lowpass filter map to 15 Å, both Extend binary map this many pixels and Add a soft-edge of this many pixels to 0 px. Set Initial binarisation threshold to 0.005.
  2. Run the job. This is very fast because it performs only low pass filtering and binarization. "Extend" and "soft-edge" take time.
  3. Open the resulting mask in RELION's image viewer. Keep the dialogue box open.
  4. Examine the slices.
  5. If you are not satisfied, adjust the threshold and press Alt-O. This changes the Continue! button into Overwrite!. Press it.
  6. Re-examine the slices. To do this, just press the Display button in the dialogue box you left open in step 3.
  7. Repeat steps 2 to 6 until you are happy.
  8. Change Extend binary map this many pixels to 3 px, Add a soft-edge of this many pixels to about 5 Å (you have to calculate the pixels). Set the Number of threads to the number of cores.
  9. Re-run and examine the slices as in steps 5 and 6.
  10. Run a PostProcess job using this new mask. Does the red curve (phase randomized FSC) go to zero well before the resolution limit? If so, the mask is ready. Otherwise, increase the "soft edge" by 1 or 2 px and repeat.
  11. Finally, open the map and the mask in a molecular viewer like PyMOL or Coot and confirm the mask encloses everything you need at least at the threshold of 0.5.
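The pixel count for the soft edge in step 8 is just the width in Å divided by the pixel size. A minimal sketch, assuming a hypothetical pixel size of 1.06 Å/pixel and rounding up so that the edge is never narrower than requested (the rounding direction is a choice made here, not a RELION requirement):

```shell
angpix=1.06            # pixel size in Å/pixel (hypothetical value)
soft_edge_angstrom=5   # desired soft-edge width in Å

# pixels = width / pixel size, rounded up to the next whole pixel
soft_edge_px=$(awk -v a="$soft_edge_angstrom" -v p="$angpix" \
  'BEGIN { n = a / p; r = int(n); if (r < n) r++; print r }')

echo "$soft_edge_px"   # prints 5
```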

Refinement or Classification crash saying "No orientation was found"

In refinement and classification, observed particle images are compared with projections of the reference over various rotations, shifts and classes. When none of the candidates yields a good match, the program crashes saying:

No orientation was found as better than any other

There are many reasons for this problem. Common causes and workarounds are:

  • Particles are not normalized properly. You should always normalize particles in a particle extraction job. The diameter of the background circle should be more-or-less consistent with the mask diameter used in later steps. Note the units! The former is in pixels before down-sampling, while the latter is in Angstrom. If you leave this field at -1 in extraction, it will be 75 % of the box size.
  • The reference map is not on absolute greyscale. In this case, you have to say "No" to the corresponding question in the Reference tab. Saying "no" when it is actually on absolute greyscale is harmless. So always say "no" when in doubt.
  • CTF parameters are extremely off. For example, you made a typo in the pixel size, amplitude contrast and/or Cs. Also make sure you choose the right answer to Invert contrast?. This should be "Yes" for cryo-EM.
  • You have too few particles per (noise-estimating) group. Note that this is different from optics groups. See the corresponding question.
  • The reference map is not right.
  • You have too many outlier particles. High contrast outliers such as detector defects, contamination of ice, ethane droplets, gold particles sometimes cause this issue. Optimize AutoPick parameters and run Class2D first. If Class2D fails, try again with Ignore CTF until first peak: Yes.
  • Particles are not well centered relative to the reference. In this case, even if you make the initial offset range larger, RELION might not be able to find the right center, because the initial standard deviation of the translation prior distribution is 10 angstroms. You can change this to X angstroms by the --offset X argument.

As a last resort, you can add --failsafe_threshold N as an additional argument to the job. This allows the program to ignore offending particles up to N times. The default is 40. This option is only available in the GPU mode.
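Because the background circle is given in pixels and the mask diameter in Angstrom, it helps to convert one into the other when checking the first cause above. A minimal sketch, assuming a hypothetical 256-pixel box at 1.06 Å/pixel (both measured before down-sampling) and the 75 % default:

```shell
box_px=256     # extraction box size in pixels, before down-sampling (hypothetical)
angpix=1.06    # original pixel size in Å/pixel (hypothetical)

# Default background circle: 75 % of the box size, in pixels
bg_diameter_px=$(awk -v b="$box_px" 'BEGIN { print int(0.75 * b) }')

# The same diameter in Å, for comparison with the mask diameter
bg_diameter_A=$(awk -v d="$bg_diameter_px" -v p="$angpix" \
  'BEGIN { printf "%.1f", d * p }')

echo "background circle: $bg_diameter_px px = $bg_diameter_A Å"
```

With these numbers the background circle is 192 pixels, or about 203.5 Å, which is the value to compare against the mask diameter used in later jobs.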

Multi-body refinement

Do you have more detailed documentation on multi-body refinement?

Yes.

Multi-body refinement is not described in the tutorial. Instead, we have a separate Step-by-step protocol for multi-body refinement.

CTF Refinement

Beamtilt, trefoil and/or 4th order aberration estimation produces NaNs

This sometimes happens when RELION is compiled with CPU single precision mode (-DDoublePrec_CPU=OFF). We are working to fix this issue. Meanwhile, please use double precision for CPU (the default).

How can I process datasets collected with image shifts?

Because beam tilt (axial coma) induced by image shift depends on the image shift direction and magnitude, you should refine beam tilt per image-shift position. See the "Beam tilt correction" section and Fig. 4 supplement 1 in our RELION 3.0 paper. This is done by grouping particles according to the position (this group has nothing to do with image grouping for noise estimation). Typically, we group particles per hole in the acquisition template. That is, if you collect 3x3 holes per stage shift, you should divide particles into 9 groups. If you collect more than one micrograph from a hole, you may divide them further, but this is probably not necessary since the beam tilt difference between positions within a hole is small. Having fewer particles within a group might lead to less accurate beam tilt estimation.

To group particles in RELION 3.0, you have to add an rlnBeamTiltClass column to the particle STAR file. First, add it at the end of the header block:

_rlnNrOfSignificantSamples #22 
_rlnRandomSubset #23
_rlnBeamTiltClass #24

Note that the number might be different. Change it accordingly.

Next, assign a beam tilt class ID to each particle. In one case from SerialEM, the file names looked like movie36-3-un_108-10_0002_Feb11_17.46.35.tif, where the last digit of the _0002_ field identified the position.

gawk 'NF>20 {match($10, /_000(.)_/, arr); $24=1+arr[1]} {print}' Refine3D/job123/run_data.star > Refine3D/job123/run_data_grouped.star

Here, NF>20 restricts the edit to lines with more than 20 fields (i.e. it leaves the header untouched). $10 is the column number of rlnMicrographName (of course this depends on the file). $24 is the column number of rlnBeamTiltClass. /_000(.)_/ looks for the pattern _000._ (a dot means any single character), and the sub-string matched inside the parentheses is stored in arr[1]. One is added to make the class 1-indexed rather than 0-indexed. The three-argument form of match is a gawk extension; for details, see the gawk manual.

RELION 3.1 does not have rlnBeamTiltClass; the concept was generalised to Optics Group. In the beginning, you have only one optics group.

data_optics

loop_ 
_rlnOpticsGroup #1 
_rlnOpticsGroupName #2 
_rlnAmplitudeContrast #3 
_rlnSphericalAberration #4 
_rlnVoltage #5 
_rlnImagePixelSize #6 
_rlnMicrographOriginalPixelSize #7 
_rlnImageSize #8 
_rlnImageDimensionality #9 
           1  dataset1     0.100000     2.700000   300.000000     1.000000     1.000000          140            2

Duplicate lines to make nine groups. Don't forget to change rlnOpticsGroup and rlnOpticsGroupName.

data_optics

loop_ 
_rlnOpticsGroup #1 
_rlnOpticsGroupName #2 
_rlnAmplitudeContrast #3 
_rlnSphericalAberration #4 
_rlnVoltage #5 
_rlnImagePixelSize #6 
_rlnMicrographOriginalPixelSize #7 
_rlnImageSize #8 
_rlnImageDimensionality #9 
           1  position1     0.100000     2.700000   300.000000     1.000000     1.000000          140            2
           2  position2     0.100000     2.700000   300.000000     1.000000     1.000000          140            2
           3  position3     0.100000     2.700000   300.000000     1.000000     1.000000          140            2
           4  position4     0.100000     2.700000   300.000000     1.000000     1.000000          140            2
           5  position5     0.100000     2.700000   300.000000     1.000000     1.000000          140            2
           6  position6     0.100000     2.700000   300.000000     1.000000     1.000000          140            2
           7  position7     0.100000     2.700000   300.000000     1.000000     1.000000          140            2
           8  position8     0.100000     2.700000   300.000000     1.000000     1.000000          140            2
           9  position9     0.100000     2.700000   300.000000     1.000000     1.000000          140            2

Now you can use gawk similar to above and assign rlnOpticsGroup to each particle.
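One way to do this without hard-coding column numbers is to read them from the header of the data_particles block. The following is a sketch only, written for plain awk, assuming the same _000N_ naming pattern as above and a tiny stand-in STAR file; the file names, the three-group optics table and the column layout are hypothetical.

```shell
workdir=$(mktemp -d)

# Tiny stand-in for a RELION 3.1 particle STAR file (hypothetical content).
cat > "$workdir/particles.star" <<'EOF'
data_optics

loop_
_rlnOpticsGroup #1
_rlnOpticsGroupName #2
1 position1
2 position2
3 position3

data_particles

loop_
_rlnMicrographName #1
_rlnOpticsGroup #2
_rlnDefocusU #3
MotionCorr/job002/movie36_0000_Feb11.mrc 1 10000.0
MotionCorr/job002/movie36_0002_Feb11.mrc 1 12000.0
EOF

awk '
  # Track which data block we are in; only data_particles is edited.
  /^data_/ { in_particles = ($1 == "data_particles"); print; next }
  # Header lines look like "_rlnLabel #N": remember N for each label.
  in_particles && /^_rln/ { col[$1] = substr($2, 2) + 0; print; next }
  # Data rows: take the digit after "_000" and make it 1-indexed.
  in_particles && NF > 2 {
    name = $(col["_rlnMicrographName"])
    if (match(name, /_000[0-9]_/))
      $(col["_rlnOpticsGroup"]) = 1 + substr(name, RSTART + 4, 1)
  }
  { print }
' "$workdir/particles.star" > "$workdir/particles_grouped.star"

cat "$workdir/particles_grouped.star"
```

The data_optics table is passed through unchanged, so it still has to list every group that the particles end up referring to, as in the nine-line table above.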

Computational issues

I am buying new GPUs, what do you recommend to run RELION on?

Our collaborator in Stockholm, Erik Lindahl, has made a useful blog with GPU hardware recommendations. Briefly, you'll need an NVIDIA GPU with a CUDA compute capability of at least 3.5, but you don't need the expensive double-precision NVIDIA cards, i.e. the high-end gamer cards will also do, but do see Erik's blog for details!

Note that 3D auto-refine will benefit from 2 GPUs, while 2D and 3D classification can be run just as well with 1 GPU. Apart from your GPUs you'll need a decent amount of RAM on the CPU (at least 64 GB), and you may also benefit from a fast scratch disk (e.g. a 400 GB SSD), especially if your working directories will be mounted over the network connecting multiple machines.

New CPUs and GPUs are released every year, so the most cost-efficient setup keeps changing. See discussions on the CCP-EM and 3DEM mailing lists. For example, a PDF guide by Shintaro Aibara is useful.

How do I use my GPUs?

There is a good description of how to use your GPUs in our tutorial and in Accelerated RELION, using GPUs or CPU-vectorization.

I am buying a new cluster, what do you recommend to run RELION on?

Please look at Accelerated RELION, using GPUs or CPU-vectorization.

TODO: below is outdated; we will re-write soon.

This will of course depend on how much money you are willing to spend, and what kind of jobs you are planning to run. RELION is memory-intensive. Fortunately, its hybrid parallelisation allows it to make use of modern clusters that consist of many multi-core nodes. In this set-up, MPI parallelisation provides scalability across the many nodes, while threads share the memory available on each node without leaving its multiple cores idle. Therefore, as long as each node has sufficient memory in total, one can always run multiple threads (and only one or a few MPI processes) on each node, and RAM/node is probably a more important feature than RAM/core. The bigger the size of the boxed particles, the higher the RAM usage. For our high-resolution ribosome refinements (in boxes of ~400x400 pixels) we use somewhere between 15 and 25 GB of RAM per MPI process (the most expensive part in terms of RAM is the last iteration, which is done at the full image scale). We have 12-core nodes with 60 GB of RAM in total, so we can run 2 MPI processes on each node. If you're planning to do atomic-resolution structures, we wouldn't recommend buying anything that has less than 32 GB per node. Having 64 GB or more will probably keep your cluster up-to-date for longer. How many of those nodes you buy will then probably depend on your budget (and possibly cooling limitations). We do 3.x Angstrom ribosome reconstructions from, say, 100-200 thousand particles in approximately two weeks using around 200-300 cores in parallel. Using more cores in parallel (e.g. 1,000) may cause serious scalability issues.

My runs keep crashing at seemingly random points in the refinement

We noticed similar problems on our Dell cluster. Things became much more stable when we switched off TCP segmentation offloading, by using the command mentioned on this linux page.

Please note that due to limited resources we cannot provide support related to high-performance computing issues. If your jobs seem to die at seemingly random points, please do not email us, but speak to your system administrator. RELION makes use of some pretty intensive high-performance computing, and setting this up satisfactorily is not always straightforward.

Although I ask for more threads, my MPI processes only take 100% or 200% CPU in top

If you are using OpenMPI compiled with NUMA locking support, MPI processes get bound to their assigned cores automatically. This is to prevent context switches and cache misses. You can use `mpirun --bind-to none` to let each MPI process use multiple cores.

How can I minimise computational requirements?

See this page for an explanation about the computational requirements. Understanding these may help you make more informed decisions on how to minimise running costs. More information is also given in the 2012 JSB paper.

My calculations take forever, what can I do?

Take smaller data sets of better quality. Some people routinely collect data sets with millions of particles. This may not always be the most efficient route to a high-resolution structure. We prefer to carefully collect relatively small data sets of very high quality. This reduces computational loads and leads to relatively clean data sets to start with. If your reconstruction requires hundreds of thousands of particles (from a direct-electron detector) to get to high resolution, then something else is probably wrong. The best reason to collect millions of particles would be if there is a very large degree of structural heterogeneity in your sample. In that case: you'd better be prepared to sweat anyway. ;-)

Other questions

How can I change the pixel size and the box size of a reference and a mask?

The following command will rescale (in Fourier space) input.mrc from 1.0 Å/pix to 1.5 Å/pix, and put the new volume in a box 200 pixels wide.

relion_image_handler --angpix 1.0 --rescale_angpix 1.5 --new_box 200 --i input.mrc --o output.mrc

If you want to rescale a mask, you need to specify --threshold_above 1 --threshold_below 0 as well, because re-sampling makes some pixel values slightly outside of the acceptable range [0, 1].
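Putting the two together for a mask might look like the sketch below. It assumes a hypothetical 300-pixel mask at 1.0 Å/pix named mask.mrc and picks the new box so that the physical width stays the same (old box × old pixel size = new box × new pixel size), rounded to an even number; both the file names and the even-size rounding are assumptions of this example. The command is only printed, so it can be inspected before it is run.

```shell
old_angpix=1.0   # current pixel size in Å/pix
new_angpix=1.5   # target pixel size in Å/pix
old_box=300      # current box size in pixels (hypothetical)

# Keep the physical width: new_box = old_box * old_angpix / new_angpix,
# rounded to the nearest integer and bumped to an even number if odd.
new_box=$(awk -v b="$old_box" -v o="$old_angpix" -v n="$new_angpix" \
  'BEGIN { x = int(b * o / n + 0.5); if (x % 2) x++; print x }')

# Assemble the mask-rescaling command (mask.mrc and mask_rescaled.mrc are
# hypothetical file names); the two thresholds clamp values back into [0, 1].
cmd="relion_image_handler --angpix $old_angpix --rescale_angpix $new_angpix --new_box $new_box --threshold_above 1 --threshold_below 0 --i mask.mrc --o mask_rescaled.mrc"

echo "$cmd"
```

For these numbers the new box comes out at 200 pixels, matching the example command above.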