Minimise computational costs


This page explains where the computational costs of RELION arise, so that users have the information they need to minimise the costs of their MAP refinements.

Expectation

The expectation step is the "alignment" step: here each experimental image is compared to projections of the reference map in all orientations. Consequently, this is the most expensive step in terms of CPU. CPU costs increase with increasingly fine angular or translational sampling rates, i.e. linearly with the number of orientations sampled (see NrHiddenVariableSamplingPoints in the stdout file). If classifying, the CPU costs also increase linearly with the number of references used.

In terms of memory (RAM), the expectation step may also be quite costly, in particular if large images are used. The scaling behaviour is somewhat complicated, as more data is kept in memory as the resolution increases. For Niko's recoated rotavirus data, we used 2x down-sized images of 400x400 pixels, which still fitted into our 8x2Gb machines; using the original 800x800 pixel images did not.
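As a back-of-envelope illustration of this scaling, the Python sketch below assumes CPU cost is simply proportional to the quantities above, and that images are held in memory as double-precision values. The function names, constants and particle counts are illustrative assumptions, not RELION internals.

```python
# Back-of-envelope scaling of the expectation step, following the
# proportionalities described above. Everything here is illustrative.

def relative_cpu_cost(n_particles, n_sampling_points, n_classes=1):
    """CPU cost scales linearly with the number of sampling points
    (NrHiddenVariableSamplingPoints in the stdout file) and, when
    classifying, linearly with the number of references."""
    return n_particles * n_sampling_points * n_classes

def image_memory_gb(n_images, box_size, bytes_per_pixel=8):
    """Rough RAM estimate for keeping n_images of box_size x box_size
    pixels in memory as double-precision floats (an assumption)."""
    return n_images * box_size**2 * bytes_per_pixel / 1024**3

# Halving the pixel dimensions (800x800 -> 400x400) cuts the
# per-image footprint roughly four-fold:
print(image_memory_gb(10000, 400))  # ~11.9 Gb
print(image_memory_gb(10000, 800))  # ~47.7 Gb
```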

If you ever get the following error in your stderr file: Allocate: No space left, then you know you have run out of memory. In that case, first check whether you are running the intended number of MPI jobs on each node. You can monitor memory usage and the number of MPI jobs on your nodes by logging into them and using the "top" command. Often it takes some work to set up your job submission system for handling hybrid parallelization, i.e. jobs that use both MPI and threads. See the installation page for more details on how to do this.
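As a minimal sketch of such a check, the snippet below reads MemAvailable from /proc/meminfo (Linux-only, recent kernels) and compares it against a memory estimate you supply yourself, e.g. as computed above. It is merely a convenience around the same information that "top" shows, not part of RELION.

```python
# Check whether a node has enough free memory before launching a job.
# The required_gb figure is your own estimate; all names are illustrative.

def available_memory_gb():
    """Parse MemAvailable from /proc/meminfo (reported in kB)."""
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) / 1024**2
    raise RuntimeError("MemAvailable not found in /proc/meminfo")

required_gb = 12.0  # illustrative estimate for your refinement
if available_memory_gb() < required_gb:
    print("Warning: likely to hit 'Allocate: No space left' - consider "
          "smaller images or fewer MPI processes per node.")
```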

The total wallclock time needed to run the expectation step may be greatly reduced using parallel computing. This has been implemented at two different levels: MPI (message passing interface) is used to communicate between different computing nodes (separate computers connected to each other over a network), while so-called threads are used to parallelize tasks among the multiple cores of modern multi-core computers. Threads have the advantage of sharing the memory of one computer, so that memory does not need to be replicated for each thread. MPI has the advantage of scalability: one can always buy more computers and link them together in a larger cluster, whereas there is a limit to the number of cores available in a single computer.

The recommended way to run RELION (in particular for 3D refinements, where memory requirements are larger than in 2D) is to use as many threads as there are cores on your nodes, and then run one MPI process on each node. For the 3D auto-refine option, be aware that the two independent half data sets are refined on two halves of the slaves, while a single master node directs everything. Therefore, it is most efficient to use an odd number of nodes, and the minimum number of nodes to use is 3; a sketch of this layout follows below.
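The Python sketch below illustrates this recommended hybrid layout. The relion_refine_mpi executable and its --j (threads) argument exist in RELION, but the exact invocation depends on your build and queueing system, so treat the printed command line (and its trailing "...", standing in for the remaining arguments) as indicative only.

```python
# Sketch of the recommended hybrid layout: one MPI process per node and as
# many threads as there are cores per node. Illustrative only; check your
# own installation and job submission system for the exact invocation.

def hybrid_layout(n_nodes, cores_per_node, auto_refine_3d=True):
    """Return an indicative command line for one MPI process per node,
    each running cores_per_node threads (RELION's --j argument)."""
    if auto_refine_3d:
        # One master plus two half-sets of slaves: at least 3 processes,
        # and an odd total keeps the two half-sets equally sized.
        if n_nodes < 3:
            raise ValueError("3D auto-refine needs at least 3 MPI processes")
        if n_nodes % 2 == 0:
            print("Note: with an even number of nodes the two half-sets of "
                  "slaves are unequal, so one node is used inefficiently.")
    return "mpirun -n {} relion_refine_mpi --j {} ...".format(
        n_nodes, cores_per_node)

print(hybrid_layout(5, 8))
# mpirun -n 5 relion_refine_mpi --j 8 ...
```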

Maximization

The maximization step is the "reconstruction" step. This step is typically much faster than the expectation step, but it is not parallelized very well: the only parallelization implemented is that multiple reconstructions (e.g. in the case of classification, or the two independent reconstructions for gold-standard FSCs) are performed in parallel. Threaded FFTs through the FFTW library yielded limited speed-ups in release 1.1, but this implementation was removed from release 1.2 due to instabilities.
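To make this limited parallelism concrete, the small sketch below counts how many independent reconstructions can run concurrently; the function is purely illustrative and not part of RELION.

```python
# The usable parallelism in the maximization step is capped by the number
# of independent reconstructions, as described above. Illustrative only.

def parallel_reconstructions(n_classes, gold_standard=False):
    """One reconstruction per class, doubled when two independent
    half-set maps are computed for gold-standard FSCs."""
    return n_classes * (2 if gold_standard else 1)

print(parallel_reconstructions(4))                      # 4-class 3D classification
print(parallel_reconstructions(1, gold_standard=True))  # 3D auto-refine: 2
```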