Classification example
Download data and reference
Download the test data set comprising 10,000 ribosome images as deposited by Joachim Frank at the EBI-EMDB from here. The corresponding metadata is stored in this PDF file. Save both files in your working directory.
We will use EMDB entry 1056 as a reference; save this file in the same working directory. Note that this reference has the same pixel size (2.8 Å) and the same box size (130x130 pixels) as the data set, so no re-scaling or windowing operations are necessary.
Unpack the data as follows:
tar -xf J-Frank_70s_real_data.tar
gunzip emd_1056.map.gz
mv emd_1056.map emd_1056.mrc
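To confirm the box size of the unpacked reference, one can read the first three 32-bit integers of the MRC header, which hold the number of columns, rows, and sections. The sketch below uses the POSIX od utility and assumes that both the file and the machine are little-endian; the function name mrc_dims is hypothetical and not part of RELION.

```shell
#!/bin/sh
# mrc_dims FILE: hypothetical helper, not part of RELION.
# Prints the first three 32-bit integers of the MRC header (NX NY NZ).
# Assumption: the file and the host are both little-endian.
mrc_dims() {
    # -A n: no offsets, -t d4: signed 4-byte integers, -N 12: first 12 bytes only
    od -A n -t d4 -N 12 "$1" | awk '{ print $1, $2, $3 }'
}
```

Usage: mrc_dims emd_1056.mrc. If the box size is 130 as stated above, this should print 130 for each of the three dimensions.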
Prepare the input STAR file
From the PDF provided by Joachim Frank, select the following 14 lines and save them in a text file called defocus.dat:
1 3 1347.0 1347.0 21580.
2 3 505.00 1852.0 24833.
3 3 989.00 2841.0 26450.
4 3 857.00 3698.0 28320.
5 3 475.00 4173.0 30993.
6 3 349.00 4522.0 33150.
7 3 478.00 5000.0 34588.
8 3 1242.0 6242.0 21580.
9 3 713.00 6955.0 24833.
10 3 1255.0 8210.0 26450.
11 3 1022.0 9232.0 28320.
12 3 304.00 9536.0 30993.
13 3 232.00 9768.0 33150.
14 3 232.00 10000. 34588.
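The make_star.csh script reads the third column of defocus.dat as the number of images in each group and the fourth column as the running total, so a copy-and-paste error in either column would assign images to the wrong defocus value. A quick consistency check can catch this before the STAR file is built. The following is a minimal sketch in POSIX shell with awk; the function name check_defocus is hypothetical and not part of RELION.

```shell
#!/bin/sh
# check_defocus FILE: hypothetical helper, not part of RELION.
# Assumption: column 3 = images per group, column 4 = running total.
check_defocus() {
    awk '
        NF == 0 { next }                  # skip blank lines
        { n++; sum += $3 }                # accumulate the per-group counts
        sum != $4 + 0 {                   # compare with the listed running total
            printf "line %d: running total %d, expected %d\n", NR, $4, sum
            bad = 1
        }
        END {
            if (!bad) printf "OK: %d groups, %d images\n", n, sum
            exit bad + 0                  # non-zero exit status on any mismatch
        }
    ' "$1"
}
```

Usage: check_defocus defocus.dat. For the 14 lines listed above, the per-group counts add up to 10000, so the check should report 14 groups and 10000 images.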
Then save the following lines as a file called make_star.csh:
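#!/usr/bin/env csh
# List all windowed particle images, one file name per line
ls -l win/*dat | awk '{print $NF}' >imagelist
# Write the STAR loop header; the label order must match the print statement below
relion_star_loopheader rlnImageName rlnMicrographName rlnDefocusU rlnVoltage rlnSphericalAberration rlnAmplitudeContrast > all_images.star
# Number of defocus groups, one per line of defocus.dat
set ngr = 14
set gr = 0
while ($gr < $ngr)
  @ gr++
  # Read the current line of defocus.dat: column 3 is the number of images
  # in this group, column 4 the running total, column 5 the defocus
  set nn=`head -n $gr defocus.dat | tail -1 | awk '{print int($3)}'`
  set tot=`head -n $gr defocus.dat | tail -1 | awk '{print int($4)}'`
  set def=`head -n $gr defocus.dat | tail -1 | awk '{print $5}'`
  # Take the last nn of the first tot images, which are the images of this group,
  # and append one row each: group number as micrograph name, 200 kV, Cs 2 mm, amplitude contrast 0.1
  head -n ${tot} imagelist | tail -n ${nn} | awk -v"def=$def" -v"gr=$gr" '{print $1, gr, def, 200, 2, 0.1}' >> all_images.star
end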
Execute the script to generate the input STAR file, which contains all image names and CTF information:
csh make_star.csh
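As a quick check of the result, one can count the data rows in all_images.star, both overall and per defocus group. The sketch below assumes that data rows are exactly the lines with six whitespace-separated fields, as written by the print statement in make_star.csh, and that header lines have fewer fields. The function name star_counts is hypothetical and not part of RELION.

```shell
#!/bin/sh
# star_counts FILE: hypothetical helper, not part of RELION.
# Assumption: data rows are the lines with exactly six fields
# (image, micrograph, defocus, voltage, Cs, amplitude contrast).
star_counts() {
    awk '
        NF == 6 {
            total++
            if (!($2 in count)) order[++n] = $2   # remember first-seen order of groups
            count[$2]++
        }
        END {
            for (i = 1; i <= n; i++) printf "group %s: %d\n", order[i], count[order[i]]
            printf "total: %d\n", total
        }
    ' "$1"
}
```

Usage: star_counts all_images.star. If all 10,000 images were found in the win directory, the total should be 10000 and the per-group counts should match the third column of defocus.dat.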