How to compile and install RELION-3.1 on CentOS 8.1
2022-05-10 update: this post is now mostly outdated. The general strategy of installing in /opt and using module files is still relevant, but the specific details and versions mentioned below are outdated. In addition, CentOS transitioned to an update schedule that is out of sync with the Nvidia driver. I now use Rocky Linux for cryoEM computing, and I no longer recommend installing the Nvidia driver with the “runfile” installer: it is much easier to use the RPM package provided by Nvidia (the CUDA toolkit, however, is best installed in /opt using the “runfile”). You can find up-to-date installation instructions for RELION here.
The lab I work at recently acquired a GPU workstation on which I had to install RELION, a program for processing cryoEM data. Since this is not a straightforward procedure, I took some notes in case I need to do this again in the future. I decided to also post these notes here, in case they can help anyone else.
These notes apply to RELION-3.1 and CentOS 8.1. Usual disclaimers apply: back up your data before modifying your system, don’t run commands you don’t fully understand (especially if they need to be run with sudo), follow these directions at your own risk, and I am in no way responsible if you mess up your system in the course of following these directions. Also, I make no commitment to keep these notes up-to-date with future versions of RELION or CentOS, and I don’t have time to offer individual help. So please don’t email me questions: email the CCPEM list instead for questions specific to RELION, or seek help from your distribution’s specific channels for general Linux questions.
Conventions
All commands listed below should be run as a normal user, not as root. Commands that require administrator permissions (like installing packages with the system package manager) are prefixed with sudo, which assumes the user account you run these commands from is in the administrator group (wheel).
Programs not installed with the system package manager will be installed in their own directory /opt/<program>-X.Y.Z, in which <program> is the program’s name in lowercase and X.Y.Z is its version number. This will only work if the user performing the installation has write permission to /opt, which is safe to grant because /opt is a location reserved for programs that are not part of the system distribution (there is nothing to break there, because the system doesn’t put any of its files there). This has several advantages over making a package for the distribution or installing in /usr/local:
- the biggest advantage is that we can have several versions of the same program installed at the same time: we can only have one relion binary under /usr/local, but we can have several /opt/relion-X.Y.Z with different version numbers happily living next to each other (sometimes, we need to revisit old results obtained with an earlier version of the program not necessarily compatible with the current version, so this is a true practical advantage);
- installing a program is easy: in most cases, ./configure --prefix=/opt/<program>-X.Y.Z ; make ; make install will work just fine (as indicated above, without any risk of messing up the base system if run as a regular user), which is easier than making an RPM or DEB or what-have-you package compliant with all of your distribution’s packaging rules;
- it is easy to check what takes up storage space, with du -sh /opt/*;
- uninstalling a program is as easy as rm -r /opt/<program>-X.Y.Z, since all of a program’s files are under a single directory, instead of scattered across subdirectories under /usr/local.
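As a small illustration of this convention, here is a sketch that builds the install prefix for a program and checks whether the current user can write to /opt. The function names install_prefix and can_install_under are made up for this example, and the check only looks at whether the directory exists and is writable:

```shell
# Build the install prefix used in this post: /opt/<program>-X.Y.Z,
# with the program name converted to lowercase.
install_prefix() {
    printf '/opt/%s-%s\n' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" "$2"
}

# Succeed if the given directory (default: /opt) exists and is writable
# by the current user.
can_install_under() {
    [ -d "${1:-/opt}" ] && [ -w "${1:-/opt}" ]
}

echo "RELION 3.1 would go into: $(install_prefix RELION 3.1)"
if can_install_under /opt; then
    echo "/opt is writable: 'make install' can run as a regular user"
else
    echo "/opt is not writable: adjust its permissions before installing"
fi
```

How write permission to /opt is granted (ownership, group, ACL) is a local policy decision that this sketch does not cover.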
Now, /usr/local/bin is in users’ PATH by default, while this is not the case for arbitrary directories under /opt. How do we make our custom-built programs accessible from the shell with minimal configuration for our users? Obviously, having every user edit their ~/.bashrc file is not a viable option: they would have to do that every time they want to change which version of a program they use, and this is error-prone. The solution is to use the Environment Modules system, installed by default on CentOS 8.1. This allows us to write modulefiles that will correctly set up environment variables for each specific program. We will store these modulefiles under /opt/modulefiles, and append this path to $MODULESHOME/init/.modulespath so the module commands can use our custom modulefiles. The file $MODULESHOME/init/.modulespath initially looks something like this:
# This file defines the initial setup for the modulefiles search path
# Each line containing one or multiple paths delimited by ':' will be
# added to the MODULEPATH environment variable.
/usr/share/Modules/modulefiles:/etc/modulefiles:/usr/share/modulefiles
Edit it with sudo vi $MODULESHOME/init/.modulespath to add /opt/modulefiles to the $MODULEPATH variable. The file should now look like this:
# This file defines the initial setup for the modulefiles search path
# Each line containing one or multiple paths delimited by ':' will be
# added to the MODULEPATH environment variable.
/usr/share/Modules/modulefiles:/etc/modulefiles:/usr/share/modulefiles:/opt/modulefiles
Our users can now list all available modules on the system (module avail) and easily set their shell environment to use what they need (module load <program/X.Y.Z>).
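If you would rather script the edit of .modulespath than do it by hand, the following sketch appends a directory to the path line only when it is not already listed. The function name add_modulepath is hypothetical, and the demonstration runs on a temporary copy with the same content as shown above, not on the real file:

```shell
# Append a directory to the colon-separated path line of a
# .modulespath-style file, unless it is already listed.
# Comment lines (starting with '#') are left untouched.
add_modulepath() {
    file=$1
    dir=$2
    # Already present as a whole path component? Then do nothing.
    if grep -v '^#' "$file" | tr ':' '\n' | grep -Fxq "$dir"; then
        return 0
    fi
    # Append ":<dir>" to the last non-comment, non-empty line
    # (or add a new line if there is no such line).
    awk -v dir="$dir" '
        { lines[NR] = $0 }
        !/^#/ && NF { last = NR }
        END {
            for (i = 1; i <= NR; i++) {
                if (i == last) print lines[i] ":" dir
                else print lines[i]
            }
            if (!last) print dir
        }
    ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Demonstration on a temporary file.
demo=$(mktemp)
printf '%s\n' \
    '# This file defines the initial setup for the modulefiles search path' \
    '/usr/share/Modules/modulefiles:/etc/modulefiles:/usr/share/modulefiles' \
    > "$demo"
add_modulepath "$demo" /opt/modulefiles
add_modulepath "$demo" /opt/modulefiles   # second call changes nothing
cat "$demo"
```

Running this against the real $MODULESHOME/init/.modulespath would require administrator permissions, and replacing the file through a temporary copy may change its ownership, so check the result afterwards.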
Requirements
To compile RELION, we need to install development tools:
sudo dnf group install "Development Tools"
sudo dnf install cmake
If you wonder what comes with the “Development Tools” group, you can find out with the following command:
dnf group info "Development Tools"
And we need the following libraries:
sudo dnf install fftw-devel fltk-devel libX11-devel libtiff-devel libpng-devel freetype-devel
The following may not be necessary for RELION, but some libraries are in the PowerTools repository, so we might need to activate it:
sudo dnf config-manager --set-enabled PowerTools
And other required packages are in EPEL:
sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
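Before moving on, it can be useful to confirm that the build tools are actually on the PATH. This is only a sketch: the function name missing_commands is made up, and the list of commands checked (gcc, g++, make, cmake, git) is my assumption of what the steps below need, not an exhaustive list of what the packages above provide:

```shell
# Print the names of any commands from the argument list that are
# not found on PATH, one per line.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

missing=$(missing_commands gcc g++ make cmake git)
if [ -n "$missing" ]; then
    echo "not found on PATH:" $missing
else
    echo "all build tools found"
fi
```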
Choose a version of CUDA
Compiling and running RELION requires the CUDA Toolkit and libraries. The easiest way to compile RELION-3.1 on CentOS 8.1 is to use CUDA 10.2, which supports the version of GCC (8.3.1) that comes with CentOS 8.1. However, RELION can use external programs for motion correction and CTF estimation, which are not open source and depend on different versions of CUDA. The trade-offs, at the time I am writing this, go as follows:
- Compile RELION with CUDA 10.2. This is the easiest way to go and has the fewest prerequisites. For motion correction, you will be able to use either MotionCor2 (not open source, but version 1.3.1 ships a binary compiled with CUDA 10.2) or RELION’s own motion correction program (not GPU-accelerated). For CTF estimation, you will be limited to CTFFIND4 (not GPU-accelerated), since Gctf version 1.18 does not ship a binary compiled with CUDA 10.2 (and likely won’t, because it no longer seems to be maintained; it is also not open source, which prevents compiling it with one’s preferred version of CUDA).
- Compile RELION with CUDA 9.2. This is a bit more difficult because CUDA 9.2 is not compatible with GCC 8.3.1, which is the default compiler on CentOS 8.1: for CUDA 9.2 to work, you will therefore have to install a version of GCC earlier than version 7. The advantage of using CUDA 9.2 is that you can then have all programs working in the same environment: MotionCor2 (version 1.3.1 ships a binary built with CUDA 9.2), Gctf (version 1.18 also ships a binary built with CUDA 9.2), RELION’s own motion correction program (independent of CUDA because not GPU-accelerated) and CTFFIND4 (also independent of CUDA because not GPU-accelerated). Even though this is a bit more work for the system administrator, this is an easier setup for the users since they will be able to use any combination of these programs in a single shell.
- Compile two copies of RELION, one with CUDA 9.2 and one with CUDA 10.2. This combines the work of both options above, and is made practical by the environment modules system. This way, one can use the RELION compiled with CUDA 10.2 for everything, and only switch to the one compiled with CUDA 9.2 to run Gctf. Choosing which version of RELION and CUDA to use at run time requires changing the environment, but the modules system makes this easy.
I wanted to go with option 1, but the CTFFIND4 binaries downloaded from its website don’t run on CentOS 8.1 (they crash with a segmentation fault, and inspecting them with file ctffind reports they were compiled for a Linux 2.6 kernel, which is several versions older than the kernel version in CentOS 8.1). Since this program is open source, I tried to compile it. It compiled fine and runs long enough to interactively pass all parameter prompts, but then crashes with a segmentation fault. This may be due to the fact that I compiled it with GCC, as a somewhat old email on the CCPEM list suggests that it should be compiled with ICC. So I chose option 3.
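The compiler constraints behind these options can be summarized in a few lines of shell. This is a sketch that encodes only what is stated in this post (CUDA 10.2 works with GCC 8.3.1, CUDA 9.2 needs a GCC earlier than version 7); it is not an official compatibility table, and the function names are made up:

```shell
# Highest GCC major version usable with a given CUDA version,
# according to the statements in this post only.
max_gcc_major_for_cuda() {
    case $1 in
        10.2) echo 8 ;;
        9.2)  echo 6 ;;
        *)    return 1 ;;
    esac
}

# Succeed if the GCC version ($2, e.g. "8.3.1") is usable with the
# CUDA version ($1); return 2 for a CUDA version not covered here.
gcc_ok_for_cuda() {
    cuda=$1
    gcc_version=$2
    max=$(max_gcc_major_for_cuda "$cuda") || return 2
    [ "${gcc_version%%.*}" -le "$max" ]
}

# GCC 8.3.1 is the default compiler on CentOS 8.1.
for cuda in 10.2 9.2; do
    if gcc_ok_for_cuda "$cuda" 8.3.1; then
        echo "CUDA $cuda: system GCC 8.3.1 is fine"
    else
        echo "CUDA $cuda: an older GCC is needed"
    fi
done
```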
Install CUDA 10.2
The following commands will download the installer from Nvidia’s website and install only the CUDA Toolkit (not the driver):
cd ~/Downloads
wget http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_440.33.01_linux.run
chmod +x cuda_10.2.89_440.33.01_linux.run
./cuda_10.2.89_440.33.01_linux.run --silent --toolkit --toolkitpath=/opt/cuda-10.2
Then place this module file in /opt/modulefiles/cuda/10.2:
#%Module1.0
proc ModulesHelp { } {
global dotversion
puts stderr "\tCUDA Libraries and Toolkit, version 10.2"
}
module-whatis "CUDA Libraries and Toolkit. Documentation: https://docs.nvidia.com/cuda/archive/10.2/"
conflict cuda
set program cuda
set version 10.2
set prefix /opt/$program-$version
prepend-path PATH $prefix/bin
prepend-path LD_LIBRARY_PATH $prefix/lib64
prepend-path CPATH $prefix/include
prepend-path C_INCLUDE_PATH $prefix/include
prepend-path CPLUS_INCLUDE_PATH $prefix/include
prepend-path INCLUDE $prefix/include
setenv CUDA_HOME $prefix
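To check that a module file like this one does what is expected, one option is to test whether its directories show up in the environment after module load cuda/10.2. The helper below is a sketch (path_contains is a made-up name) that looks for a directory as a whole component of a colon-separated list such as PATH or LD_LIBRARY_PATH:

```shell
# Succeed if the colon-separated list in $1 contains the directory $2
# as a whole component (a longer path sharing the same prefix does not count).
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

if path_contains "$PATH" /opt/cuda-10.2/bin; then
    echo "cuda/10.2 is on PATH"
else
    echo "cuda/10.2 is not on PATH (is the module loaded?)"
fi
```

The same function can be pointed at "$LD_LIBRARY_PATH" and /opt/cuda-10.2/lib64 to check the library path.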
Install CUDA 9.2
For this, we first need to install a version of GCC earlier than version 7. I chose to go for the latest version in the 6.x series, which is 6.5.0. Compiling GCC requires the following system packages:
sudo dnf install gmp-devel mpfr-devel libmpc-devel
And here is how to download GCC 6.5.0’s source, compile and install it (you can of course choose a different mirror closer to you):
cd ~/Downloads
wget ftp://ftp.uvsq.fr/pub/gcc/releases/gcc-6.5.0/gcc-6.5.0.tar.gz
tar -xf gcc-6.5.0.tar.gz
cd gcc-6.5.0
mkdir build
cd build
../configure --prefix=/opt/gcc-6.5.0 --disable-multilib
make
make install-strip
Finally, save this module file as /opt/modulefiles/gcc/6.5.0:
#%Module1.0
proc ModulesHelp { } {
global dotversion
puts stderr "\tGNU Compiler Collection, version 6.5.0"
}
module-whatis "GNU Compiler Collection. Documentation: https://www.gnu.org/software/gcc/"
set program gcc
set version 6.5.0
set prefix /opt/$program-$version
prepend-path PATH $prefix/bin
prepend-path LD_LIBRARY_PATH $prefix/lib
prepend-path LD_LIBRARY_PATH $prefix/lib64
prepend-path CPATH $prefix/include
prepend-path C_INCLUDE_PATH $prefix/include
prepend-path CPLUS_INCLUDE_PATH $prefix/include
# Take higher priority than system CC
setenv CC $prefix/bin/gcc
setenv CXX $prefix/bin/g++
We can then install CUDA 9.2:
cd ~/Downloads
wget https://developer.nvidia.com/compute/cuda/9.2/Prod2/local_installers/cuda_9.2.148_396.37_linux
chmod +x cuda_9.2.148_396.37_linux
./cuda_9.2.148_396.37_linux --silent --toolkit --toolkitpath=/opt/cuda-9.2
Let’s also install the patch:
module purge
module load gcc/6.5.0
cd ~/Downloads
wget https://developer.nvidia.com/compute/cuda/9.2/Prod2/patches/1/cuda_9.2.148.1_linux
chmod +x cuda_9.2.148.1_linux
./cuda_9.2.148.1_linux --silent --accept-eula --installdir=/opt/cuda-9.2
And finally, save this module file as /opt/modulefiles/cuda/9.2:
#%Module1.0
proc ModulesHelp { } {
global dotversion
puts stderr "\tCUDA Libraries and Toolkit, version 9.2"
}
module-whatis "CUDA Libraries and Toolkit. Documentation: https://docs.nvidia.com/cuda/archive/9.2/"
conflict cuda
set program cuda
set version 9.2
set prefix /opt/$program-$version
prepend-path PATH $prefix/bin
prepend-path LD_LIBRARY_PATH $prefix/lib64
prepend-path CPATH $prefix/include
prepend-path C_INCLUDE_PATH $prefix/include
prepend-path CPLUS_INCLUDE_PATH $prefix/include
prepend-path INCLUDE $prefix/include
setenv CUDA_HOME $prefix
Install OpenMPI 3.1.6
RELION also requires OpenMPI. I first tried to use the RPM package:
sudo dnf install openmpi-devel
I managed to compile RELION with this OpenMPI, which happens to be version 4.0.1, but then I got segmentation faults at run time, making this build of RELION essentially useless (parallelization with OpenMPI is used for pretty much everything in RELION). I investigated this error and found that cmake had picked up an OpenMPI version 3.1:
-- Found MPI_C: /usr/lib64/openmpi/lib/libmpi.so (found version "3.1")
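I still don’t understand how this is possible, since the RPM package did not install anything from version 3.1 (possibly, the version printed by cmake is that of the MPI standard implemented by the library, rather than the OpenMPI release number):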
$ ls -l /usr/lib64/openmpi/lib/ | grep libmpi.so
lrwxrwxrwx. 1 root root 17 Nov 21 2019 libmpi.so -> libmpi.so.40.20.1
lrwxrwxrwx. 1 root root 17 Nov 21 2019 libmpi.so.40 -> libmpi.so.40.20.1
-rwxr-xr-x. 1 root root 2422280 Nov 21 2019 libmpi.so.40.20.1
But then the mpirun used at run time was definitely 4.0.1, and that caused problems:
$ mpirun --version
mpirun (Open MPI) 4.0.1
I asked about this problem on the CCPEM list, and from the answer I got I understand that using different versions of OpenMPI during compilation and at run time definitely causes this kind of problem. But what is still not clear to me is whether RELION would have worked fine with OpenMPI 4.0.1, had it been correctly compiled and run with this version, or whether it is only compatible with OpenMPI 3.x versions. So, I installed OpenMPI 3.1.6 (newest version in the 3.x series) from source:
cd ~/Downloads
wget https://download.open-mpi.org/release/open-mpi/v3.1/openmpi-3.1.6.tar.bz2
tar -xf openmpi-3.1.6.tar.bz2
cd openmpi-3.1.6
mkdir build
cd build
../configure --prefix=/opt/openmpi-3.1.6
make
make install-strip
I then adapted the module file provided by the RPM package to make one for this version of OpenMPI. I stored this module file in /opt/modulefiles/openmpi/3.1.6:
#%Module 1.0
#
# OpenMPI module for use with 'environment-modules' package:
#
conflict mpi
prepend-path PATH /opt/openmpi-3.1.6/bin
prepend-path LD_LIBRARY_PATH /opt/openmpi-3.1.6/lib
prepend-path PKG_CONFIG_PATH /opt/openmpi-3.1.6/lib/pkgconfig
prepend-path MANPATH /opt/openmpi-3.1.6/share/man
setenv MPI_BIN /opt/openmpi-3.1.6/bin
setenv MPI_SYSCONFIG /opt/openmpi-3.1.6/etc
setenv MPI_FORTRAN_MOD_DIR /usr/lib64/gfortran/modules/openmpi
setenv MPI_INCLUDE /opt/openmpi-3.1.6/include
setenv MPI_LIB /opt/openmpi-3.1.6/lib
setenv MPI_MAN /opt/openmpi-3.1.6/share/man
setenv MPI_PYTHON_SITEARCH /usr/lib64/python3.6/site-packages/openmpi
setenv MPI_PYTHON2_SITEARCH /usr/lib64/python3.6/site-packages/openmpi
setenv MPI_PYTHON3_SITEARCH /usr/lib64/python3.6/site-packages/openmpi
setenv MPI_COMPILER openmpi-x86_64
setenv MPI_SUFFIX _openmpi
setenv MPI_HOME /opt/openmpi-3.1.6
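Given the trouble described above, it seems worth checking which OpenMPI release the mpirun on the PATH belongs to before compiling anything against it. The sketch below parses the first line of the mpirun --version output; the function names are made up, and the demonstration uses the sample output quoted earlier in this post instead of calling mpirun:

```shell
# Read `mpirun --version` output on stdin and print the release number,
# i.e. the last field of the first line.
openmpi_release() {
    awk 'NR == 1 { print $NF }'
}

# Succeed if the version text ($2) reports the expected release ($1).
expect_openmpi() {
    [ "$(printf '%s\n' "$2" | openmpi_release)" = "$1" ]
}

# Sample output copied from this post (the OpenMPI installed from the RPM).
sample='mpirun (Open MPI) 4.0.1'
if expect_openmpi 3.1.6 "$sample"; then
    echo "runtime matches 3.1.6"
else
    echo "runtime is $(printf '%s\n' "$sample" | openmpi_release), expected 3.1.6"
fi

# On a real system, after `module load openmpi/3.1.6`, the live check would be:
#   expect_openmpi 3.1.6 "$(mpirun --version)"
```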
Install RELION
After many detours, we finally have all we need to compile and install RELION. We get the source code by cloning the git repository the first time:
mkdir ~/software
cd ~/software
git clone https://github.com/3dem/relion
Next time you need to update it, pull the new changes:
cd ~/software/relion
git pull
Compile RELION with CUDA 10.2
The following commands will configure and compile RELION-3.1 (at the time I wrote these notes, it was commit 5997001f75) with CUDA 10.2. Change the value of -DCUDA_ARCH= to adapt to your GPU: this is the “compute capability” listed here, without the dot (choose the highest one supported by both your GPU and the version of CUDA you’re using; if you don’t specify this option, cmake seems to default to a very low compute capability, which is compatible with more combinations of GPU and CUDA version but means you won’t take full advantage of your specific GPU):
cd ~/software/relion
git checkout ver3.1
mkdir build_cuda-10.2
cd build_cuda-10.2
module purge
module load openmpi/3.1.6 cuda/10.2
cmake -DCMAKE_INSTALL_PREFIX=/opt/relion-3.1_cuda-10.2 -DCUDA_ARCH=75 ..
make
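The rule for choosing the -DCUDA_ARCH= value can be written down as a short sketch. The function names are made up; the second argument of pick_cuda_arch is the highest compute capability supported by the CUDA version in use, which you need to look up yourself (7.5 and 7.2 below correspond to the values used with CUDA 10.2 and CUDA 9.2 in this post):

```shell
# Turn a compute capability such as "7.5" into the value expected by
# -DCUDA_ARCH, i.e. the same digits without the dot ("75").
cuda_arch() {
    printf '%s\n' "$1" | tr -d '.'
}

# Pick the lower of the GPU's compute capability ($1) and the highest
# one supported by the CUDA version ($2), in -DCUDA_ARCH form.
pick_cuda_arch() {
    gpu=$(cuda_arch "$1")
    max=$(cuda_arch "$2")
    if [ "$gpu" -le "$max" ]; then
        echo "$gpu"
    else
        echo "$max"
    fi
}

echo "GPU 7.5 with a CUDA supporting up to 7.5: -DCUDA_ARCH=$(pick_cuda_arch 7.5 7.5)"
echo "GPU 7.5 with a CUDA supporting up to 7.2: -DCUDA_ARCH=$(pick_cuda_arch 7.5 7.2)"
```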
To install it, make sure the destination directory exists:
mkdir /opt/relion-3.1_cuda-10.2
And then run:
make install
Finally, save this module file as /opt/modulefiles/relion/3.1_cuda-10.2. There are more environment variables you can set based on your specific system; you can read about them in RELION’s documentation.
#%Module1.0
proc ModulesHelp { } {
global dotversion
puts stderr "\tRELION, version 3.1 (CUDA 10.2)"
}
module-whatis "2D classification, 3D classification and 3D refinement. Documentation: https://www3.mrc-lmb.cam.ac.uk/relion/index.php/Main_Page"
module load openmpi/3.1.6 cuda/10.2 motioncor2/1.3.1 ctffind/4.1.14
conflict relion
prereq openmpi/3.1.6
prereq cuda/10.2
prereq motioncor2
prereq ctffind
set program relion
set version 3.1_cuda-10.2
set prefix /opt/$program-$version
# Where to find other programs
setenv RELION_MOTIONCOR2_EXECUTABLE MotionCor2_v1.3.1-Cuda102
setenv RELION_CTFFIND_EXECUTABLE ctffind
setenv RELION_RESMAP_EXECUTABLE ResMap
setenv RELION_PDFVIEWER_EXECUTABLE evince
setenv RELION_QSUB_TEMPLATE $prefix/bin/qsub.csh
# MPI and threads settings
# Ask for confirmation if users try to submit local jobs with more than 9 MPI processes. Rationale: 9 MPIs means 1 coordinator + 4GPUs x 2 workers.
setenv RELION_WARNING_LOCAL_MPI 9
# It doesn't help to overbook the GPUs too much. 13 MPIs means 1 coordinator + 4GPUs x 3 workers.
# But some programs like CTFFIND and RELION's MotionCor run on CPUs, so the hard limit on MPI processes should be half the CPU cores.
setenv RELION_MPI_MAX 40
setenv RELION_ERROR_LOCAL_MPI 41
# Shell to launch other programs from
setenv RELION_SHELL bash
# Scratch location
setenv RELION_SCRATCH_DIR /scratch/
prepend-path PATH $prefix/bin
prepend-path LD_LIBRARY_PATH $prefix/lib
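The numbers in the MPI settings of this module file follow from simple arithmetic, sketched below. The function names are made up, and the figure of 80 CPU cores is my inference from the comment about the hard limit being half the CPU cores, combined with the value 40 set above:

```shell
# Number of MPI processes for a local GPU job: one coordinator plus
# a fixed number of workers per GPU. Arguments: GPUs, workers per GPU.
mpi_processes() {
    echo $(( 1 + $1 * $2 ))
}

# Hard limit described in the module file comments: half the CPU cores.
mpi_hard_limit() {
    echo $(( $1 / 2 ))
}

echo "4 GPUs x 2 workers: $(mpi_processes 4 2) MPI processes"
echo "4 GPUs x 3 workers: $(mpi_processes 4 3) MPI processes"
echo "hard limit with 80 cores (assumed): $(mpi_hard_limit 80)"
```

Adapt the number of GPUs and cores to your own workstation before copying these values.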
Compile RELION with CUDA 9.2
The following commands will configure and compile RELION-3.1 (at the time I wrote these notes, it was commit 5997001f75) with CUDA 9.2. Change the value of -DCUDA_ARCH= to adapt to your GPU: this is the “compute capability” listed here, without the dot (choose the highest one supported by both your GPU and the version of CUDA you’re using; if you don’t specify this option, cmake seems to default to a very low compute capability, which is compatible with more combinations of GPU and CUDA version but means you won’t take full advantage of your specific GPU):
cd ~/software/relion
git checkout ver3.1
mkdir build_cuda-9.2
cd build_cuda-9.2
module purge
module load openmpi/3.1.6 cuda/9.2 gcc/6.5.0
cmake -DCMAKE_INSTALL_PREFIX=/opt/relion-3.1_cuda-9.2 -DCUDA_ARCH=72 ..
make
To install it, make sure the destination directory exists:
mkdir /opt/relion-3.1_cuda-9.2
And then run:
make install
Finally, save this module file as /opt/modulefiles/relion/3.1_cuda-9.2. There are more environment variables you can set based on your specific system; you can read about them in RELION’s documentation.
#%Module1.0
proc ModulesHelp { } {
global dotversion
puts stderr "\tRELION, version 3.1 (CUDA 9.2)"
}
module-whatis "2D classification, 3D classification and 3D refinement. Documentation: https://www3.mrc-lmb.cam.ac.uk/relion/index.php/Main_Page"
module load openmpi/3.1.6 cuda/9.2 motioncor2/1.3.1 gctf/1.18b2
conflict relion
prereq openmpi/3.1.6
prereq cuda/9.2
prereq motioncor2
prereq gctf
#prereq ctffind
set program relion
set version 3.1_cuda-9.2
set prefix /opt/$program-$version
# Where to find other programs
setenv RELION_MOTIONCOR2_EXECUTABLE MotionCor2_v1.3.1-Cuda92
setenv RELION_GCTF_EXECUTABLE Gctf_v1.18_b2_sm70_cu9.2
#setenv RELION_CTFFIND_EXECUTABLE ctffind
setenv RELION_RESMAP_EXECUTABLE ResMap
setenv RELION_PDFVIEWER_EXECUTABLE evince
setenv RELION_QSUB_TEMPLATE $prefix/bin/qsub.csh
# MPI and threads settings
# Ask for confirmation if users try to submit local jobs with more than 9 MPI processes. Rationale: 9 MPIs means 1 coordinator + 4GPUs x 2 workers.
setenv RELION_WARNING_LOCAL_MPI 9
# It doesn't help to overbook the GPUs too much. 13 MPIs means 1 coordinator + 4GPUs x 3 workers.
# But some programs like CTFFIND, Gctf, MotionCor2 and RELION's MotionCor run on CPUs, so the hard limit on MPI processes should be half the CPU cores.
setenv RELION_MPI_MAX 40
setenv RELION_ERROR_LOCAL_MPI 41
# Shell to launch other programs from
setenv RELION_SHELL bash
# Scratch location
setenv RELION_SCRATCH_DIR /scratch/
prepend-path PATH $prefix/bin
prepend-path LD_LIBRARY_PATH $prefix/lib
Running RELION
Now, the command module avail should list all these new module files (I did not show the module files for MotionCor2, Gctf and CTFFIND; those simply contain the same header up to module-whatis, followed by a single prepend-path directive to indicate where to find the binary):
$ module avail
---/opt/modulefiles ---
cuda/9.2
cuda/10.2
ctffind/4.1.14
gcc/6.5.0
gctf/1.18b2
motioncor2/1.3.1
openmpi/3.1.6
relion/3.1_cuda-9.2
relion/3.1_cuda-10.2
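For the module files not shown here, the sketch below generates one with the structure described above (header up to module-whatis, then a single prepend-path). Everything specific in it is an assumption for illustration: the function name write_simple_modulefile, the description text, and the directory /opt/motioncor2-1.3.1 where the binary is supposed to live. The demonstration writes into a temporary directory instead of /opt/modulefiles:

```shell
# Write a minimal modulefile for a program that only needs the directory
# containing its binary on PATH.
# Arguments: program, version, description, binary directory,
# root of the modulefiles tree.
write_simple_modulefile() {
    program=$1
    version=$2
    description=$3
    bindir=$4
    root=$5
    mkdir -p "$root/$program"
    cat > "$root/$program/$version" <<EOF
#%Module1.0
proc ModulesHelp { } {
global dotversion
puts stderr "\t$description, version $version"
}
module-whatis "$description"
prepend-path PATH $bindir
EOF
}

# Demonstration with hypothetical values, written to a temporary tree.
demo_root=$(mktemp -d)
write_simple_modulefile motioncor2 1.3.1 "MotionCor2" /opt/motioncor2-1.3.1 "$demo_root"
cat "$demo_root/motioncor2/1.3.1"
```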
You can now get one version of RELION or the other on your path like so:
$ module list
No Modulefiles Currently Loaded.
$ which relion
/usr/bin/which: no relion in (/home/guillaume/.local/bin:/home/guillaume/bin:/opt/miniconda3/condabin:/home/guillaume/.local/bin:/home/guillaume/bin:/usr/share/Modules/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin)
$ module load relion/3.1_cuda-9.2
$ module list
Currently Loaded Modulefiles:
1) openmpi/3.1.6 2) cuda/9.2 3) motioncor2/1.3.1 4) gctf/1.18b2 5) relion/3.1_cuda-9.2
$ which relion
/opt/relion-3.1_cuda-9.2/bin/relion
$ module purge
$ module list
No Modulefiles Currently Loaded.
$ module load relion/3.1_cuda-10.2
$ module list
Currently Loaded Modulefiles:
1) openmpi/3.1.6 2) cuda/10.2 3) motioncor2/1.3.1 4) ctffind/4.1.14 5) relion/3.1_cuda-10.2
$ which relion
/opt/relion-3.1_cuda-10.2/bin/relion