Modules
We use the Lmod module system to provide compilers, MPI stacks, libraries, and tools.
The legacy module tree (elysium/2024) is currently the default view.
To switch to the current tree, load:
module load elysium/2026
The elysium/202x modules modify the MODULEPATH.
They are sticky, so they are not removed by module purge.
elysium/2024 and elysium/2026 are mutually exclusive: loading one unloads the other.
The legacy tree (elysium/2024) is frozen and no longer updated by admins.
The 2026 tree is built with gcc@13.4.0, which provides better optimization support for Elysium’s Zen 4 CPUs than gcc@11.
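A short session illustrates the sticky and mutually exclusive behavior described above (output omitted; module names as on Elysium):

```shell
module load elysium/2026   # loading the current tree unloads elysium/2024
module purge               # removes all loaded modules except sticky ones
module list                # elysium/2026 is still loaded
module --force purge       # a forced purge removes sticky modules as well
```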
Known Issues
- Intel MPI in elysium/2024 (legacy tree) is known to be unstable: it can cause random MPI deadlocks in which the application stops progressing while the Slurm job keeps running.
- In the current tree (elysium/2026), Intel MPI is stable and recommended.
- MPICH versions before 5.0.0 are not suitable for multi-node runs; use 5.0.0 or newer for multi-node jobs.
- If you must use MPICH < 5.0.0, restrict jobs to single-node execution.
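Whether an installed MPICH meets the 5.0.0 multi-node minimum can be checked with a small snippet; the version string would normally come from `mpichversion`, which is assumed here:

```shell
#!/bin/sh
# Compare an MPICH version string against the 5.0.0 multi-node minimum.
# Normally obtained via: mpichversion | awk '/^MPICH Version/{print $3}'
mpich_version="4.2.1"
minimum="5.0.0"

# sort -V orders version strings numerically; the smaller one comes first
lowest=$(printf '%s\n' "$minimum" "$mpich_version" | sort -V | head -n1)

if [ "$lowest" = "$minimum" ]; then
    echo "MPICH $mpich_version: multi-node runs OK"
else
    echo "MPICH $mpich_version: restrict jobs to a single node"
fi
```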
Common module commands
ml is a short alias for module.
ml av: list available modules in the current module path.
ml load <module>: load a module. Example: ml load openmpi/5.0.9.
ml list: show currently loaded modules.
ml unload <module>: unload a module. Example: ml unload openmpi/5.0.9.
ml purge: unload all currently loaded modules except sticky ones.
ml help <module>: show module help text. Example: ml help openmpi/5.0.9.
ml show <module>: print the contents of the modulefile.
ml whatis <module>: show a short module description.
ml spider <name>: search all known versions. Use ml spider -A <name> to also show hidden entries.
ml use <path>: add a directory to your module search path.
ml unuse <path>: remove a directory from your module search path.
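Combined, a typical interactive session might look like this (versions are illustrative):

```shell
ml spider openmpi                # search all known OpenMPI versions
ml av                            # what is visible in the current module path?
ml load gcc/13.4.0 openmpi/5.0.9
ml list                          # confirm what is loaded
ml purge                         # unload everything except sticky modules
```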
Using your own modulefiles
You can keep your own modulefiles in a personal directory and add it with ml use.
mkdir -p "$HOME/modules/mytool"
cat > "$HOME/modules/mytool/1.0.lua" <<'LUA'
help([[MyTool 1.0]])
whatis("Name: MyTool")
whatis("Version: 1.0")
depends_on("openmpi/5.0.9")
prepend_path("PATH", "/path/to/mytool/bin")
LUA
ml use "$HOME/modules"
ml load mytool/1.0
In this example, loading mytool/1.0 also loads openmpi/5.0.9 as a dependency.
To make your personal modules always visible, add this line to your ~/.bashrc:
module use "$HOME/modules"
If you need additional software or module versions, please contact support.
Spack
We provide a central Spack 1.1.1 installation in parallel to the existing 0.23.0 stack.
This separation is necessary because a database migration would render the 0.23.0 installation unusable.
The new 2026 module tree is not compatible with the legacy tree, and self-built packages from the old setup must be rebuilt.
By default, you will still see the legacy module tree.
The new stack is opt-in via:
module load spack/2026
If you already used the legacy setup, use this dedicated migration guide:
Migration Guide for Existing Users.
Legacy documentation remains available here:
Spack (Legacy 0.23.0).
Table of Contents
- Quick Setup (New Users)
- Architecture Overview
- Guide to Using Spack
- What Is New in Modern Spack
- Legacy Setup
Quick Setup
For a fresh setup:
rub-deploy-spack-configs-2026
module load spack/2026
After that, you can install your own packages into your home Spack tree and generate personal modulefiles.
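As a sketch, installing a personal package and regenerating your modulefiles could look like this (the package choice is illustrative; `spack module lmod refresh` rebuilds the personal modulefiles):

```shell
module load spack/2026
spack install htop           # goes into your $HOME Spack tree if not available centrally
spack module lmod refresh    # regenerate personal Lmod modulefiles
ml av htop                   # visible once your personal module path is in MODULEPATH
```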
Add module load spack/2026 to your ~/.bashrc if you want this active in every login shell.
Architecture Overview
We use a central Spack installation combined with per-user overlays:
Central Spack (read-only, maintained by HPC)
├── pre-built packages
├── MPI configurations
└── cluster-wide defaults
        ↓ (upstream)
User Spack ($HOME)
├── personal installs
└── personal modules
        ↓
Lmod module system
This means:
- You automatically use centrally installed packages when available.
- Cluster-wide configurations (e.g. MPI defaults, compiler settings) are inherited.
- Your own installations go into your personal $HOME Spack tree.
- Only missing packages are built locally.
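In practice the overlay is transparent (package name illustrative):

```shell
spack install hdf5     # reuses a matching central build instead of compiling
spack find -l hdf5     # lists the installation, whether central or personal
```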
Guide to Using Spack
This section covers the most common Spack workflows on Elysium.
Searching and Inspecting Packages
Search for packages:
spack list <keyword>
spack list openfoam
Show details (versions, variants, dependencies):
spack info <package>
Preview what will actually be installed:
spack spec <package>
Always check spack spec before installing complex packages.
Installing Packages
Basic installation:
spack install <package>
Enable or disable variants:
spack install hdf5 +mpi +cxx ~fortran
Specify a compiler:
spack install hdf5 %gcc@13.4.0
Force rebuilding dependencies with the same compiler:
spack install --fresh hdf5 %gcc@13.4.0
Specify dependencies explicitly:
spack install hdf5 ^openmpi@5.0.9
Everything combined:
spack install hdf5@1.14.6 +mpi %gcc@13.4.0 ^openmpi@5.0.9
Virtual Providers (BLAS, MPI, FFT, etc.)
Some packages depend on virtual interfaces instead of concrete libraries.
Examples include blas, lapack, mpi, and fftw-api.
To see which virtual packages exist:
spack providers
To list available providers for a specific interface:
spack providers blas
spack providers fftw-api
Example output:
Providers for blas:
amdblis
openblas
intel-oneapi-mkl
You can select a specific provider during installation:
spack install gromacs ^[virtuals=blas] openblas
This is often preferred over manually selecting a specific concrete library, because it keeps the dependency graph clean and compatible.
Inspecting and Comparing Installations
List installed packages with variants and hashes:
spack find -lv
Inspect a specific installation:
spack find -ldv <package>
Compare two installations:
spack diff /<hash1> /<hash2>
Removing Packages
Remove a specific installation by hash:
spack uninstall /<hash>
Overriding Package Definitions
On Elysium, the central builtin repository is provided by the HPC team. If you want to override a package definition (e.g. to test changes), you can create a local repository on top of it.
Create a local repo
mkdir -p $HOME/spack/var/spack/repos/packages
cat > $HOME/spack/var/spack/repos/repo.yaml <<'EOF'
repo:
  namespace: overrides
EOF
Register it in ~/.spack/repos.yaml (place it above builtin to override):
repos:
  overrides: $HOME/spack/var/spack/repos
  builtin:
    destination: /cluster/spack/spack-packages
Check:
spack repo list
Override a package (example: ffmpeg)
mkdir -p $HOME/spack/var/spack/repos/packages/ffmpeg
cp /cluster/spack/spack-packages/repos/spack_repo/builtin/packages/ffmpeg/package.py \
$HOME/spack/var/spack/repos/packages/ffmpeg/
Edit the copied package.py and adjust versions, dependencies, variants, etc.
Install explicitly from your namespace:
spack install overrides.ffmpeg
Verify which repository is used
For a spec (not yet installed):
spack spec -N <package>
For installed packages:
spack find -N <package>
The -N option shows the namespace (overrides or builtin).
Custom Changes to Packages using spack develop
If you want to modify the source code of a package (e.g. openfoam) and rebuild it locally, you can use spack develop. This allows you to work directly on a source checkout without creating tarballs or calculating checksums.
Create and activate a development environment
mkdir -p ~/openfoam-dev
cd ~/openfoam-dev
spack env create -d .
spacktivate . #shortcut for `spack env activate .`
Using a dedicated environment keeps your development work isolated from your normal Spack setup.
Add and install the package
spack add openfoam
spack install
This performs a normal installation and ensures all dependencies are available.
Switch to development mode
spack develop openfoam
This checks out the source code into the environment directory (e.g. ~/openfoam-dev/openfoam/) and registers it as the active development source.
You can verify this with:
spack spec openfoam
Look for dev_path=.../openfoam in the output.
Modify the source and rebuild
cd ~/openfoam-dev/openfoam
# edit source files here (e.g. with vim)
cd ~/openfoam-dev
spack concretize -f   # re-concretize so the dev_path is picked up
spack install
Spack will now build openfoam from your modified local sources.
If compilation fails, Spack will print the relevant error messages and the path to the full build log.
Legacy Setup
If you need to reference the old 0.23.0 setup, use:
Spack (Legacy 0.23.0).
Vampir
Vampir is a framework for analyzing the runtime behavior of serial and parallel programs using function instrumentation via Score-P.
Vampir is licensed by HPC.nrw and can be used freely on the Elysium Cluster.
This page only shows a small test case demonstrating how Score-P can be used to generate profiling data and how Vampir can be started on Elysium.
For information on how to use Vampir to analyze your application, extract useful performance metrics, and identify bottlenecks, please refer to the Score-P Cheat Sheet and the official Vampir Documentation.
Compilation with Instrumented Functions
In order to generate profiling data, the function calls in the application need to be instrumented.
This means inserting additional special function calls that record the time, the current call stack, and much more.
Fortunately, this is not done manually but can easily be achieved by using the Score-P compiler wrapper.
To follow along, you can use this MPI Example Code.
To use the Score-P compiler wrapper, all that is needed is to prepend the compiler with the scorep command:
module load openmpi/5.0.5-d3ii4pq
module load scorep/8.4-openmpi-5.0.5-6mtx3p6
scorep mpicc -o mpi-test.x mpi-test.c
In the case of a Makefile, or other build systems, the compiler variable has to be adjusted accordingly.
Generating Profiling Data
Profiling data is created by running the application.
Note that the profiling files can grow to enormous sizes.
It is therefore advisable to choose a small, representative test case for your application rather than a full production run.
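If traces grow too large, Score-P also provides runtime controls; SCOREP_TOTAL_MEMORY and SCOREP_FILTERING_FILE are standard Score-P variables, while the filter pattern below is illustrative:

```shell
# Cap the per-process memory Score-P uses for measurement data
export SCOREP_TOTAL_MEMORY=256M

# Exclude short, frequently called functions via a filter file
export SCOREP_FILTERING_FILE=scorep.filter
cat > scorep.filter <<'EOF'
SCOREP_REGION_NAMES_BEGIN
  EXCLUDE
    *helper*
SCOREP_REGION_NAMES_END
EOF
```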
In its default mode, Score-P collects profiling data by periodically sampling the application's call stack. To generate an accurate profile, tracing needs to be enabled in your job script:
module load openmpi/5.0.5-d3ii4pq
module load scorep/8.4-openmpi-5.0.5-6mtx3p6
export SCOREP_ENABLE_TRACING=true
mpirun -np 4 ./mpi-test.x
Here is a full job script for the example:
#!/bin/bash
#SBATCH --partition=cpu
#SBATCH --ntasks=4
#SBATCH --nodes=1
#SBATCH --account=<Account>
#SBATCH --time=00-00:05:00
module purge
module load openmpi/5.0.5-d3ii4pq
module load scorep/8.4-openmpi-5.0.5-6mtx3p6
export SCOREP_ENABLE_TRACING=true
mpirun -np 4 ./mpi-test.x
The execution of the instrumented application will take significantly longer than usual.
Thus, it should never be used for production runs, but merely for profiling.
After the application has finished, a new directory will have been created whose name contains a time stamp and other information, e.g.: scorep-20251222_0912_1523094386395226
The file traces.otf2 contains the profiling data required by Vampir.
Visualizing With Vampir
In order to visualize the profiling data, a Visualization Session has to be established.
Vampir can be started with
module load vampir
vglrun +pr -fps 20 vampir ./traces.otf2
This will open the Vampir graphical user interface:
VASP
Build configuration (MKL)
On Elysium, VASP can be built with Spack using:
spack install vasp@6.4.3 +openmp +fftlib ^openmpi@5.0.5 ^fftw@3+openmp ^intel-oneapi-mkl threads=openmp +ilp64
This configuration uses:
- Intel oneAPI MKL (ILP64) for BLAS, LAPACK and ScaLAPACK,
- VASP's internal FFTLIB to avoid MKL CDFT issues on AMD,
- OpenMPI 5.0.5 as MPI implementation,
- OpenMP enabled for hybrid parallelisation.
We choose MKL as the baseline because it is the de facto HPC standard and performs well on AMD EPYC when AVX512 code paths are enabled.
Activating AVX512
Intel's MKL only enables AVX512 optimisations on Intel CPUs.
On AMD, MKL defaults to AVX2/SSE code paths.
To unlock the faster AVX512 kernels on AMD EPYC we provide libfakeintel, which fakes Intel CPUID flags.
| MKL version | Library to preload | Notes |
| --- | --- | --- |
| ≤ 2024.x | /lib64/libfakeintel.so | |
| ≥ 2025.x | /lib64/libfakeintel2025.so | works for older versions too |
⚠ Intel gives no guarantee that all AVX512 instructions work on AMD CPUs.
In practice, the community has shown that not every kernel uses full AVX512 width, but the overall speed-up is still substantial.
Activate AVX512 by preloading the library in your job:
export LD_PRELOAD=/lib64/libfakeintel2025.so:${LD_PRELOAD}
Test case 1 – Si256 (DFT / Hybrid HSE06)
This benchmark uses a 256-atom silicon supercell (Si256) with the HSE06 hybrid functional.
Hybrid DFT combines FFT-heavy parts with dense BLAS/LAPACK operations and is therefore a good proxy for most large-scale electronic-structure workloads.
Baseline: MPI-only, 1 node
| Configuration | Time [s] | Speed-up vs baseline |
| --- | --- | --- |
| MKL (no AVX512) | 2367 | 1.00× |
| MKL (+ AVX512) | 2017 | 1.17× |
Recommendation: always enable AVX512. The baseline DFT case runs 17 % faster with libfakeintel.
Build configuration (AOCL)
AOCL (AMD Optimized Libraries) is AMD's analogue to MKL, providing:
- AMDBLIS (BLAS implementation)
- AMDlibFLAME (LAPACK)
- AMDScaLAPACK, AMDFFTW optimised for AMD EPYC
- built with AOCC compiler
Build example:
spack install vasp@6.4.3 +openmp +fftlib %aocc ^amdfftw@5 ^amdblis@5 threads=openmp ^amdlibflame@5 ^amdscalapack@5 ^openmpi
AOCL detects AMD micro-architecture automatically and therefore does not require libfakeintel.
Baseline: MPI-only, 1 node
| Configuration | Time [s] | Speed-up vs baseline |
| --- | --- | --- |
| MKL (+ AVX512) | 2017 | 1.00× |
| AOCL (AMD BLIS / libFLAME) | 1919 | 1.05× |
The AOCL build is another 5 % faster than MKL with AVX512 enabled.
Hybrid parallelisation and NUMA domains
Each compute node has two EPYC 9254 CPUs with 24 cores each (48 total).
Each CPU is subdivided into 4 NUMA domains with separate L3 caches and memory controllers.
- MPI-only: 48 ranks per node (1 per core).
- Hybrid L3: 8 MPI ranks × 6 OpenMP threads each, bound to individual L3 domains.
This L3-hybrid layout increases memory locality, because each rank mainly uses its own local memory and avoids cross-socket traffic.
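The NUMA topology described above can be verified on a compute node with standard Linux tools:

```shell
lscpu | grep -i numa     # NUMA node count and the CPU lists per node
numactl --hardware       # per-domain CPUs and memory sizes
```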
Single-node hybrid results (Si256)
| Configuration | Time [s] | Speed-up vs MPI-only |
| --- | --- | --- |
| MKL (L3 hybrid) | 1936 | 1.04× |
| AOCL (L3 hybrid) | 1830 | 1.05× |
Hybrid L3 adds a modest 4-5 % speed-up.
Multi-node scaling (Si256)
| Configuration | Nodes | Time [s] | Speed-up vs 1-node baseline |
| --- | --- | --- | --- |
| MKL MPI-only | 2 | 1305 | 1.55× |
| AOCL MPI-only | 2 | 1142 | 1.68× |
| MKL L3 hybrid | 2 | 1147 | 1.69× |
| AOCL L3 hybrid | 2 | 968 | 1.89× |
Interpretation
AOCL shows the strongest scaling across nodes; MKL's hybrid variant catches up in scaling compared to its MPI-only counterpart.
The L3-hybrid layout maintains efficiency even in the multi-node regime.
Recommendations for DFT / Hybrid-DFT workloads
- AOCL generally outperforms MKL (+AVX512) on AMD EPYC.
- Prefer L3-Hybrid (8Γ6) on single-node and even multi-node jobs for FFT-heavy hybrid-DFT cases.
- For pure MPI runs, both MKL (+AVX512) and AOCL scale well; AOCL slightly better.
- Always preload libfakeintel2025.so if MKL is used.
Jobscript examples
AOCL – Hybrid L3 (8×6)
#!/bin/bash
#SBATCH -J vasp_aocl_l3hyb
#SBATCH -N 1
#SBATCH --ntasks=8
#SBATCH --cpus-per-task=6
#SBATCH -p cpu
#SBATCH -t 48:00:00
#SBATCH --exclusive
module purge
module load vasp-aocl
export OMP_NUM_THREADS=6
export OMP_PLACES=cores
export OMP_PROC_BIND=close
export BLIS_NUM_THREADS=6
mpirun -np 8 --bind-to l3 --report-bindings vasp_std
MKL (+AVX512) – Hybrid L3 (8×6)
#!/bin/bash
#SBATCH -J vasp_mkl_avx512_l3hyb
#SBATCH -N 1
#SBATCH --ntasks=8
#SBATCH --cpus-per-task=6
#SBATCH -p cpu
#SBATCH -t 48:00:00
#SBATCH --exclusive
module purge
module load vasp-mkl
export LD_PRELOAD=/lib64/libfakeintel2025.so:${LD_PRELOAD}
export OMP_NUM_THREADS=6
export OMP_PLACES=cores
export OMP_PROC_BIND=close
export MKL_NUM_THREADS=6
export MKL_DYNAMIC=FALSE
mpirun -np 8 --bind-to l3 --report-bindings vasp_std
Test case 2 – XAS (Core-level excitation)
The XAS Mn-in-ZnO case models a core-level excitation (X-ray Absorption Spectroscopy).
These workloads are not FFT-dominated; instead they involve many unoccupied bands and projector evaluations.
Single-node results (XAS)
| Configuration | Time [s] | Relative |
| --- | --- | --- |
| MKL MPI-only | 897 | 1.00× |
| AOCL MPI-only | 905 | 0.99× |
| MKL L3 hybrid | 1202 | 0.75× |
| AOCL L3 hybrid | 1137 | 0.79× |
Multi-node scaling (XAS)
| Configuration | Nodes | Time [s] | Relative |
| --- | --- | --- | --- |
| MKL MPI-only | 2 | 1333 | 0.67× |
| AOCL MPI-only | 2 | 1309 | 0.69× |
| MKL L3 hybrid | 2 | 1366 | 0.66× |
| AOCL L3 hybrid | 2 | 1351 | 0.67× |
Interpretation
For core-level / XAS calculations, hybrid OpenMP parallelisation is counter-productive, and scaling beyond one node deteriorates performance due to load imbalance and communication overhead.
Recommendations for XAS and similar workloads
- Use MPI-only and single-node configuration.
- MKL and AOCL perform identically within margin of error.
- Hybrid modes reduce efficiency and should be avoided.
- Set OMP_NUM_THREADS=1 to avoid unwanted OpenMP activity.
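Mirroring the hybrid job scripts above, an MPI-only single-node script for XAS-type runs could look like this (a sketch; the vasp-aocl module name and the 48-core node layout are taken from the sections above):

```shell
#!/bin/bash
#SBATCH -J vasp_aocl_xas
#SBATCH -N 1
#SBATCH --ntasks=48
#SBATCH --cpus-per-task=1
#SBATCH -p cpu
#SBATCH -t 48:00:00
#SBATCH --exclusive

module purge
module load vasp-aocl

# MPI-only: one rank per core, no OpenMP threading
export OMP_NUM_THREADS=1

mpirun -np 48 --bind-to core --report-bindings vasp_std
```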
General guidance
For optimal performance on Elysium with AMD EPYC processors, we recommend the AOCL build as the default choice for all VASP workloads. AOCL consistently outperforms or matches MKL (+AVX512) across the tested scenarios (e.g., 5 % faster for Si256 single-node, up to 1.89× speed-up for multi-node scaling) and does not require additional configuration like libfakeintel. However, MKL remains a robust alternative, especially for users requiring compatibility with existing workflows.
| Workload type | Characteristics | Recommended setup |
| --- | --- | --- |
| Hybrid DFT (HSE06, PBE0, etc.) | FFT + dense BLAS, OpenMP beneficial | AOCL L3 Hybrid (8×6) |
| Standard DFT (PBE, LDA) | light BLAS, moderate FFT | AOCL L3 Hybrid or MPI-only |
| Core-level / XAS / EELS | many unoccupied bands, projectors | AOCL MPI-only (single-node) |
| MD / AIMD (>100 atoms) | large FFTs per step | AOCL L3 Hybrid |
| Static small systems (<20 atoms) | few bands, small matrices | AOCL MPI-only |
Recommendations:
- Default to AOCL: Use the AOCL build for all workloads unless specific constraints (e.g., compatibility with Intel-based tools) require MKL.
- AVX512 for MKL: If using MKL, always preload libfakeintel2025.so to enable AVX512 optimizations.
- Benchmark if unsure: Test both MPI-only and L3 Hybrid on one node to determine the optimal configuration for your specific system.