# Using Ant

## Login

There is 1 login node:
| Hostname | Node type |
|---|---|
| | Ant login node |
Host key fingerprints:

| Algorithm | Fingerprint (SHA256) |
|---|---|
| RSA | |
| ECDSA | |
| ED25519 | |
## Building software

### ESPResSo

Release 4.3:
```bash
# last update: March 2024
module load spack/default gcc/12.3.0 cuda/12.3.0 openmpi/4.1.6 \
    fftw/3.3.10 boost/1.83.0 cmake/3.27.9 python/3.12.1
git clone --recursive --branch python --origin upstream \
    https://github.com/espressomd/espresso.git espresso-4.3
cd espresso-4.3
python3 -m venv venv
source venv/bin/activate
python3 -m pip install -c "requirements.txt" numpy scipy vtk h5py setuptools cython==3.0.6
mkdir build
cd build
cp ../maintainer/configs/maxset.hpp myconfig.hpp
sed -i "/ADDITIONAL_CHECKS/d" myconfig.hpp
cmake .. -D CMAKE_BUILD_TYPE=Release -D ESPRESSO_BUILD_WITH_CCACHE=OFF \
    -D ESPRESSO_BUILD_WITH_CUDA=ON -D CMAKE_CUDA_ARCHITECTURES="86" \
    -D CUDAToolkit_ROOT="${CUDA_HOME}" \
    -D ESPRESSO_BUILD_WITH_WALBERLA=ON -D ESPRESSO_BUILD_WITH_WALBERLA_AVX=ON \
    -D ESPRESSO_BUILD_WITH_SCAFACOS=OFF -D ESPRESSO_BUILD_WITH_HDF5=OFF
make -j 64
SITE_PACKAGES=$(python3 -c 'import sysconfig; print(sysconfig.get_path("platlib"))')
echo $(realpath ./src/python) > "${SITE_PACKAGES}/espresso.pth"
deactivate
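The last two commands register the build tree with the virtual environment through a `.pth` file: any existing path listed in a `.pth` file inside a site directory is appended to `sys.path` when that directory is processed at interpreter startup. A minimal, self-contained sketch of that mechanism (using a temporary directory in place of the real venv's site-packages):

```python
import os
import site
import sys
import tempfile

# A .pth file in a site directory lists extra paths, one per line; Python
# appends each path that exists to sys.path. This is what
# "${SITE_PACKAGES}/espresso.pth" does for the build tree's src/python.
with tempfile.TemporaryDirectory() as sitedir:
    target = os.path.join(sitedir, "build", "src", "python")
    os.makedirs(target)
    with open(os.path.join(sitedir, "espresso.pth"), "w") as f:
        f.write(target + "\n")
    site.addsitedir(sitedir)  # same .pth processing site-packages gets
    print("registered:", target in sys.path)
```

After this, `import espressomd` resolves to the build tree without installing the package into the venv.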
## Loading software

With EESSI:

```console
# last update: October 2024
[user@ant ~]$ source /cvmfs/software.eessi.io/versions/2023.06/init/bash
{EESSI 2023.06} [user@ant ~]$ module load ESPResSo/4.2.2-foss-2023b
{EESSI 2023.06} [user@ant ~]$ module load pyMBE/0.8.0-foss-2023b
{EESSI 2023.06} [user@ant ~]$ python3 -c "import pyMBE"
{EESSI 2023.06} [user@ant ~]$ python3 -c "import espressomd"
```
With Spack:

```bash
# last update: October 2024
module load spack/default
module load gcc/12.3.0 openmpi/4.1.6 cuda/12.3.0
```
## Submitting jobs

**Caution:** the default walltime for jobs on Ant is 10 minutes. For longer jobs, explicitly set the walltime in your SLURM script. Similarly, the default RAM per allocated CPU is 2 GB; adapt your SLURM script if you require more memory:

```bash
#SBATCH --time=05:00:00       # for 5 hours
#SBATCH --mem-per-cpu=5G      # for 5 GB per allocated CPU
```
Batch command:

```bash
sbatch --job-name="test" --nodes=1 --ntasks=4 --mem-per-cpu=2GB job.sh
```
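Note that `--mem-per-cpu` is multiplied by the number of allocated CPUs, so the job's total request grows with `--ntasks`. A quick sanity check of that arithmetic, using the numbers from the command above (and assuming one CPU per task, the default without `--cpus-per-task`):

```python
# Memory is requested per allocated CPU: total = ntasks * mem_per_cpu.
ntasks = 4
mem_per_cpu_gb = 2
total_gb = ntasks * mem_per_cpu_gb
print(f"total request: {total_gb} GB")  # total request: 8 GB
```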
Job script:

```bash
#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --output %j.stdout
#SBATCH --error %j.stderr
module load spack/default gcc/12.3.0 cuda/12.3.0 openmpi/4.1.6 \
    fftw/3.3.10 boost/1.83.0 python/3.12.1
source espresso-4.3/venv/bin/activate
srun --cpu-bind=cores python3 espresso-4.3/testsuite/python/particle.py
deactivate
```
## Benchmarks

### Multi-GPU job

Run mpi4py on multiple nodes, one CPU per GPU, making only one GPU visible to each CPU.
Environment:

```bash
python3 -m venv venv
. venv/bin/activate
pip install mpi4py pycuda "numpy<2"
```
Launcher (`gpu_vis_wrapper`):

```bash
#!/bin/bash
CUDA_VISIBLE_DEVICES=$((${SLURM_PROCID} % ${SLURM_GPUS_ON_NODE})) "$@"
```
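The wrapper relies on `SLURM_PROCID` being the task's global rank: since tasks on a node get consecutive ranks, taking the rank modulo `SLURM_GPUS_ON_NODE` spreads them over the node's GPUs. A sketch of the mapping for a run with 2 tasks per node and 2 GPUs per node:

```python
# rank % gpus_on_node assigns consecutive ranks on a node to distinct GPUs.
def visible_gpu(proc_id: int, gpus_on_node: int) -> int:
    return proc_id % gpus_on_node

gpus_on_node = 2
mapping = {rank: visible_gpu(rank, gpus_on_node) for rank in range(4)}
print(mapping)  # {0: 0, 1: 1, 2: 0, 3: 1}
```

Ranks 0 and 1 (first node) see GPUs 0 and 1 respectively, and likewise ranks 2 and 3 on the second node.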
Executor (`list_cuda.py`):

```python
from mpi4py import MPI
import pycuda.driver as cuda
import os

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
cuda_visible_devices = os.environ.get("CUDA_VISIBLE_DEVICES", "")
host_name = os.environ.get("SLURMD_NODENAME", "")
print(f"{rank=} {host_name=} {cuda_visible_devices=}")
import pycuda.autoinit  # initializes CUDA after CUDA_VISIBLE_DEVICES is read
cuda.init()
device_count = cuda.Device.count()
print(f"{rank=} {host_name=} Number of CUDA devices available: {device_count}")
for i in range(device_count):
    device = cuda.Device(i)
    print(f"{rank=} {host_name=} Device {i}: {device.name()} - Memory: {device.total_memory() // (1024**2)} MB")
```
Output:

```console
$ srun --nodes=2 -J mpi4py --ntasks-per-node=2 --gres=gpu:2 --mem-per-cpu=100MB \
      --time=00:02:00 bash ./gpu_vis_wrapper python3 ./list_cuda.py
rank=0 host_name='compute02' cuda_visible_devices='0'
rank=0 host_name='compute02' Number of CUDA devices available: 1
rank=0 host_name='compute02' Device 0: NVIDIA L4 - Memory: 22478 MB
rank=1 host_name='compute02' cuda_visible_devices='1'
rank=1 host_name='compute02' Number of CUDA devices available: 1
rank=1 host_name='compute02' Device 0: NVIDIA L4 - Memory: 22478 MB
rank=2 host_name='compute03' cuda_visible_devices='0'
rank=2 host_name='compute03' Number of CUDA devices available: 1
rank=2 host_name='compute03' Device 0: NVIDIA L4 - Memory: 22478 MB
rank=3 host_name='compute03' cuda_visible_devices='1'
rank=3 host_name='compute03' Number of CUDA devices available: 1
rank=3 host_name='compute03' Device 0: NVIDIA L4 - Memory: 22478 MB
```