habrok:examples:llms — last modified 2026/02/26 13:23 by fokke
===== Running LLMs =====
  
If you want to run a Large Language Model (LLM) on Habrok, here's one possible and relatively easy way to do it. Note that the versions are recent as of 26 February 2026.
  
==== Installation ====
  
1. Log in with your account on Habrok on an interactive node for the installation procedure.
<code>ssh pnumber@interactive1.hb.hpc.rug.nl</code>
  
2. Since the vllm installation packages require a newer glibc than our operating system provides, we will switch to the EESSI software stack. This provides a compatibility layer with a newer glibc.
<code>module load EESSI/2025.06</code>

3. Load the Python module in the version you would like to use:
<code>module load Python/3.13.5-GCCcore-14.3.0</code>
  
4. Create a virtual environment (only once):
<code>python3 -m venv .env</code>

5. Activate the virtual environment:
<code>source .env/bin/activate</code>
  
6. Upgrade ''pip'' and ''wheel'' (optional):
<code>pip install --upgrade pip wheel</code>
  
7. Install ''vllm'' (you can also specify a version):
<code>pip install vllm</code>
This might take a while the first time.
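As an optional sanity check after step 7 (a suggestion, not part of the official procedure), you can confirm from Python that the package is importable inside the activated venv:

```python
import importlib.util


def is_installed(name: str) -> bool:
    """Return True if the named package can be found by the current interpreter."""
    return importlib.util.find_spec(name) is not None


# Inside the activated .env this should print True once vllm is installed
print(is_installed("vllm"))
```

If this prints ''False'', double-check that the venv is activated and that the ''pip install'' step completed without errors.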
  
==== Running through an interactive job ====

1. Start an interactive job on an A100 node (single GPU) to be able to run the software:
<code>srun --nodes=1 --ntasks=1 --partition=gpushort --mem=120G --time=04:00:00 --gres=gpu:a100:1 --pty bash</code>

2. Switch to the EESSI software stack:
<code>module load EESSI/2025.06</code>

3. Load the Python module you used for installation:
<code>module load Python/3.13.5-GCCcore-14.3.0</code>

4. Activate the venv you created earlier:
<code>source .env/bin/activate</code>

5. Run ''vllm'' with the appropriate parameters (these are some examples):
<code>export HF_HOME=/tmp && vllm serve neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a16 --download-dir /tmp/models --max-model-len 1024 --gpu-memory-utilization 0.95 --port 8192</code>
Explanations of some of the parameters: