Differences
This shows you the differences between two versions of the page.
| Next revision | Previous revision | ||
| habrok:examples:llms [2025/03/05 12:06] – created camarocico | habrok:examples:llms [2026/02/26 13:23] (current) – fokke | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| - | ===== Running LLMs ===== | + | ===== Running LLMs ===== |
| - | If you want to run a Large Language Model (LLM) on Habrok, here's one possible and relatively easy way to do it. | + | If you want to run a Large Language Model (LLM) on Habrok, here's one possible and relatively easy way to do it. Note that the versions are recent as of 26 February 2026. |
| + | |||
| + | ==== Installation ==== | ||
| + | |||
| + | 1. Log in to Habrok with your account, on an interactive node, for the installation procedure. | ||
| + | < | ||
| + | |||
| + | 2. Since the vllm installation packages require a newer glibc than our operating system provides, we will switch to the EESSI software stack. This provides a compatibility layer with a newer glibc. | ||
| + | < | ||
| + | |||
| + | 3. Load the Python module in the version you would like to use: | ||
| + | < | ||
| - | - Login with your account on Habrok (obviously).< | ||
| - | 2. Start an interactive job on an A100 node (single GPU): | ||
| - | | ||
| - | srun --nodes=1 --ntasks=1 --partition=gpushort --mem=120G --time=04: | ||
| - | ``` | ||
| - | 3. Load the Python and CUDA modules: | ||
| - | | ||
| - | | ||
| - | ``` | ||
| 4. Create a virtual environment (only once): | 4. Create a virtual environment (only once): | ||
| - | | + | < |
| - | python3 -m venv .env | + | |
| - | ``` | + | |
| 5. Activate the venv: | 5. Activate the venv: | ||
| - | | + | < |
| - | source .env/ | + | |
| - | ``` | + | 6. Upgrade |
| - | 6. Upgrade | + | < |
| - | ```bash | + | |
| - | pip install --upgrade pip | + | 7. Install '' |
| - | ``` | + | < |
| + | This might take a while the first time. | ||
| + | |||
| + | ==== Running through an interactive job ==== | ||
| + | |||
| + | 1. Start an interactive job on an A100 node (single GPU) to be able to run the software: | ||
| + | < | ||
| + | |||
| + | 2. Switch to the EESSI software stack: | ||
| + | < | ||
| + | |||
| + | 3. Load the Python module you used for installation: | ||
| + | < | ||
| + | |||
| + | 4. Activate the venv you created earlier: | ||
| + | < | ||
| + | |||
| + | 5. Run '' | ||
| + | < | ||
| + | Explanations of some of the parameters: | ||
| + | * '' | ||
| + | * The model is '' | ||
| + | * '' | ||
| + | * '' | ||
| + | |||
| + | Once '' | ||
| + | < | ||
| + | |||
| + | You can then test that it is working with: | ||
| + | < | ||
| + | |||
| + | and you should get something like: | ||
| + | |||
| + | < | ||
| + | |||
| + | or you can go to '' | ||
| + | |||
| + | < | ||
| + | " | ||
| + | " | ||
| + | { | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | { | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | } | ||
| + | ] | ||
| + | } | ||
| + | ] | ||
| + | }</ | ||
| + | |||
| + | ==== Running Ollama in a jobscript ==== | ||
| + | |||
| + | The following code can be used in a jobscript to run an Ollama model: | ||
| + | |||
| + | < | ||
| + | # Load the Ollama module | ||
| + | # GPU node | ||
| + | module load ollama/ | ||
| + | # CPU node | ||
| + | # module load ollama/ | ||
| + | |||
| + | # Use /scratch for storing models | ||
| + | export OLLAMA_MODELS=/ | ||
| + | |||
| + | # Start the Ollama server in the background, log all its output to ollama-serve.log | ||
| + | ollama serve >& ollama-serve.log & | ||
| + | # Wait a few seconds to make sure that the server has started | ||
| + | sleep 5 | ||
| + | |||
| + | # Run the model | ||
| + | echo "Tell me something about Groningen" | ||
| + | # Kill the server process | ||
| + | pkill -u $USER ollama | ||
| + | </ | ||
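
The jobscript above waits a fixed five seconds for the Ollama server to come up, which may be too short on a busy node. As a minimal sketch, the fixed ''sleep 5'' could be replaced by a small polling helper. ''wait_for'' is a hypothetical function written for this example; it is not part of Ollama or of the Habrok environment. The commented usage line assumes that ''ollama list'' only succeeds once the server is accepting requests.

```bash
#!/bin/bash

# wait_for TIMEOUT COMMAND [ARGS...]
# Hypothetical helper: retry COMMAND once per second until it succeeds,
# giving up after roughly TIMEOUT seconds.
# Returns 0 on success and 1 if the timeout was reached.
wait_for() {
    local timeout=$1
    shift
    local waited=0
    until "$@" > /dev/null 2>&1; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Possible use in the jobscript, in place of "sleep 5".
# Assumption: "ollama list" fails until the server is ready.
# wait_for 60 ollama list || { echo "Ollama server did not start" >&2; exit 1; }
```

If the server has not answered within the timeout, the job stops early with an error message instead of sending a prompt to a server that is not running.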