habrok:examples:llms [2026/02/26 13:23] (current) fokke
<code>srun --nodes=1 --ntasks=1 --partition=gpushort --mem=120G --time=04:00:00 --gres=gpu:a100:1 --pty bash</code>
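Once the interactive session starts, you can optionally confirm that a GPU was actually allocated to the job (this check is a suggestion, not part of the original steps):

<code>nvidia-smi</code>

This should list one A100 device; if it reports "No devices were found", re-check the ''--gres'' option of your ''srun'' command.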
  
2. Switch to the EESSI software stack
<code>module load EESSI/2025.06</code>

3. Load the Python module you used for installation
<code>module load Python/3.13.5-GCCcore-14.3.0</code>
  
4. Activate the venv you created earlier:
<code>source .env/bin/activate</code>
  
5. Run ''vllm'' with the appropriate parameters (these are some examples):
<code>export HF_HOME=/tmp && vllm serve neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a16 --download-dir /tmp/models --max-model-len 1024 --gpu-memory-utilization 0.95 --port 8192</code>
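Once the server reports that it is running, you can send it a quick test request from another shell on the same node. vLLM exposes an OpenAI-compatible HTTP API, so a minimal sketch (assuming the port 8192 and model name from the command above) looks like:

<code>curl http://localhost:8192/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a16", "prompt": "Hello, ", "max_tokens": 16}'</code>

A JSON response containing a ''choices'' array indicates the server is working.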
Explanations of some of the parameters: