====== Ollama on Habrok ======

You can run an LLM on Habrok with Ollama in a Jupyter environment by using the [[https://portal.hb.hpc.rug.nl/pun/sys/dashboard/batch_connect/sys/hb-ollama/session_contexts/new|Ollama (Jupyter)]] Interactive App on the [[https://portal.hb.hpc.rug.nl|Web Portal]].

===== Setting up the virtual environment =====

To be able to use the app, you first need to set up a Python virtual environment. The version of Ollama installed on Habrok is ''ollama/0.6.0-GCCcore-12.3.0'', which means that the virtual environment needs to be built with a Python version from the same ''GCCcore-12.3.0'' toolchain, i.e. ''Python/3.11.3-GCCcore-12.3.0''. Other versions of Python might work as well, but toolchain compatibility can sometimes be an issue.

<code bash>
module load Python/3.11.3-GCCcore-12.3.0
python3 -m venv $HOME/venvs/ollama
</code>

Once the virtual environment has been created, you need to install ''jupyter'' and ''ollama'' within it; optionally, you can also install additional packages, such as ''openai'':

<code bash>
source $HOME/venvs/ollama/bin/activate
pip install --upgrade pip
pip install jupyter ollama openai
</code>

Finally, to make sure that the Jupyter Notebook is aware of your virtual environment, you need to create a Jupyter kernel:

<code bash>
python3 -m ipykernel install --user --name=ollama --display-name="Ollama"
</code>

===== Choosing a folder for the models =====

Another important choice when running the app is where the Ollama models should be saved. There are two options, each with advantages and drawbacks:

  * **Custom directory**: This is a folder on the shared filesystem (we recommend ''/scratch/$USER'', since the models are quite large) where the models are downloaded and kept for future sessions. This way you only need to download a model once, but writing the model files to the shared filesystem takes quite a bit of time, and loading a model from there may also be slower.
  * **Temporary directory**: This is a folder on the local disk of the node running the job. Saving a downloaded model here is considerably faster, and using the model may also be a bit faster; the drawback is that this storage is not persistent, so the model files have to be downloaded again for each session.

===== Simple usage example =====

To use Ollama in the Jupyter app, you first need to open a new notebook and choose the **Ollama** Jupyter kernel created when setting up the virtual environment. Here is a small example which first imports the necessary packages:

<code python>
import os

import ollama
from openai import OpenAI
</code>

then downloads a model from Ollama:

<code python>
ollama.pull("gemma3:12b")
</code>

and lists all currently downloaded models:

<code python>
for model in ollama.list().models:
    print(model.model)
</code>

It then creates an OpenAI API client:

<code python>
client = OpenAI(
    base_url=f"http://{os.environ['OLLAMA_HOST']}/v1",
    api_key="ollama"
)
</code>

and interacts with the LLM:

<code python>
response = client.chat.completions.create(
    model="gemma3:12b",
    messages=[
        {"role": "system", "content": "You are a friendly dog"},
        {"role": "user", "content": "Would you like a bone?"}
    ]
)
print(response.choices[0].message.content)
</code>

The model can, if desired, be deleted:

<code python>
ollama.delete("gemma3:12b")
</code>

You can find more info on how to use the Ollama Python library on their [[https://github.com/ollama/ollama-python|GitHub page]].
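As an alternative to the OpenAI-compatible API, you can also chat with a model directly through the ''ollama'' Python library. The snippet below is a minimal sketch of this approach, reusing the ''gemma3:12b'' model pulled in the example above; it assumes that the library picks up the ''OLLAMA_HOST'' environment variable set by the app, just as it does for ''ollama.pull'' and ''ollama.list''.

<code python>
import ollama

# Chat directly via the ollama library; this assumes "gemma3:12b" has already
# been pulled as shown above and that OLLAMA_HOST points at the running server.
response = ollama.chat(
    model="gemma3:12b",
    messages=[
        {"role": "system", "content": "You are a friendly dog"},
        {"role": "user", "content": "Would you like a bone?"}
    ]
)
print(response["message"]["content"])
</code>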