Ollama
Ollama is a desktop application that lets you download and run models locally.
Running models locally may require additional GPU resources depending on the model you are using.
Use the `ollama` provider to access Ollama models.
Start the Ollama application or run it from a terminal:

```sh
ollama serve
```

Update your script to use the `ollama:phi3.5` model (or any other model, including one from Hugging Face):

```js
script({
    ...,
    model: "ollama:phi3.5",
})
```

GenAIScript will automatically pull the model, which may take some time depending on the model size. The model is cached locally by Ollama.
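For reference, a complete minimal script using this model might look like the sketch below; the title and prompt text are placeholders, not part of the provider configuration.

```js
// minimal GenAIScript sketch; the title and prompt are placeholders
script({
    title: "hello-ollama",
    model: "ollama:phi3.5",
})

$`Say "hello" in exactly one word.`
```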
If Ollama runs on a remote server, a different computer, or a different port, you have to configure the `OLLAMA_HOST` environment variable so GenAIScript can connect to it, for example in your `.env` file:

```txt
OLLAMA_HOST=https://<IP or domain>:<port>/ # server url
OLLAMA_HOST=0.0.0.0:12345 # different port
```
You can specify the model size by adding the size tag to the model name, like `ollama:llama3.2:3b`.

```js
script({
    ...,
    model: "ollama:llama3.2:3b",
})
```
Ollama with Hugging Face models
You can also use GGUF models from Hugging Face.

```js
script({
    ...,
    model: "ollama:hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF",
})
```
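If you prefer, you can pull the GGUF model ahead of time with the Ollama CLI; the repository name below matches the example above.

```sh
# pre-pull the Hugging Face GGUF model so the first script run starts faster
ollama pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
```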
Ollama with Docker
You can conveniently run Ollama in a Docker container.
- If you are using a devcontainer or a GitHub Codespace, make sure to add the `docker-in-docker` option to your `devcontainer.json` file.

  ```json
  {
      "features": {
          "docker-in-docker": "latest"
      }
  }
  ```
- Start the Ollama container (a note after this list shows how to pull a model inside it):

  ```sh
  docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
  ```
- Stop and remove the Ollama container:

  ```sh
  docker stop ollama && docker rm ollama
  ```
Aliases
The following model aliases are attempted by default in GenAIScript.
| Alias | Model identifier |
| --- | --- |
| `embeddings` | `nomic-embed-text` |
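If you plan to use the `embeddings` alias with Ollama, you can pull the corresponding model ahead of time so the first embedding call does not have to wait for a download.

```sh
# pre-pull the default embeddings model listed in the table above
ollama pull nomic-embed-text
```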
Limitations
- Uses the OpenAI compatibility layer
- `logit_bias` is ignored
- Prediction of output tokens is ignored