Ollama

Ollama is a desktop application that lets you download and run models locally.

Running tools locally may require additional GPU resources depending on the model you are using.

Use the ollama provider to access Ollama models.

  1. Start the Ollama application, or run the server from the command line:

    ollama serve
  2. Update your script to use the ollama:phi3.5 model (or any other model, including models from Hugging Face).

    script({
        ...,
        model: "ollama:phi3.5",
    })

    GenAIScript will automatically pull the model, which may take some time depending on the model size. The model is cached locally by Ollama; see the commands after this list if you prefer to pull or inspect models manually.

  3. If Ollama runs on a remote server, a different computer, or a different port, configure the OLLAMA_HOST environment variable so GenAIScript can connect to it (a quick connectivity check is shown after this list).

    .env
    OLLAMA_HOST=https://<IP or domain>:<port>/ # server url
    OLLAMA_HOST=0.0.0.0:12345 # different port
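
If you prefer to fetch a model ahead of time, or to check which models Ollama has already cached, you can use the Ollama CLI directly (phi3.5 is just the example model from step 2; substitute the model your script uses):

ollama pull phi3.5
ollama list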
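
To verify that a remote Ollama server configured through OLLAMA_HOST is reachable, you can query its model listing endpoint with curl; replace the host and port with your own values:

curl http://<IP or domain>:<port>/api/tags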

You can specify the model size by adding the size to the model name, like ollama:llama3.2:3b.

script({
    ...,
    model: "ollama:llama3.2:3b",
})

You can also use GGUF models from Hugging Face.

script({
    ...,
    model: "ollama:hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF",
})

You can conveniently run Ollama in a Docker container.

If you are working in a dev container or GitHub Codespaces, this typically means enabling the docker-in-docker feature in your container configuration:

{
    "features": {
        "docker-in-docker": "latest"
    }
}

  • start the Ollama container
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
  • stop and remove the Ollama container
docker stop ollama && docker rm ollama
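
  • pull a model into the running container (optional; the ollama/ollama image bundles the Ollama CLI, and phi3.5 is just the example model used earlier)
docker exec -it ollama ollama pull phi3.5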

Aliases

The following model aliases are attempted by default in GenAIScript.

Alias         Model identifier
embeddings    nomic-embed-text
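
Since GenAIScript attempts this alias by default, you can make sure the underlying model is already cached by pulling it ahead of time with the Ollama CLI (the model name comes straight from the table above):

ollama pull nomic-embed-text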

Limitations