Run Large Language Models Locally using ollama

3 min readApr 24, 2024

What is ollama?

In the dynamic realm of AI, imagine you’re eager to test various open-source Large Language Models (LLMs) quickly for your specific needs. Ollama steps in to assist by enabling you to run these open-source LLMs directly on your local machine with a few simple steps. Ollama is an AI tool designed to enable users to set up and execute large language models like Llama 3 locally.

Installation

Download ollama from https://ollama.com/ official website and install ollama cli.

Available commands

 serve       Start ollama
 create      Create a model from a Modelfile
 show        Show information for a model
 run         Run a model
 pull        Pull a model from a registry
 push        Push a model to a registry
 list        List models
 cp          Copy a model
 rm          Remove a model
 help        Help about any command

Run Model Locally

In this blog, we will download and run the llama3 8b model locally.

Step 1:

ollama pull llama3:8b

Step 2:

ollama run llama3:8b

Now you are all set to infer the Model!

CLI

Rest API

curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt":"Why is the sky blue?"
 }'

Node

npm install ollama

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'llama2',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
console.log(response.message.content)

Python

pip install ollama

import ollama
response = ollama.chat(model='llama2', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])
print(response['message']['content'])

LangChain

from langchain_community.llms import Ollama
llm = Ollama(model="llama3")
llm.invoke("Why is the sky blue?")

LlamaIndex

from llama_index.llms.ollama import Ollama
llm = Ollama(model="llama3")
llm.complete("Why is the sky blue?")

Customizations

Models from the Ollama library can be customized with a prompt. For example, to customize the llama3 model:

ollama pull llama3

Create a Modelfile:

FROM llama3

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system message
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""

Next, create and run the model:

ollama create mario -f ./Modelfile
ollama run mario
>>> Who are you?
Hello! It's your friend Mario.

Available Models

Useful Links to Begin

Ollama Github: https://github.com/ollama/ollama

Ollama with langChain:

https://python.langchain.com/docs/integrations/llms/ollama

https://python.langchain.com/docs/integrations/chat/ollama

Run Large Language Models Locally using ollama

Written by Vaibhav Phutane

No responses yet