Run Large Language Models Locally using ollama

Vaibhav Phutane
3 min readApr 24, 2024

--

What is ollama?

In the dynamic realm of AI, imagine you’re eager to test various open-source Large Language Models (LLMs) quickly for your specific needs. Ollama steps in to assist by enabling you to run these open-source LLMs directly on your local machine with a few simple steps. Ollama is an AI tool designed to enable users to set up and execute large language models like Llama 3 locally.

Installation

Download ollama from https://ollama.com/ official website and install ollama cli.

Available commands

 serve       Start ollama
create Create a model from a Modelfile
show Show information for a model
run Run a model
pull Pull a model from a registry
push Push a model to a registry
list List models
cp Copy a model
rm Remove a model
help Help about any command

Run Model Locally

In this blog, we will download and run the llama3 8b model locally.

Step 1:

ollama pull llama3:8b

Step 2:

ollama run llama3:8b

Now you are all set to infer the Model!

CLI

Rest API

curl -X POST http://localhost:11434/api/generate -d '{
"model": "llama3",
"prompt":"Why is the sky blue?"
}'

Node

npm install ollama

import ollama from 'ollama'

const response = await ollama.chat({
model: 'llama2',
messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
console.log(response.message.content)

Python

pip install ollama

import ollama
response = ollama.chat(model='llama2', messages=[
{
'role': 'user',
'content': 'Why is the sky blue?',
},
])
print(response['message']['content'])

LangChain

from langchain_community.llms import Ollama
llm = Ollama(model="llama3")
llm.invoke("Why is the sky blue?")

LlamaIndex

from llama_index.llms.ollama import Ollama
llm = Ollama(model="llama3")
llm.complete("Why is the sky blue?")

Customizations

Models from the Ollama library can be customized with a prompt. For example, to customize the llama3 model:

ollama pull llama3

Create a Modelfile:

FROM llama3

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system message
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""

Next, create and run the model:

ollama create mario -f ./Modelfile
ollama run mario
>>> Who are you?
Hello! It's your friend Mario.

Available Models

Source: https://ollama.com/library

Useful Links to Begin

Ollama Github: https://github.com/ollama/ollama

Ollama with langChain:

https://python.langchain.com/docs/integrations/llms/ollama

https://python.langchain.com/docs/integrations/chat/ollama

--

--