How to get an LLM up & running locally - in 5 minutes

hayerhans

Posted on March 23, 2024

Video Version:
https://youtube.com/shorts/y0NWVUsfLiU?si=x16bKEoHLfk87nC2

What is Ollama?

Ollama is a lightweight framework for anyone who wants to experiment with, customize, and deploy large language models without the hassle of cloud platforms. It packages everything needed to run a model into a simple, local install, so developers and hobbyists alike can explore what these models can do on their own machine.

Setting Up Ollama: A Step-by-Step Approach

First, download Ollama for your OS here:
https://ollama.com/download

Second, run the model you want (the first run downloads the model if it isn't already on your machine):

ollama run llama2
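
Once a model is running, you can also talk to it from code instead of the terminal. Here is a minimal Python sketch, assuming Ollama is serving its local HTTP API on the default address http://localhost:11434 and that llama2 has already been downloaded. The helper names build_request and ask are my own, not part of Ollama.

```python
import json
import urllib.request

# Assumption: Ollama's local server is listening on its default address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a locally served model."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]
```

With Ollama running in the background, calling ask("llama2", "Why is the sky blue?") should return the model's answer as a string.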

Model library

Ollama supports the models listed at ollama.com/library

Here are some example models that can be downloaded:

| Model | Parameters | Size | Download Command |
|---|---|---|---|
| Llama 2 | 7B | 3.8GB | ollama run llama2 |
| Mistral | 7B | 4.1GB | ollama run mistral |
| Dolphin Phi | 2.7B | 1.6GB | ollama run dolphin-phi |
| Phi-2 | 2.7B | 1.7GB | ollama run phi |
| Neural Chat | 7B | 4.1GB | ollama run neural-chat |
| Starling | 7B | 4.1GB | ollama run starling-lm |
| Code Llama | 7B | 3.8GB | ollama run codellama |
| Llama 2 Uncensored | 7B | 3.8GB | ollama run llama2-uncensored |
| Llama 2 13B | 13B | 7.3GB | ollama run llama2:13b |
| Llama 2 70B | 70B | 39GB | ollama run llama2:70b |
| Orca Mini | 3B | 1.9GB | ollama run orca-mini |
| Vicuna | 7B | 3.8GB | ollama run vicuna |
| LLaVA | 7B | 4.5GB | ollama run llava |
| Gemma | 2B | 1.4GB | ollama run gemma:2b |
| Gemma | 7B | 4.8GB | ollama run gemma:7b |

Memory Requirements:
Keep in mind that running these models isn't light on resources. You should have at least 8 GB of RAM available for the 7B models, 16 GB for the 13B models, and more for anything larger.
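
To get a feel for where these numbers come from, here is a rough back-of-the-envelope sketch in Python. It assumes the default downloads are quantized to about 4.5 bits per weight. That figure is my assumption, not something Ollama states here, but it lands close to the sizes in the model list. The model file has to fit in memory alongside your OS and the model's working state, which is why the RAM recommendation is roughly double the download size.

```python
# Assumption: default library tags are quantized to about 4.5 bits per weight.
BITS_PER_WEIGHT = 4.5


def estimated_size_gb(params_billion: float, bits_per_weight: float = BITS_PER_WEIGHT) -> float:
    """Approximate model file size in GB: parameters x bits per weight / 8."""
    return params_billion * bits_per_weight / 8


# Download sizes from the model list, keyed by nominal parameter count (billions).
listed_gb = {7: 3.8, 13: 7.3, 70: 39.0}

for params, listed in listed_gb.items():
    print(f"{params}B: estimated {estimated_size_gb(params):.1f} GB, listed {listed} GB")
```

The estimate is only a sanity check: real files also carry metadata, and some models are packaged differently, so treat it as a ballpark.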

Customization

With Ollama, you're not just running models; you're tailoring them. You can import models and customize their prompts to fit your specific needs. Fancy a model that responds as Mario? It only takes a few simple commands:

Customize a prompt

Models from the Ollama library can be customized with a prompt. For example, to customize the llama2 model:

ollama pull llama2

Create a Modelfile:

FROM llama2

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system message
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""

Next, create and run the model:

ollama create mario -f ./Modelfile
ollama run mario

>>> hi
Hello! It's your friend Mario.
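
If you want to spin up several personas, the Modelfile and create steps above can be scripted. Here is a minimal Python sketch, assuming the ollama command is on your PATH. The function names render_modelfile and create_model are hypothetical helpers of mine, and the only Modelfile instructions used are the three shown above (FROM, PARAMETER, SYSTEM).

```python
import subprocess
from pathlib import Path


def render_modelfile(base: str, temperature: float, system: str) -> str:
    """Return Modelfile text with the FROM, PARAMETER and SYSTEM instructions."""
    return (
        f"FROM {base}\n"
        f"PARAMETER temperature {temperature}\n"
        f'SYSTEM """\n{system}\n"""\n'
    )


def create_model(name: str, modelfile_text: str, path: Path = Path("Modelfile")) -> None:
    """Write the Modelfile to disk and register it with `ollama create`."""
    path.write_text(modelfile_text, encoding="utf-8")
    # Same command as above: ollama create <name> -f <Modelfile>
    subprocess.run(["ollama", "create", name, "-f", str(path)], check=True)
```

For example, create_model("mario", render_modelfile("llama2", 1, "You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.")) should leave you with the same mario model, ready for ollama run mario.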


If you liked this content, also have a look at my YouTube channel.
