Unleashing the Power of AI: Running Large Language Models on Your Own Cloud Server (DigitalOcean)
Sonam Choeda
Posted on October 30, 2024
Introduction
In the rapidly evolving world of artificial intelligence, large language models (LLMs) have become increasingly accessible to developers and enthusiasts. This blog post will guide you through setting up and running an LLM with Ollama, a tool for running models locally, on a cloud-based Linux server using DigitalOcean’s Droplets.
Why Run Your Own LLM?
Running your own LLM offers several advantages:
- Complete control over your AI model
- Enhanced privacy and data security
- Customization possibilities
- Can be cost-effective for sustained, long-term use
Setting Up Your Cloud Server
We’ll be using DigitalOcean’s Droplets for this tutorial. Here’s a quick overview of the setup process:
- Create a DigitalOcean account (https://www.digitalocean.com/)
- Choose a Droplet configuration (Ubuntu recommended)
- Select appropriate resources (at least 8GB of RAM for 7B-parameter models; larger models need more)
- Set up authentication (password or SSH key; an SSH key is the safer choice)
- Launch your Droplet
Connecting to Your Droplet
Once your Droplet is running, you’ll need to connect to it via SSH. Use the following command in your terminal:
ssh root@your_droplet_ip_address
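Once you are logged in, it can be worth confirming that the Droplet really has the memory you selected. The following is a minimal sketch, not part of the original setup: the function names `mem_total_gib` and `has_enough_memory` are made up for this post, and the optional file argument exists only to make the functions easy to test. The value is rounded to the nearest GiB because the kernel reports slightly less than the nominal size.

```shell
#!/bin/sh
# Print total memory in GiB, rounded to the nearest whole number.
# Reads a meminfo-style file (hypothetical optional argument; defaults to /proc/meminfo).
mem_total_gib() {
    meminfo="${1:-/proc/meminfo}"
    # MemTotal is reported in kB; 1048576 kB = 1 GiB. Adding 0.5 rounds to nearest.
    awk '/^MemTotal:/ { printf "%d\n", ($2 / 1048576) + 0.5 }' "$meminfo"
}

# Succeed if total memory is at least the required number of GiB (default 8).
has_enough_memory() {
    required="${1:-8}"
    total="$(mem_total_gib "${2:-/proc/meminfo}")"
    [ "${total:-0}" -ge "$required" ]
}

# Only report when /proc/meminfo is available (i.e. on Linux).
if [ -r /proc/meminfo ]; then
    echo "Total memory: $(mem_total_gib) GiB"
fi
```

Running `has_enough_memory 8 && echo "OK for 7B models"` on the Droplet gives a quick yes/no answer before you download anything.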
Installing Ollama
Ollama is an easy-to-use framework for running LLMs. To install it, run this command:
curl -fsSL https://ollama.com/install.sh | sh
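After the installer finishes, a quick sanity check can save some head-scratching later. This is a small sketch under the assumption that the installer placed an `ollama` binary on your PATH; the function name `check_ollama` is made up for this post.

```shell
#!/bin/sh
# Report whether the ollama binary is on PATH and, if so, print its version.
check_ollama() {
    if command -v ollama >/dev/null 2>&1; then
        ollama --version
    else
        echo "ollama not found on PATH" >&2
        return 1
    fi
}
```

Calling `check_ollama` should print a version line if the install worked; if it reports that the binary is missing, opening a new shell session or re-running the installer is a reasonable first thing to try.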
Running Your First LLM
With Ollama installed, you can now run an LLM. For example, to run the Llama 2 model (the first run downloads the model, which can take a few minutes):
ollama run llama2
Interacting with Your LLM
Once the model is loaded, you can start interacting with it by typing prompts. For example:
“Why is the sky blue?”
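The interactive session can be closed by typing /bye. If you would rather ask a single question and get back to your shell, `ollama run` also accepts the prompt as an argument. Here is a minimal sketch of that idea; the `ask` function and the `OLLAMA_MODEL` variable are conveniences invented for this post, not something Ollama itself defines.

```shell
#!/bin/sh
# Send one prompt to a model and print the reply, without entering the interactive session.
# OLLAMA_MODEL is a hypothetical variable for this sketch; it defaults to llama2.
ask() {
    ollama run "${OLLAMA_MODEL:-llama2}" "$*"
}
```

For example, `ask "Why is the sky blue?"` sends the same question as above, and `OLLAMA_MODEL=mistral ask "Why is the sky blue?"` would try it against a different model, assuming that model is available to you.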
Running Ollama as a Server
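When the Ollama server is running, it listens on port 11434 by default, and you can reach it from the Droplet itself at http://localhost:11434. On Linux, the install script normally registers Ollama as a systemd service, so the server may already be running in the background (check with systemctl status ollama). If it is not, or if you prefer to manage the process yourself, you can use a process manager like PM2 to keep Ollama running continuously, even after you’ve logged out. Here’s how to set it up:
- Install PM2 if you haven’t already (this requires Node.js and npm):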
npm install pm2 -g
- Start the Ollama server under PM2, replacing <name> with a process name of your choice:
pm2 start "ollama serve" -n <name>
- Ensure PM2 starts on system reboot and restores the saved process list:
pm2 startup systemd
pm2 save
Now Ollama will run continuously as a server, allowing you to interact with it even after closing your SSH session.
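With the server running, other programs can talk to it over HTTP. The sketch below posts a prompt to Ollama's /api/generate endpoint with curl. Treat it as an illustration under a few assumptions: `OLLAMA_URL`, `generate_payload`, and `generate` are names made up for this post, and the payload builder is deliberately naive, so prompts containing double quotes or backslashes would need proper JSON escaping.

```shell
#!/bin/sh
# Base URL of the Ollama server (hypothetical variable; default matches the address above).
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Build the JSON body for a non-streaming /api/generate request.
# Naive on purpose: does not escape quotes or backslashes in the prompt.
generate_payload() {
    printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

# POST a prompt to the server and print the raw JSON response.
generate() {
    curl -fsS "$OLLAMA_URL/api/generate" \
        -H 'Content-Type: application/json' \
        -d "$(generate_payload "$1" "$2")"
}
```

On the Droplet, `generate llama2 "Why is the sky blue?"` should return a JSON object containing the model's answer. Because the server listens on localhost by default, one way to reach it from your own machine without exposing the port is an SSH tunnel: run ssh -L 11434:localhost:11434 root@your_droplet_ip_address and then use the same function locally.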
Conclusion
Setting up and running your own LLM on a cloud server opens up a world of possibilities for AI experimentation and development. As you become more comfortable with the process, you can explore different models, fine-tune them for specific tasks, or even create your own AI-powered applications.
Next Steps
Consider exploring:
- Different LLM models available through Ollama
- Fine-tuning models for specific use cases
- Integrating your LLM into other applications or services