What is Ollama?
Ollama is a tool for running large language models locally on your own hardware. Unlike cloud-based APIs, Ollama gives you complete control over your models, your data privacy, and your costs. It supports popular open-source models such as Llama, Mistral, and CodeLlama.

Key features
- Privacy-first: All data stays on your infrastructure
- Cost-effective: No per-token API costs
- Offline operation: Works without internet connectivity
- Open-source models: Access to Llama 2, Mistral, CodeLlama, and more
- Full control: Manage model versions and configurations
Prerequisites
Before using Ollama functions, you need to:

- Install and run Ollama on your infrastructure
- Pull the models you want to use
- Ensure your database can access the Ollama host
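Once Ollama is running, you can confirm the last point from inside the database. A minimal sketch, assuming a pgai-style convention in which the functions live in an `ai` schema; the host URL is a placeholder for wherever your Ollama server runs:

```sql
-- Assumption: "ai" schema prefix (pgai-style); adjust to your setup.
-- The host URL is a placeholder for your Ollama server's address.
select * from ai.ollama_list_models(host => 'http://localhost:11434');
```

If the query returns one row per model you pulled, the database can reach the Ollama host.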
Quick start
Generate embeddings
Create vector embeddings using a local model:
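The sketch below assumes pgai-style conventions (functions in an `ai` schema, model name as the first argument); `all-minilm` is a placeholder for any embedding model you have pulled.

```sql
-- Assumption: "ai" schema and the "all-minilm" model name are placeholders.
-- Returns a vector embedding of the input text.
select ai.ollama_embed('all-minilm', 'the quick brown fox jumps over the lazy dog');
```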
Generate completions

Get text completions from a local model:
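Another sketch under the same assumptions. The JSON shape mirrors Ollama's generate API, where the completion text sits under a `response` key:

```sql
-- Assumption: "ai" schema and "llama3" model are placeholders.
-- The function returns a JSON document; ->> extracts the completion text.
select ai.ollama_generate(
    'llama3',
    'What is PostgreSQL? Answer in one sentence.'
)->>'response';
```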
Chat completion

Have a conversation with a local model:
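A sketch under the same assumptions. Messages follow Ollama's chat format of role/content objects, and the reply text is read from `message.content`:

```sql
-- Assumption: "ai" schema and "llama3" model are placeholders.
-- Messages are a JSON array of {role, content} objects, as in Ollama's chat API.
select ai.ollama_chat_complete(
    'llama3',
    jsonb_build_array(
        jsonb_build_object('role', 'system', 'content', 'You are a helpful assistant.'),
        jsonb_build_object('role', 'user', 'content', 'Name three uses of vector embeddings.')
    )
)->'message'->>'content';
```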
Available functions

Embeddings
- ollama_embed(): generate vector embeddings from text
Completions and chat
- ollama_generate(): generate text completions with optional images
- ollama_chat_complete(): multi-turn conversations with tool support
Model management
- ollama_list_models(): list all locally installed models
- ollama_ps(): show currently running models and their resource usage
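To inspect which models are currently loaded into memory, a sketch (schema prefix assumed as above):

```sql
-- Assumption: "ai" schema prefix; ollama_ps is set-returning.
select * from ai.ollama_ps();
```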
Configuration
All Ollama functions accept a host parameter that specifies the location of the Ollama server:
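For example, a sketch pointing ollama_embed() at a non-default server; the schema, model name, and URL are placeholders:

```sql
-- Assumption: "ai" schema, model name, and URL are placeholders for your deployment.
select ai.ollama_embed(
    'all-minilm',
    'hello world',
    host => 'http://ollama.internal:11434'
);
```

If host is omitted, implementations commonly fall back to http://localhost:11434, Ollama's standard address; check your deployment's configuration.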