Generate text completions using locally hosted Ollama models. Unlike chat completion, this function is designed for single-turn text generation with optional system prompts, images, and custom templates.

Samples

Generate a completion

Get a text completion from a local model:
SELECT ai.ollama_generate(
    'llama2',
    'Explain what PostgreSQL is in one sentence'
)->'response';
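
Note that the -> operator returns the response as a JSONB string, quotes included. Use ->> instead to get plain text:
SELECT ai.ollama_generate(
    'llama2',
    'Explain what PostgreSQL is in one sentence'
)->>'response';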

Use a system prompt

Set a system prompt to control the model’s behavior:
SELECT ai.ollama_generate(
    'llama2',
    'What is a database?',
    system_prompt => 'You are a helpful database expert. Give concise answers.'
)->'response';

Add context for continuation

Continue a previous generation using context:
-- First generation
WITH first_gen AS (
    SELECT ai.ollama_generate('llama2', 'Tell me about databases') AS result
)
-- Continue with context
SELECT ai.ollama_generate(
    'llama2',
    'Tell me more about PostgreSQL specifically',
    -- unpack the JSONB array; a direct ::text::int[] cast would fail
    context => (
        SELECT ARRAY(SELECT jsonb_array_elements_text(result->'context')::int)
        FROM first_gen
    )
)->'response';
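
To continue across separate statements, persist the context array between calls. A minimal sketch, assuming a hypothetical ollama_sessions table:
-- Hypothetical table holding per-session generation context
CREATE TABLE IF NOT EXISTS ollama_sessions (
    session_id BIGINT PRIMARY KEY,
    context    INT[]
);

-- Save the context returned by a generation
INSERT INTO ollama_sessions (session_id, context)
SELECT 1,
       ARRAY(SELECT jsonb_array_elements_text(
           ai.ollama_generate('llama2', 'Tell me about databases')->'context')::int)
ON CONFLICT (session_id) DO UPDATE SET context = EXCLUDED.context;

-- Resume later with the stored context
SELECT ai.ollama_generate(
    'llama2',
    'Now compare PostgreSQL and MySQL',
    context => (SELECT context FROM ollama_sessions WHERE session_id = 1)
)->'response';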

Generate with images

Analyze images with vision-capable models:
SELECT ai.ollama_generate(
    'llava',
    'What do you see in this image?',
    images => ARRAY[(SELECT content FROM images WHERE id = 1)]
)->'response';
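
The example above assumes an images table with a BYTEA content column. If the file lives on the database server instead, you could read it with pg_read_binary_file (requires file-read privileges, such as membership in the pg_read_server_files role); the path below is illustrative:
SELECT ai.ollama_generate(
    'llava',
    'Describe this image',
    images => ARRAY[pg_read_binary_file('/path/to/photo.jpg')]
)->'response';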

Configure model options

Customize the generation parameters:
SELECT ai.ollama_generate(
    'llama2',
    'Write a creative story',
    embedding_options => '{"temperature": 0.9, "top_p": 0.9}'::jsonb
)->'response';
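
The options map to Ollama's generation parameters (temperature, top_p, seed, num_predict, and so on). For example, a temperature of 0 with a fixed seed makes output more reproducible, and num_predict caps the response length:
SELECT ai.ollama_generate(
    'llama2',
    'Write a creative story',
    embedding_options => '{"temperature": 0, "seed": 42, "num_predict": 200}'::jsonb
)->'response';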

Arguments

| Name              | Type    | Default | Required | Description                                                |
|-------------------|---------|---------|----------|------------------------------------------------------------|
| model             | TEXT    | -       | Yes      | The Ollama model to use (e.g., llama2, mistral, codellama) |
| prompt            | TEXT    | -       | Yes      | The prompt to generate a response for                      |
| host              | TEXT    | NULL    | No       | Ollama server URL (defaults to http://localhost:11434)     |
| images            | BYTEA[] | NULL    | No       | Array of images for multimodal models                      |
| keep_alive        | TEXT    | NULL    | No       | How long to keep the model loaded (e.g., 5m, 1h)           |
| embedding_options | JSONB   | NULL    | No       | Model-specific options such as temperature and top_p       |
| system_prompt     | TEXT    | NULL    | No       | System prompt to set model behavior                        |
| template          | TEXT    | NULL    | No       | Custom prompt template                                     |
| context           | INT[]   | NULL    | No       | Context from a previous generation, for continuation       |
| verbose           | BOOLEAN | FALSE   | No       | Enable verbose logging for debugging                       |
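
As a sketch combining several optional arguments (the host URL and keep-alive value are placeholders):
SELECT ai.ollama_generate(
    'mistral',
    'Summarize the benefits of connection pooling',
    host => 'http://ollama.internal:11434',  -- placeholder remote Ollama server
    keep_alive => '10m',                     -- keep the model loaded for 10 minutes
    system_prompt => 'You are a terse PostgreSQL expert.'
)->'response';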

Returns

JSONB: The complete API response including:
  • model: Model used for generation
  • response: The generated text
  • context: Context array for continuation
  • created_at: Generation timestamp
  • done: Whether generation is complete
  • total_duration: Total time taken
  • prompt_eval_count: Number of tokens in prompt
  • eval_count: Number of tokens generated
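
Because the return value is a single JSONB document, individual fields can be pulled out with the usual operators, for example the response text alongside token counts:
SELECT result->>'response'                 AS response_text,
       (result->>'prompt_eval_count')::int AS prompt_tokens,
       (result->>'eval_count')::int        AS output_tokens
FROM ai.ollama_generate('llama2', 'What is PostgreSQL?') AS result;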