ai.openai_moderate analyzes text content to detect potential policy violations, including hate speech, violence, sexual content, self-harm, and harassment. It calls OpenAI's moderation API to help ensure content in your application complies with OpenAI's usage policies.

Samples

Check content for violations

Analyze text for potentially harmful content:
SELECT ai.openai_moderate(
    'text-moderation-latest',
    'I want to hurt someone'
);
Returns a JSON object with flagged categories and confidence scores:
{
  "id": "modr-...",
  "model": "text-moderation-007",
  "results": [{
    "flagged": true,
    "categories": {
      "violence": true,
      "harassment": true,
      ...
    },
    "category_scores": {
      "violence": 0.997,
      "harassment": 0.571,
      ...
    }
  }]
}

Check if content is flagged

Get a simple boolean result:
SELECT
    content,
    (ai.openai_moderate('text-moderation-latest', content)->
        'results'->0->>'flagged')::boolean AS is_flagged
FROM user_comments;
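
If content can be NULL or empty, you may want to skip those rows rather than send them to the API. How the function handles NULL input is not specified here, so treat the guard below as a defensive sketch:

SELECT
    content,
    (ai.openai_moderate('text-moderation-latest', content)->
        'results'->0->>'flagged')::boolean AS is_flagged
FROM user_comments
WHERE content IS NOT NULL
  AND content <> '';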

Filter by specific categories

Check for specific types of violations:
SELECT
    id,
    content,
    (ai.openai_moderate('text-moderation-latest', content)->
        'results'->0->'categories'->>'violence')::boolean AS has_violence
FROM posts
WHERE (ai.openai_moderate('text-moderation-latest', content)->
    'results'->0->'categories'->>'violence')::boolean = true;
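
Note that the query above calls the moderation API twice per row: once in the SELECT list and once in the WHERE clause. A LATERAL join is one way to make a single call per row and reuse the result (a sketch; whether the planner actually reuses the single call can depend on the function's volatility):

SELECT
    p.id,
    p.content,
    (m.result->'results'->0->'categories'->>'violence')::boolean AS has_violence
FROM posts p
CROSS JOIN LATERAL (
    SELECT ai.openai_moderate('text-moderation-latest', p.content) AS result
) m
WHERE (m.result->'results'->0->'categories'->>'violence')::boolean;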

Moderate user-generated content with triggers

Automatically flag problematic content:
CREATE TABLE comments (
    id SERIAL PRIMARY KEY,
    content TEXT,
    is_flagged BOOLEAN
);

CREATE OR REPLACE FUNCTION moderate_comment()
RETURNS TRIGGER AS $$
BEGIN
    NEW.is_flagged := (
        ai.openai_moderate('text-moderation-latest', NEW.content)->
        'results'->0->>'flagged'
    )::boolean;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER moderate_on_insert
    BEFORE INSERT ON comments
    FOR EACH ROW
    EXECUTE FUNCTION moderate_comment();
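
A quick way to verify the trigger (the sample text is illustrative):

INSERT INTO comments (content) VALUES ('I want to hurt someone');

SELECT id, content, is_flagged FROM comments;

As written, the trigger fires only on INSERT; to re-moderate edited comments you would also need BEFORE INSERT OR UPDATE OF content.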

Arguments

| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| model | TEXT | - | Yes | The moderation model to use (e.g., text-moderation-latest, text-moderation-stable) |
| input_text | TEXT | - | Yes | The text content to analyze |
| api_key | TEXT | NULL | No | OpenAI API key. If not provided, uses the ai.openai_api_key setting |
| api_key_name | TEXT | NULL | No | Name of the secret containing the API key |
| extra_headers | JSONB | NULL | No | Additional HTTP headers to include in the API request |
| extra_query | JSONB | NULL | No | Additional query parameters for the API request |
| verbose | BOOLEAN | FALSE | No | Enable verbose logging for debugging |
| client_config | JSONB | NULL | No | Advanced client configuration options |
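
For example, to read the key from a named secret rather than the ai.openai_api_key setting, PostgreSQL's named-argument notation lets you skip the other optional arguments (a sketch, assuming a secret named OPENAI_API_KEY has been configured):

SELECT ai.openai_moderate(
    'text-moderation-latest',
    'I want to hurt someone',
    api_key_name => 'OPENAI_API_KEY'
);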

Returns

JSONB: A JSON object containing moderation results with the following structure:
  • id: Unique identifier for the moderation request
  • model: The model used
  • results: Array of result objects (one per input)
    • flagged: Boolean indicating if content was flagged
    • categories: Object with boolean flags for each category
      • hate, hate/threatening
      • harassment, harassment/threatening
      • self-harm, self-harm/intent, self-harm/instructions
      • sexual, sexual/minors
      • violence, violence/graphic
    • category_scores: Object with confidence scores (0-1) for each category
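
Because the scores are plain JSON numbers, you can apply your own thresholds instead of relying on the boolean category flags; the 0.8 cutoff below is an arbitrary illustration:

SELECT
    content,
    (ai.openai_moderate('text-moderation-latest', content)->
        'results'->0->'category_scores'->>'violence')::float8
        > 0.8 AS high_violence_risk
FROM user_comments;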