
A Content Generation Pipeline That Scales

By Jonas Meyer | May 21, 2026 | 4 min read

Generating one blog post with an LLM is easy. Generating hundreds of consistent, on-brand pieces a week, with review gates and predictable cost, is an engineering problem. This article shows how to build a content generation pipeline on Model Database that you can actually run at scale.

The goal is a system that turns a queue of briefs into draft content, runs quality checks automatically, and flags anything that needs a human before publishing.

Pipeline stages

Think of content generation as a series of small, inspectable steps rather than one giant prompt: outlining, section-by-section drafting, and automated quality gates.

Breaking the work apart lets you use a cheaper model where quality is less critical and a stronger one where it matters. Model Database makes the swap a one-line change.
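One way to keep that swap to a single line is to hold the stage-to-model mapping in one place. The sketch below is illustrative only: the stage names and the `model_for` helper are assumptions, not part of any Model Database SDK, and the model IDs are simply the ones used later in this article.

```python
# Hypothetical stage-to-model table; stage names are illustrative.
STAGE_MODELS = {
    "outline": "openai/gpt-4o-mini",
    "draft": "anthropic/claude-sonnet-4-6",
    "review": "anthropic/claude-sonnet-4-6",
}

def model_for(stage, overrides=None):
    """Return the model ID for a stage, honoring optional per-run overrides."""
    merged = {**STAGE_MODELS, **(overrides or {})}
    try:
        return merged[stage]
    except KeyError:
        raise ValueError(f"unknown stage: {stage}") from None
```

Each stage function can then pass `model=model_for("draft")` instead of a hard-coded string, and an experiment only needs to supply an override such as `{"draft": "openai/gpt-4o-mini"}`.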

Stage one: outlines

from openai import OpenAI

client = OpenAI(
    base_url="https://modeldatabase.com/v1",
    api_key="mdb_live_...",
)

def outline(brief):
    """Return the outline for a brief as a JSON string (not yet parsed)."""
    resp = client.chat.completions.create(
        model="openai/gpt-4o-mini",
        messages=[
            {"role": "system", "content":
             "Turn the brief into a JSON outline with 'title' and "
             "'sections' (each with a heading and 2-3 talking points)."},
            {"role": "user", "content": brief},
        ],
        response_format={"type": "json_object"},  # ask for a JSON object back
        temperature=0.4,
    )
    # message.content is a string; parse it with json.loads before use.
    return resp.choices[0].message.content

A small model handles structure well and keeps this high-volume step inexpensive.
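Since `outline` returns a raw string, it is worth parsing and shape-checking the result before it feeds the drafting stage. The following is a minimal sketch under stated assumptions: `parse_outline` is a hypothetical helper, and it checks only the `title` and `sections` keys named in the prompt, because the prompt does not fix the key names inside each section.

```python
import json

def parse_outline(raw):
    """Parse the model's JSON string and check the shape the prompt asked for."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) if malformed
    if not isinstance(data, dict):
        raise ValueError("outline must be a JSON object")
    title = data.get("title")
    if not isinstance(title, str) or not title.strip():
        raise ValueError("outline is missing a title")
    sections = data.get("sections")
    if not isinstance(sections, list) or not sections:
        raise ValueError("outline has no sections")
    return data
```

A failed check is a natural point to retry the outline call rather than pass a broken structure downstream.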

Stage two: drafting sections

Generating section by section keeps each call focused and lets you parallelize. For the writing itself, a stronger model earns its keep on coherence and voice.

def write_section(title, heading, points):
    resp = client.chat.completions.create(
        model="anthropic/claude-sonnet-4-6",
        messages=[
            {"role": "system", "content":
             "You are a brand writer. Active voice, no cliches, "
             "150-220 words per section."},
            {"role": "user", "content":
             f"Article: {title}\nSection: {heading}\n"
             f"Cover: {points}"},
        ],
        temperature=0.6,
    )
    return resp.choices[0].message.content

Because every section is an independent call, you can run them concurrently with a thread pool or async client and assemble the full draft when they finish.
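A thread-pool version might look like the sketch below. It assumes the sections arrive as `(heading, points)` pairs and takes the writer as a parameter, so `draft_all` is a hypothetical helper that works with `write_section` or with a stub in tests.

```python
from concurrent.futures import ThreadPoolExecutor

def draft_all(title, sections, write, max_workers=4):
    """Draft every section concurrently and return the texts in outline order.

    `sections` is a list of (heading, points) pairs; `write` is any callable
    with the same signature as write_section(title, heading, points).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(write, title, h, p) for h, p in sections]
        # Collect in submission order so the draft reads in sequence;
        # result() re-raises any exception raised inside a worker.
        return [f.result() for f in futures]
```

Calling `draft_all(title, sections, write_section)` returns one string per section. Keep `max_workers` modest so a burst of parallel calls stays within whatever rate limits apply to your key.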

Stage three: automated quality gates

Never publish raw model output. Add deterministic checks plus an LLM-based review pass. The deterministic checks are cheap and catch the obvious failures.

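def passes_rules(text, banned, min_words):
    """Cheap deterministic checks; returns (ok, reason)."""
    words = len(text.split())
    if words < min_words:
        return False, f"too short: {words}"
    lowered = text.lower()  # lowercase once, not once per banned term
    for term in banned:
        if term.lower() in lowered:
            return False, f"banned term: {term}"
    return True, "ok"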

Then use a model as an editor that returns a structured verdict, so a human only sees pieces that need attention.

import json

def review(text):
    resp = client.chat.completions.create(
        model="anthropic/claude-sonnet-4-6",
        messages=[
            {"role": "system", "content":
             "Review the draft. Return JSON: "
             "{\"score\": 1-5, \"issues\": [...], \"publish\": bool}."},
            {"role": "user", "content": text},
        ],
        response_format={"type": "json_object"},
        temperature=0,
    )
    return json.loads(resp.choices[0].message.content)
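The two gates can be combined into a single routing decision. This is a sketch, not a prescribed policy: the `route` helper, the three outcome labels, and the `min_score` threshold of 4 are all assumptions chosen for illustration. It takes the `(ok, reason)` tuple from `passes_rules` and the dict from `review`.

```python
def route(rules_result, verdict, min_score=4):
    """Decide what happens to a draft: 'publish', 'human_review', or 'retry'.

    rules_result is the (ok, reason) tuple from passes_rules;
    verdict is the dict returned by review.
    """
    ok, reason = rules_result
    if not ok:
        return "retry", reason  # obvious failure: regenerate, no reviewer needed
    score = verdict.get("score")
    issues = verdict.get("issues")
    if not isinstance(issues, list):
        issues = []
    # A missing or malformed verdict goes to a human, never straight to publish.
    if not isinstance(score, (int, float)) or verdict.get("publish") is not True:
        detail = "; ".join(str(i) for i in issues) or "reviewer did not approve"
        return "human_review", detail
    if score < min_score:
        return "human_review", f"score {score} below {min_score}"
    return "publish", "ok"
```

Failing closed on an unexpected verdict shape matters here, because JSON mode guarantees valid JSON but not that the model used the keys you asked for.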

Orchestration and throughput

Tie the stages together with a job queue so the pipeline is resilient to transient failures and each job's status stays observable.
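The article does not prescribe a particular queue, so the sketch below uses Python's in-process `queue.Queue` purely to illustrate the shape: workers pull briefs, retry failures a bounded number of times, and record a status per job. `run_jobs` and its parameters are hypothetical, and a production pipeline would more likely use a durable queue that survives restarts.

```python
import queue
import threading

def run_jobs(briefs, process, workers=2, max_attempts=3):
    """Run process(brief) for each brief from a shared queue, with retries.

    Returns {index: {"status", "attempts", and "result" or "error"}}.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    jobs = queue.Queue()
    for i, brief in enumerate(briefs):
        jobs.put((i, brief))
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                i, brief = jobs.get_nowait()
            except queue.Empty:
                return  # queue drained, worker exits
            for attempt in range(1, max_attempts + 1):
                try:
                    record = {"status": "done", "attempts": attempt,
                              "result": process(brief)}
                    break
                except Exception as exc:  # keep the last error for inspection
                    record = {"status": "failed", "attempts": attempt,
                              "error": str(exc)}
            with lock:
                results[i] = record

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Here `process` would be a function that chains the outline, drafting, and gate stages for one brief. The returned records give a simple per-job view of what finished, what failed, and after how many attempts.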

Tuning cost and quality

Once the pipeline runs, treat model choice as a dial. Start with openai/gpt-4o-mini for every model-backed stage, then upgrade only the stages where reviewers reject output. You might find drafting needs anthropic/claude-sonnet-4-6 while outlining stays on the small model (the deterministic rule checks use no model at all). Since all of these are reachable through the same endpoint, A/B testing is just changing the model string and comparing review scores.
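Comparing review scores across models can be as simple as grouping verdicts by the model that produced the draft. The sketch below assumes you log `(model, verdict)` pairs, where each verdict is the dict returned by `review`; `compare_models` is a hypothetical helper, and it expects every verdict to carry `score` and `publish`.

```python
from collections import defaultdict
from statistics import mean

def compare_models(records):
    """Summarize review results per model.

    `records` is an iterable of (model, verdict) pairs. Returns
    {model: {"n", "mean_score", "publish_rate"}}.
    """
    by_model = defaultdict(list)
    for model, verdict in records:
        by_model[model].append(verdict)
    return {
        m: {
            "n": len(vs),
            "mean_score": mean(v["score"] for v in vs),
            "publish_rate": sum(bool(v["publish"]) for v in vs) / len(vs),
        }
        for m, vs in by_model.items()
    }
```

Run both variants on the same briefs so the comparison reflects the model and not the topic mix, and look at `n` before trusting a small difference in the averages.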

Ready to build it? Create a key and load credit at your dashboard, and review streaming and JSON-mode details in the docs.
