Translating a product into a dozen languages is no longer a quarter-long project. LLMs produce fluent, context-aware translations that respect tone, preserve placeholders, and handle domain vocabulary. This article builds a translation pipeline on Model Database that localizes UI strings and content while keeping quality and cost under control.
The difference between a toy translator and a production one is everything around the model call: placeholder safety, glossary enforcement, batching, and review. We'll cover all of it.
Why an LLM over classic MT
Traditional machine translation is fast but context-blind. LLMs understand that "Save" in a toolbar is a verb, respect formality levels, and follow instructions like "keep the brand name untranslated." You trade a little latency for noticeably better output, and you can steer the result with a prompt.
Translating a single string safely
UI strings contain placeholders like {count} that must survive untouched. Instruct the model explicitly and keep temperature low for consistency.
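from openai import OpenAI

# The standard OpenAI client, pointed at the Model Database endpoint.
client = OpenAI(
    base_url="https://modeldatabase.com/v1",
    api_key="mdb_live_...",
)

def translate(text, target_lang, glossary=""):
    """Translate one string, preserving placeholders and HTML tags."""
    resp = client.chat.completions.create(
        model="google/gemini-2.0-flash",
        messages=[
            # Only the lines that interpolate values are f-strings, so the
            # literal {name} example reaches the model unchanged.
            {"role": "system", "content":
                f"Translate to {target_lang}. Preserve placeholders like "
                "{name} exactly. Keep HTML tags intact. Match the tone. "
                f"Apply this glossary strictly:\n{glossary}\n"
                "Return only the translation."},
            {"role": "user", "content": text},
        ],
        temperature=0,  # low temperature for consistent output
    )
    return resp.choices[0].message.content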
google/gemini-2.0-flash is fast and broadly multilingual, which makes it a strong default for high-volume localization. For literary or marketing copy where nuance matters, test anthropic/claude-sonnet-4-6.
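The `glossary` argument of `translate` is a plain string, so it has to be rendered from whatever structure the glossary is stored in. A minimal sketch, assuming the glossary is kept as a dict of source term to required translation; the helper name `format_glossary`, the `=>` separator, and the sample terms are illustrative, not part of any API:

```python
def format_glossary(terms):
    """Render a {source_term: required_translation} dict as prompt lines."""
    # Sorting keeps the prompt text stable from run to run.
    return "\n".join(f"{src} => {dst}" for src, dst in sorted(terms.items()))

# Hypothetical glossary: one translated term, one brand name kept as-is.
glossary_text = format_glossary({
    "Workspace": "Arbeitsbereich",
    "Acme Cloud": "Acme Cloud",
})
# glossary_text can then be passed as translate(text, "German", glossary_text)
```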
Batching for throughput
Translating thousands of strings one request at a time is slow and wasteful. Batch related strings into a single call using JSON, which also gives the model surrounding context to translate consistently.
import json

def translate_batch(strings, target_lang):
    """Translate a {key: text} dict in one call and return the same shape."""
    resp = client.chat.completions.create(
        model="google/gemini-2.0-flash",
        messages=[
            {"role": "system", "content":
                f"Translate each value to {target_lang}. Keep keys and "
                "placeholders unchanged. Return the same JSON shape."},
            {"role": "user", "content": json.dumps(strings)},
        ],
        # JSON mode: ask for a single JSON object back.
        response_format={"type": "json_object"},
        temperature=0,
    )
    return json.loads(resp.choices[0].message.content)
Keep batches to a sensible size so a single bad response doesn't force you to retranslate everything. A few dozen strings per call is a good balance.
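One way to keep batches bounded is to split the string catalog into fixed-size chunks and accept a chunk only when the response has exactly the keys that were sent. This is a sketch under assumptions: `chunked` and `translate_all` are hypothetical helpers, the default size of 40 is just one reading of "a few dozen", and `translate_fn` stands in for a batch translator with the same signature as `translate_batch`.

```python
from itertools import islice

def chunked(strings, size=40):
    """Yield successive dicts of at most `size` items from a {key: text} dict."""
    items = iter(strings.items())
    while True:
        batch = dict(islice(items, size))
        if not batch:
            return
        yield batch

def translate_all(strings, target_lang, translate_fn, size=40):
    """Translate in chunks; return (translated, failed) dicts."""
    translated, failed = {}, {}
    for batch in chunked(strings, size):
        try:
            result = translate_fn(batch, target_lang)
        except ValueError:
            # json.JSONDecodeError is a ValueError: the response was not valid JSON.
            failed.update(batch)
            continue
        # Reject the chunk if keys were dropped, renamed, or invented.
        if not isinstance(result, dict) or set(result) != set(batch):
            failed.update(batch)
            continue
        translated.update(result)
    return translated, failed
```

Only the strings in `failed` need another attempt, so one bad response costs at most one chunk.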
Enforcing a glossary
Brands need consistency: product names, legal terms, and feature labels must translate the same way every time. Pass a glossary in the prompt and validate the output against it.
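def glossary_respected(source, translated, glossary):
    """glossary maps each source term to its required translation."""
    for term, expected in glossary.items():
        # A term that occurs in the source must show up as its approved
        # translation in the output.
        if term in source and expected not in translated:
            return False
    return True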
When a translation violates the glossary or drops a placeholder, retry once with the specific error, or route it to human review.
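The retry-then-review flow could look like the sketch below. It is written under assumptions: `translate_fn(text, target_lang, feedback)` is a hypothetical callable that appends the feedback string to its prompt, and `find_problems` and `translate_with_retry` are illustrative names.

```python
import re

PLACEHOLDER = re.compile(r"\{[^}]+\}")

def find_problems(source, translated, glossary):
    """Return human-readable descriptions of each violation found."""
    problems = []
    missing = set(PLACEHOLDER.findall(source)) - set(PLACEHOLDER.findall(translated))
    if missing:
        problems.append("missing placeholders: " + ", ".join(sorted(missing)))
    for term, expected in glossary.items():
        if term in source and expected not in translated:
            problems.append(f"'{term}' must be translated as '{expected}'")
    return problems

def translate_with_retry(source, target_lang, glossary, translate_fn):
    """Return (translation, needs_review) after at most one retry."""
    translated = translate_fn(source, target_lang, "")
    problems = find_problems(source, translated, glossary)
    if not problems:
        return translated, False
    # Retry once, telling the model exactly what went wrong.
    feedback = "Your previous translation had these errors: " + "; ".join(problems)
    translated = translate_fn(source, target_lang, feedback)
    # Still broken after the retry: flag it for a human reviewer.
    return translated, bool(find_problems(source, translated, glossary))
```

Capping retries at one keeps cost bounded; strings returned with `needs_review` set to `True` go to the review queue rather than looping.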
Validation that catches real bugs
- Placeholder check: confirm every {token} in the source appears in the translation. This catches the most common and most damaging error.
- Tag balance: verify HTML tags open and close as in the source.
- Length sanity: flag translations that are wildly longer or shorter than expected for the language.
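import re

def placeholders_intact(src, dst):
    # Matches {count}, {name}, and similar tokens.
    pat = re.compile(r"\{[^}]+\}")
    # Set comparison flags dropped, renamed, and invented placeholders,
    # but does not count how often each one repeats.
    return set(pat.findall(src)) == set(pat.findall(dst))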
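The other two checks can be sketched in the same style. The tag regex here is a rough approximation rather than an HTML parser, and the ratio bounds of 0.3 and 3.0 are placeholder assumptions that should be tuned per language pair (translations into Chinese or Japanese, for example, are often much shorter in characters than the English source).

```python
import re
from collections import Counter

# Captures an optional leading slash and the tag name, e.g. ("", "b") or ("/", "b").
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>")

def tag_counts(text):
    """Count opening and closing tags by name."""
    return Counter((slash, name.lower()) for slash, name in TAG.findall(text))

def tags_match(src, dst):
    # Compare counts rather than order, since word order can legitimately
    # change in translation.
    return tag_counts(src) == tag_counts(dst)

def length_plausible(src, dst, low=0.3, high=3.0):
    """Flag translations whose character length is far from the source's."""
    if not src.strip():
        return not dst.strip()
    ratio = len(dst) / len(src)
    return low <= ratio <= high
```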
Cost, caching, and review
Translation is naturally cacheable: once a source string has an accepted translation for a given language, there is no reason to request it again. Even at temperature 0 a model is not guaranteed to return identical output on every call, so caching also keeps wording stable between releases. Cache by a hash of source text plus target language, and you'll only pay to translate each unique string once. With Model Database's prepaid pay-as-you-go billing, a one-time bulk translation of your catalog has a predictable cost you can estimate from token counts.
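A minimal sketch of such a cache, assuming an in-memory dict as the store and a `translate_fn(text, target_lang)` callable; `cache_key`, `cached_translate`, and the optional `glossary_version` field are illustrative additions, the last one so that entries can be invalidated when the glossary changes.

```python
import hashlib

def cache_key(text, target_lang, glossary_version=""):
    """Hash the target language, glossary version, and source text together."""
    # The unit-separator character keeps the fields from running together.
    payload = "\x1f".join([target_lang, glossary_version, text])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def cached_translate(text, target_lang, cache, translate_fn):
    """Call translate_fn only for strings that are not already cached."""
    key = cache_key(text, target_lang)
    if key not in cache:
        cache[key] = translate_fn(text, target_lang)
    return cache[key]
```

In production the dict would typically be replaced by a persistent store so the cache survives between runs.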
For regulated or high-visibility content, keep a human in the loop: machine-translate first, then have a reviewer approve. The model does the tedious 90 percent; people handle the nuanced last mile.
Picking the right model
Build a small evaluation set of source strings with known-good translations and run candidates against it. Use the fast model for bulk UI strings and a stronger one for prose. Switching is a single model string through the same endpoint, so you can mix models per content type without changing your architecture.
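A small harness for that comparison might look like the following. It uses character-level similarity from the standard library's `difflib` as a crude stand-in for a real translation metric, and assumes each candidate is wrapped as a `translate_fn(source)` callable bound to one model and one target language; `score_model` is a hypothetical name.

```python
from difflib import SequenceMatcher

def score_model(translate_fn, eval_set):
    """Average similarity (0.0 to 1.0) between outputs and reference translations.

    eval_set is a non-empty list of (source, known_good_translation) pairs.
    """
    total = 0.0
    for source, reference in eval_set:
        candidate = translate_fn(source)
        total += SequenceMatcher(None, candidate, reference).ratio()
    return total / len(eval_set)
```

Running `score_model` once per candidate model on the same evaluation set gives a rough ranking; low-scoring strings are worth reading by hand, since a valid alternative phrasing also lowers the score.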
Create a key and load credit at your dashboard, and find JSON-mode and model details in the docs.