Every check the ChatsControl Translation QA Validator runs, fully documented. Names, MQM categories, severity defaults, language coverage, and worked examples. Unlike competitor tools that bundle vague "55 checks", we publish the complete list — so you know exactly what we look for and why.
locale-quotation-marks
Locale Quotation Marks
MQM: locale-convention · Quotation marks · Severity: minor
Flags ASCII straight quotes (`"..."`) in the target when the target locale's CLDR convention uses different characters (e.g. `„...“` for German, `«...»` for Russian/Ukrainian, `「...」` for Japanese).
Source
She said "hello".
Bad target
Sie sagte "hallo".
Good target
Sie sagte „hallo“.
LLM translations are notoriously inconsistent on quotation style. Professional translators and style guides treat locale-correct quotation marks as table stakes — wrong quotes signal amateur output to clients and reviewers.
Language coverage: All 101 CLDR locales
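A minimal sketch of how a straight-quote check of this kind could work. The `LOCALE_QUOTES` table and the `has_straight_quotes` function are hypothetical names for illustration; a real check would read quotation delimiters from CLDR rather than from a hand-written table.

```python
# Hypothetical sample of locale quote conventions (assumption: real data comes from CLDR).
LOCALE_QUOTES = {
    "de": ("„", "“"),
    "ru": ("«", "»"),
    "uk": ("«", "»"),
    "ja": ("「", "」"),
}


def has_straight_quotes(target: str, locale: str) -> bool:
    """Return True when the target contains ASCII double quotes but the locale expects others."""
    if locale not in LOCALE_QUOTES:
        return False  # locale not in the sample table: nothing to flag in this sketch
    return '"' in target


print(has_straight_quotes('Sie sagte "hallo".', "de"))
print(has_straight_quotes("Sie sagte „hallo“.", "de"))
```

The sketch only detects the problem; suggesting the correct replacement characters would use the pair stored for the locale.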
locale-decimal-separator
Locale Decimal Separator
MQM: locale-convention · Decimal separator · Severity: minor
Flags numbers in the target that use a different decimal separator than the locale CLDR data prescribes (English `.` vs German/Russian/Ukrainian `,`).
Source
The price is 1,234.50 EUR.
Bad target
Der Preis beträgt 1,234.50 EUR.
Good target
Der Preis beträgt 1.234,50 EUR.
Decimal/thousands confusion is a common factual error and reads as unprofessional to native readers. CLDR-grade locale awareness is rare in QA tools — most use English-centric heuristics.
Language coverage: All 101 CLDR locales
locale-group-separator
Locale Group Separator
MQM: locale-convention · Number format · Severity: minor
Flags the thousands separator. Locales vary: `,` (en), `.` (de), non-breaking space (uk, fr, sv), `'` (de-CH).
Source
1,000,000 records
Bad target
1,000,000 записів
Good target
1 000 000 записів
Mostly cosmetic, but consistently misformatted numbers make the translation look like a machine dump.
Language coverage: All 101 CLDR locales
locale-percent-format
Locale Percent Format
MQM: locale-convention · Number format · Severity: minor
Flags percent-sign spacing that diverges from the locale's CLDR pattern. German requires a non-breaking space (`5 %`), English is glued (`5%`).
Source
Inflation rose 5%.
Bad target
Inflation stieg um 5%.
Good target
Inflation stieg um 5 %.
Trivial individually, glaring across an entire document. Easy to miss in manual review.
Compares numeric values between source and target. Cross-locale aware: `1,234.50` (en) matches `1.234,50` (de). Date components (`15.03.2026`) are skipped to avoid false positives.
Source
Take 5.25 mg three times daily.
Bad target
Take 525 mg three times daily.
Good target
Take 5.25 mg three times daily.
The most common factual error in legal, medical, and financial translation. A misplaced decimal in a dosage instruction is a patient-safety failure mode; a wrong digit in a contract amount is a legal liability.
Language coverage: Language-agnostic; locale-aware via CLDR
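One way to compare numeric values across locales is to normalise each number to a canonical decimal before comparing. The sketch below assumes a small hand-written `SEPARATORS` table (hypothetical; real separators would come from CLDR) and skips strings that do not parse as a single number. It does not reproduce the date-skipping logic described above.

```python
import re
from decimal import Decimal, InvalidOperation

# Hypothetical (decimal separator, group separator) table; assumption, not CLDR data.
SEPARATORS = {"en": (".", ","), "de": (",", "."), "uk": (",", "\u00a0")}

# Digits optionally joined by '.', ',' or a non-breaking space.
NUMBER_RE = re.compile(r"\d+(?:[.,\u00a0]\d+)*")


def parse_numbers(text: str, locale: str) -> list[Decimal]:
    """Extract numbers from text and normalise them using the locale's separators."""
    decimal_sep, group_sep = SEPARATORS[locale]
    values = []
    for raw in NUMBER_RE.findall(text):
        cleaned = raw.replace(group_sep, "").replace(decimal_sep, ".")
        try:
            values.append(Decimal(cleaned))
        except InvalidOperation:
            continue  # not a single well-formed number under this locale; skip it
    return values


def numbers_match(source: str, src_locale: str, target: str, tgt_locale: str) -> bool:
    """True when source and target contain the same numeric values, in any order."""
    return sorted(parse_numbers(source, src_locale)) == sorted(parse_numbers(target, tgt_locale))
```

With this normalisation, `1,234.50` parsed as English and `1.234,50` parsed as German compare equal, while `5.25` and `525` do not.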
untranslated-segment
Untranslated Segment
MQM: accuracy · Untranslated · Severity: major
Flags segments where the target equals the source. Suppresses false positives for URLs, emails, dates, acronyms (<3 chars), pure-numeric strings, and known untranslatable tokens.
Source
Hello world.
Bad target
Hello world.
Good target
Привіт, світе.
Translators occasionally skip segments — paste-buffer mishaps, fatigue, or assuming a string is non-translatable. The check catches them while keeping noise low.
Language coverage: Language-agnostic
empty-target
Empty Target
MQM: accuracy · Omission · Severity: critical
Flags segments where the source has translatable content but the target is empty or whitespace-only.
Source
This must be translated.
Bad target
(empty)
Good target
Це треба перекласти.
A missing translation in a delivered file is a delivery failure.
Language coverage: Language-agnostic
tag-mismatch-html
HTML/XML Tag Mismatch
MQM: design · Markup · Severity: major
Compares HTML/XML tag NAMES (not attribute values) between source and target. Translators may legitimately localize URLs or alt text inside attributes, but missing or extra tags break rendering. Self-closing structural tags escalate to CRITICAL.
Source
Hello <b>world</b>!
Bad target
Привіт світ!
Good target
Привіт <b>світ</b>!
Dropping a `<br/>` breaks document layout. Dropping `<a>` destroys hyperlinks. Common in technical-content translation.
Language coverage: Language-agnostic
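A rough sketch of a name-only tag comparison, assuming a simple regular expression is good enough to find tags (a production check would more likely use a real HTML/XML parser). `tag_names` and `tag_diff` are hypothetical helper names.

```python
import re
from collections import Counter

# Captures the tag name of opening, closing, and self-closing tags; attributes are ignored.
TAG_RE = re.compile(r"</?\s*([A-Za-z][A-Za-z0-9:-]*)[^>]*>")


def tag_names(text: str) -> Counter:
    """Count tag names (case-insensitive); opening and closing tags are counted together."""
    return Counter(name.lower() for name in TAG_RE.findall(text))


def tag_diff(source: str, target: str) -> tuple[dict, dict]:
    """Return (missing, extra) tag-name counts in the target relative to the source."""
    src, tgt = tag_names(source), tag_names(target)
    return dict(src - tgt), dict(tgt - src)


print(tag_diff("Hello <b>world</b>!", "Привіт світ!"))
```

Because only names are compared, a target that changes an `href` or `alt` value produces no finding, which matches the behaviour described above.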
tag-mismatch-placeholder
Placeholder Mismatch
MQM: design · Markup · Severity: critical
Flags missing/extra placeholders: ICU MessageFormat `{count}`, printf-style `%s`/`%d`/`%1$s`, and Python-style positional `{0}`. Missing named placeholders are CRITICAL — they break runtime rendering.
Source
You have {count} new messages.
Bad target
У вас є нові повідомлення.
Good target
У вас є {count} нових повідомлень.
Software localization staple. Drop a placeholder and the app renders broken strings to end-users.
Language coverage: Language-agnostic
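The placeholder comparison can be sketched as a multiset difference over extracted tokens. The pattern below covers only the simple forms named above (`{count}`, `{0}`, `%s`, `%d`, `%1$s`); nested ICU plural or select syntax is out of scope for this illustration, and `placeholder_diff` is a hypothetical name.

```python
import re
from collections import Counter

PLACEHOLDER_RE = re.compile(
    r"\{[A-Za-z_][A-Za-z0-9_]*\}"  # named, e.g. {count}
    r"|\{\d+\}"                     # positional, e.g. {0}
    r"|%(?:\d+\$)?[sd]"             # printf-style, e.g. %s, %d, %1$s
)


def placeholder_diff(source: str, target: str) -> tuple[list[str], list[str]]:
    """Return (missing, extra) placeholders in the target relative to the source."""
    src = Counter(PLACEHOLDER_RE.findall(source))
    tgt = Counter(PLACEHOLDER_RE.findall(target))
    return sorted((src - tgt).elements()), sorted((tgt - src).elements())


print(placeholder_diff("You have {count} new messages.", "У вас є нові повідомлення."))
```

Using counters rather than sets means a placeholder that appears twice in the source but once in the target is still reported as missing.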
punctuation-terminal
Terminal Punctuation
MQM: fluency · Punctuation · Severity: major
Flags mismatch between the source's terminal punctuation and the target's. Cross-script aware — `.` and `。` are equivalent.
Source
Are you ready?
Bad target
Ти готовий.
Good target
Ти готовий?
Changes sentence force (statement vs question).
Language coverage: Cross-script: CJK `。`, Greek `;`, Arabic `؟`, Armenian `:`, Devanagari `।`
punctuation-bracket-pair
Bracket / Quote Pair Balance
MQM: fluency · Punctuation · Severity: major
Flags unbalanced opening/closing pairs in the target. Catches the common copy-paste error of dropping a closing bracket.
Source
See note (a) for details.
Bad target
Див. примітку (а для деталей.
Good target
Див. примітку (а) для деталей.
Reads as broken; signals careless review.
Language coverage: Language-agnostic; pairs include `()`, `[]`, `{}`, `«»`, `„“`, `「」`
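Pair balance is commonly checked with a stack. The sketch below assumes the pair set listed above and treats `“` as a closer only, which suits German-style `„“` but would need per-locale handling for English curly quotes. `unbalanced` is a hypothetical function name.

```python
# Opening character -> expected closing character.
PAIRS = {"(": ")", "[": "]", "{": "}", "«": "»", "„": "“", "「": "」"}
CLOSERS = set(PAIRS.values())


def unbalanced(text: str) -> list[tuple[str, int]]:
    """Return (character, index) for every opener or closer without a partner."""
    stack: list[tuple[str, int]] = []
    problems: list[tuple[str, int]] = []
    for index, char in enumerate(text):
        if char in PAIRS:
            stack.append((char, index))
        elif char in CLOSERS:
            if stack and PAIRS[stack[-1][0]] == char:
                stack.pop()  # closer matches the most recent opener
            else:
                problems.append((char, index))  # stray or mismatched closer
    problems.extend(stack)  # openers that were never closed
    return sorted(problems, key=lambda item: item[1])


print(unbalanced("Див. примітку (а для деталей."))
```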
punctuation-double-space
Double Space
MQM: fluency · Whitespace · Severity: minor
Flags consecutive spaces in the target text.
Source
Hello world
Bad target
Привіт  світ
Good target
Привіт світ
Cosmetic but pervasive in machine output; trivial to fix once flagged.
Language coverage: Language-agnostic
grammar-languagetool
Grammar & Spelling (LanguageTool)
MQM: fluency · Grammar / Spelling · Severity: major
Sends the target text to LanguageTool's public API for grammar, spelling, and style checking. Rule matches are mapped to MQM Fluency category with severity by rule category.
Source
The cat sits on the mat.
Bad target
Die Katze sitz auf der Matte.
Good target
Die Katze sitzt auf der Matte.
Catches verb-conjugation errors, agreement bugs, and spelling mistakes that pure heuristic tools can't. LT-uk is particularly deep — 11k+ replace rules for Ukrainian.
Language coverage: ~45 languages via LanguageTool API (full grammar for en/de/fr/es/pt/it/nl/pl/uk/ru/ja/zh-CN; spelling+style elsewhere)
inconsistent-translation
Inconsistent Translation
MQM: terminology · Inconsistent · Severity: major
Across a multi-segment document, flags the same source phrase (3+ words) being translated multiple different ways. The classic CAT-tool consistency check.
Source
(Two segments with identical source: 'The agreement is signed.')
Bad target
Угода підписана. → Договір підписано.
Good target
Угода підписана. → Угода підписана.
Legal and financial style guides mandate terminology consistency. Inconsistent translation of the same term creates ambiguity that courts may interpret against the translator/agency.
Language coverage: Language-agnostic
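A simplified sketch of the consistency idea, assuming the comparison is made on whole segments with identical source text (the check described above may also match shorter phrases inside segments). `inconsistent` is a hypothetical name.

```python
from collections import defaultdict


def inconsistent(segments: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Map each repeated source (3+ words) to its distinct targets when there is more than one."""
    seen: dict[str, set[str]] = defaultdict(set)
    for source, target in segments:
        if len(source.split()) >= 3:
            seen[source.strip()].add(target.strip())
    return {src: sorted(targets) for src, targets in seen.items() if len(targets) > 1}


segments = [
    ("The agreement is signed.", "Угода підписана."),
    ("The agreement is signed.", "Договір підписано."),
]
print(inconsistent(segments))
```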
custom-regex
Custom Regex Rules
MQM: other · User-defined · Severity: major
User-supplied regex rules. Each rule has a name, pattern, severity, and a `match_is_error` flag. With the default `true`, a match is a violation; with `false`, the pattern must be present and a missing match is the violation.
Client style guides often include forbidden terms or required phrasing. Custom rules let you encode them and run the check across every translation.
Language coverage: Language-agnostic
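To illustrate the two modes of `match_is_error`, here is a minimal sketch. The field names follow the description above; the `RegexRule` class, the `apply_rule` function, and the shape of the returned finding are assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class RegexRule:
    name: str
    pattern: str
    severity: str = "major"
    match_is_error: bool = True  # True: a match is a violation; False: a missing match is


def apply_rule(rule: RegexRule, target: str) -> Optional[dict]:
    """Return a finding dict when the rule is violated, otherwise None."""
    found = re.search(rule.pattern, target) is not None
    violated = found if rule.match_is_error else not found
    if violated:
        return {"rule": rule.name, "severity": rule.severity}
    return None


forbidden = RegexRule("no-asap", r"\bASAP\b")
required = RegexRule("needs-tm", "™", severity="minor", match_is_error=False)
print(apply_rule(forbidden, "Reply ASAP."))
print(apply_rule(required, "Acme"))
```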
length-deviation
Length Deviation
MQM: design · Length · Severity: major
Per-segment length sanity. Banded thresholds (target/source ratio): <0.30 = critical (likely lost content), 0.30–0.50 = major, 0.50–0.80 = minor; >3.0 = critical (likely duplicated content). Skipped for very short sources (<20 chars).
Source
The contract is signed by both parties present today.
Bad target
OK.
Good target
Договір підписано обома присутніми сьогодні сторонами.
Catches silent omissions and unintended additions. Particularly useful for UI strings where length constraints matter.
Language coverage: Language-agnostic
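The banded thresholds translate directly into a small function. This sketch measures length in characters and treats each band's lower bound as inclusive; both choices are assumptions, since the description above does not say how boundaries are handled. `length_severity` is a hypothetical name.

```python
from typing import Optional


def length_severity(source: str, target: str) -> Optional[str]:
    """Return the severity for the target/source length ratio, or None when acceptable."""
    if len(source) < 20:
        return None  # very short sources are skipped
    ratio = len(target) / len(source)
    if ratio < 0.30 or ratio > 3.0:
        return "critical"  # likely lost or duplicated content
    if ratio < 0.50:
        return "major"
    if ratio < 0.80:
        return "minor"
    return None


print(length_severity("The contract is signed by both parties present today.", "OK."))
```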
capitalization-mismatch
Capitalization Mismatch
MQM: style · Capitalization · Severity: minor
Flags initial-letter case mismatch (source starts uppercase but target lowercase, or vice versa) and lost ALL-CAPS emphasis (source `WARNING` rendered as target `Warning`). Does NOT enforce Title Case parity — that's a deliberate locale-specific choice.
Source
WARNING: hot surface.
Bad target
Увага: гаряча поверхня.
Good target
УВАГА: гаряча поверхня.
ALL-CAPS warnings, button labels, and headings carry semantic weight — translators routinely flatten them to sentence case out of habit. Initial-letter mismatch is usually a copy-paste artefact.
Language coverage: All cased-script languages (Latin, Cyrillic, Greek, Armenian)
spaces-around-punctuation
Spaces Around Punctuation
MQM: locale-convention · Whitespace · Severity: minor
Locale-aware whitespace check. Forbids `space + .,;:?!` for most European targets; requires a non-breaking space BEFORE `;`/`:`/`?`/`!`/`»` for French targets, per Imprimerie nationale convention.
Source
Are you sure?
Bad target
Êtes-vous sûr?
Good target
Êtes-vous sûr ?
French typography is precise about non-breaking spaces before double-width punctuation; translators using non-French keyboards consistently miss this. The reverse (English text with `?` preceded by space) is a typo signal.
Language coverage: French (required-space-before); English / German / Slavic / Romance / etc. (forbidden-space-before)
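A rough sketch of the two spacing modes. It accepts both the no-break space (U+00A0) and the narrow no-break space (U+202F) before French high punctuation, which is an assumption beyond the text above, and it makes no attempt to skip URLs or times such as `12:30`. `spacing_issues` is a hypothetical name.

```python
import re

# French mode: high punctuation that is NOT preceded by a (narrow) no-break space.
FRENCH_MISSING_NBSP = re.compile(r"(?<![\u00a0\u202f])[;:?!»]")
# Default mode: any kind of space directly before punctuation.
SPACE_BEFORE_PUNCT = re.compile(r"[ \u00a0\u202f][.,;:?!]")


def spacing_issues(target: str, locale: str) -> list[int]:
    """Return the character offsets of spacing violations for the given target locale."""
    pattern = FRENCH_MISSING_NBSP if locale.startswith("fr") else SPACE_BEFORE_PUNCT
    return [match.start() for match in pattern.finditer(target)]


print(spacing_issues("Êtes-vous sûr?", "fr"))
print(spacing_issues("Êtes-vous sûr\u00a0?", "fr"))
```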
script-normalization
Script Normalization
MQM: locale-convention · Quotation marks · Severity: minor
Ukrainian-specific: flags ASCII `'` or curly `'`/`ʼ` apostrophes inside Ukrainian words. The Ukrainian orthography prescribes the right single quotation mark `’` (U+2019). The check ignores apostrophes at word boundaries (those are quotation marks, not Ukrainian apostrophes).
Source
ten o'clock
Bad target
м'ясо
Good target
м’ясо
The 2019 Ukrainian orthography update made the U+2019 apostrophe mandatory in formal text. Publishers, legal documents, and many agencies reject submissions with ASCII apostrophes.
Language coverage: Ukrainian (apostrophe variants); Russian Ё/Е check planned
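The word-internal rule can be sketched with lookarounds: an apostrophe variant counts only when it sits between two Cyrillic letters, so quotation marks at word boundaries are ignored. The function names are hypothetical, and the variant set here is limited to ASCII `'` and U+02BC for illustration.

```python
import re

# ASCII apostrophe or U+02BC between two Cyrillic letters (word-internal only).
WRONG_APOSTROPHE = re.compile(r"(?<=[\u0400-\u04FF])['\u02BC](?=[\u0400-\u04FF])")


def find_bad_apostrophes(text: str) -> list[int]:
    """Return offsets of word-internal apostrophes that are not U+2019."""
    return [match.start() for match in WRONG_APOSTROPHE.finditer(text)]


def normalize_apostrophes(text: str) -> str:
    """Replace flagged apostrophes with U+2019, the form the check expects."""
    return WRONG_APOSTROPHE.sub("\u2019", text)


print(find_bad_apostrophes("м'ясо"))
print(normalize_apostrophes("м'ясо") == "м\u2019ясо")
```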
false-friends
False Friends
MQM: accuracy · Mistranslation · Severity: major
Flags well-known cognate traps: source contains a trigger word (`actually`, `magazine`, `decade`, etc.) AND target contains the wrong cognate (`актуально`, `магазин`, `декада`). Curated, conservative — only entries where the cognate is unambiguously wrong.
Source
What does the magazine say?
Bad target
Что говорит магазин?
Good target
Что говорит журнал?
False friends are the #1 source of professional embarrassment in EN↔RU/UK translation — and the most common LLM error in well-trained models, since the surface-form similarity is exactly the wrong signal.
Language coverage: Curated lists for en↔uk, en↔ru, uk↔ru (~30 entries)
glossary-adherence
Glossary Adherence
MQM: terminology · Inconsistent with termbase · Severity: major
When the client supplies a glossary (CSV, TBX, or pasted `source target` lines), this check fires for every segment where source contains a glossary source term but target is missing the prescribed translation. Matching is surface-form and whole-word; for Slavic targets it is lemma-based via pymorphy3.
Source
Open the settings menu.
Bad target
Відкрийте меню налаштувань.
Good target
Відкрийте меню параметрів.
Glossary adherence is THE single most-rejected QA criterion at agencies. Clients ship curated termbases for a reason — ignoring them invalidates the entire delivery and triggers rework.
Language coverage: Any language pair (depends on glossary content)
mixed-script-in-target
Mixed Script in Target
MQM: locale-convention · Grammar · Severity: info
Flags Latin-script runs of two or more letters embedded in a Cyrillic-script target. The reviewer decides whether to transliterate (`Michael` → `Майкл`) or preserve verbatim (brand names, scientific symbols). Surfaces the inconsistency so the same convention can be applied document-wide.
Source
c/o Dr. Michael Zlatkin, MHA GmbH
Bad target
через д-ра Michael Zlatkin, MHA ТОВ
Good target
через д-ра Майкла Златкіна, ТОВ «МНА»
Mixed-script targets are a frequent invisible defect — translators transliterate some proper nouns but preserve others, producing documents that read inconsistently and look unprofessional. INFO severity because preservation is sometimes correct.
Language coverage: Cyrillic targets: uk, ru, be, bg, sr, mk
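A minimal sketch of the Latin-run detection, assuming basic ASCII letters are a sufficient definition of "Latin" and that a target counts as Cyrillic when it contains at least one Cyrillic letter. `latin_runs` is a hypothetical name.

```python
import re

LATIN_RUN = re.compile(r"[A-Za-z]{2,}")   # two or more consecutive ASCII Latin letters
CYRILLIC = re.compile(r"[\u0400-\u04FF]")  # any character in the basic Cyrillic block


def latin_runs(target: str) -> list[str]:
    """Return Latin-script runs found in a target that also contains Cyrillic text."""
    if not CYRILLIC.search(target):
        return []  # not a Cyrillic-script target; the check does not apply
    return LATIN_RUN.findall(target)


print(latin_runs("через д-ра Michael Zlatkin, MHA ТОВ"))
```

The function only surfaces the runs; whether to transliterate or preserve each one remains the reviewer's decision.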
ocr-placeholder-in-source
OCR Placeholder in Source
MQM: other · Mistranslation · Severity: info
Flags source segments that contain OCR-placeholder glyphs (`■`, `□`, `▪`, `�`, or runs of `^^^`). These mark characters the OCR engine could not recognise from a scanned PDF. Any numbers or words near them in the translation may be LLM reconstructions, not real translations — and must be cross-checked against the original scan.
Source
Register A no. ■■2/2025 issued on ^^^.07.2025
Bad target
Register A no. 2553/2025 issued on 15.07.2025
Good target
Register A no. ■■2/2025 issued on ?.07.2025 — verify against scan
OCR garbage upstream is invisible to most QA tools and silently produces hallucinated numbers and reconstructed dates in the translation. Surfacing the source-side OCR damage lets a human verify the original document before the translation reaches the client.
Language coverage: All languages
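Detecting the placeholder glyphs is a plain pattern search on the source side. The sketch assumes that a "run of `^^^`" means three or more carets; `ocr_damage` is a hypothetical name.

```python
import re

# Runs of OCR placeholder glyphs (■ □ ▪ and U+FFFD), or three or more carets in a row.
OCR_GLYPHS = re.compile(r"[■□▪\ufffd]+|\^{3,}")


def ocr_damage(source: str) -> list[str]:
    """Return every run of OCR placeholder glyphs found in the source segment."""
    return OCR_GLYPHS.findall(source)


print(ocr_damage("Register A no. ■■2/2025 issued on ^^^.07.2025"))
```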
run-join
Run-Join (Concatenated Tokens)
MQM: design · Markup · Severity: major
Flags target tokens where two words appear to be concatenated without a separating space (`спортуВідділення`, `UkraineDénomination`, `25981Service`). These are usually DOCX run-join defects from paragraph extraction or LLM post-processing artefacts — not legitimate compound words.
Source
Sports Department of the Magistrate
Bad target
спортуВідділення магістрату
Good target
Відділення спорту магістрату
Run-join defects break reading and are nearly invisible during casual proofreading because the eye smooths over them. Existing QA tools rarely surface this class because most spell-checkers either accept the concatenation (LT-style fuzzy) or bury it among hundreds of unrelated spelling flags.
Language coverage: All languages with Latin or Cyrillic script
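One simple heuristic for this class of defect is to look for a lowercase-to-uppercase or digit-to-letter transition inside a token. This is an assumption about how such a check might work, not a description of the validator's logic, and it would also flag legitimate forms such as `iPhone` or `3D` unless an allow-list is added. The function names are hypothetical.

```python
import re


def looks_joined(token: str) -> bool:
    """True when the token has a lower->upper or digit->letter transition inside it."""
    for previous, current in zip(token, token[1:]):
        if previous.islower() and current.isupper():
            return True
        if previous.isdigit() and current.isalpha():
            return True
    return False


def run_joins(target: str) -> list[str]:
    """Return tokens in the target that look like two words joined without a space."""
    return [token for token in re.findall(r"\w+", target) if looks_joined(token)]


print(run_joins("спортуВідділення магістрату"))
```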
forbidden-terms
Forbidden Terms (Do-Not-Use List)
MQM: terminology · Inconsistent with termbase · Severity: major
Inverse glossary — when the client supplies a do-not-use term list, every appearance of those terms in the target fires a finding. Use for competitor brand mentions, archaic / deprecated terminology, regulator-banned phrasing, banned spellings of the client brand.
Source
Schedule a Bing search demo for the client.
Bad target
Заплануй демо пошуку Bing для клієнта.
Good target
Заплануй демо пошуку для клієнта.
Forbidden-term lists are how regulated industries (medical, pharma, financial) and brand-strict clients enforce compliance. Missing one violates the contract and triggers rework. Whole-word, Unicode-aware matching, optional case-sensitivity.
Language coverage: All languages
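Whole-word, Unicode-aware matching with optional case sensitivity can be sketched with lookarounds on `\w`, which in Python 3 covers letters of any script. `forbidden_hits` and its parameters are hypothetical names.

```python
import re


def forbidden_hits(target: str, terms: list[str], case_sensitive: bool = False) -> list[str]:
    """Return the forbidden terms that occur in the target as whole words."""
    flags = 0 if case_sensitive else re.IGNORECASE
    hits = []
    for term in terms:
        # No word character may directly precede or follow the term.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, target, flags):
            hits.append(term)
    return hits


print(forbidden_hits("Заплануй демо пошуку Bing для клієнта.", ["Bing"]))
```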
date-format
Date Format (Locale)
MQM: locale-convention · Date format · Severity: minor
Flags target dates that use the wrong format for the target locale — `12/31/2025` (MDY-slash) preserved in a German translation where `31.12.2025` (DMY-dot) is expected; `31.12.2025` left in an American English translation that should read `12/31/2025`.
Source
Issued on 12/31/2025.
Bad target
Ausgestellt am 12/31/2025.
Good target
Ausgestellt am 31.12.2025.
Date-format drift is the #1 invisible defect in legal and medical translations — the reviewer reads the number, not the convention, and lets the wrong format ship. Trados / Xbench catch this for the biggest locales; we cover the long tail (Slavic, Nordic, Baltic, East Asian).
Language coverage: 50+ locales (DMY-dot, DMY-slash, MDY-slash, YMD-dash, YMD-dot conventions)
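A reduced sketch covering only two of the conventions named above. The `EXPECTED` mapping is a hypothetical two-locale sample, and dates such as `01/02/2025` that are valid under more than one slash convention are treated as MDY here, which a fuller check would need to disambiguate.

```python
import re

DATE_PATTERNS = {
    "MDY-slash": re.compile(r"\b(?:0?[1-9]|1[0-2])/(?:0?[1-9]|[12]\d|3[01])/\d{4}\b"),
    "DMY-dot": re.compile(r"\b(?:0?[1-9]|[12]\d|3[01])\.(?:0?[1-9]|1[0-2])\.\d{4}\b"),
}

# Hypothetical sample of expected conventions per target locale (assumption).
EXPECTED = {"de": "DMY-dot", "en-US": "MDY-slash"}


def wrong_format_dates(target: str, locale: str) -> list[str]:
    """Return dates in the target written in a convention other than the locale's own."""
    expected = EXPECTED.get(locale)
    if expected is None:
        return []  # locale not covered by this sketch
    found = []
    for convention, pattern in DATE_PATTERNS.items():
        if convention != expected:
            found.extend(match.group(0) for match in pattern.finditer(target))
    return found


print(wrong_format_dates("Ausgestellt am 12/31/2025.", "de"))
```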
currency-format
Currency Format (Symbol Position + Spacing)
MQM: locale-convention · Currency format · Severity: minor
Flags currency strings that use the wrong symbol position or spacing for the target locale — `$5.00` (prefix, no space) preserved in a German translation where `5,00 €` (suffix, NBSP, comma decimal) is expected. Covers prefix-locale targets too.
Source
Total: $5.00.
Bad target
Gesamt: $5.00.
Good target
Gesamt: 5,00 €.
Currency convention is one of the first things a native reader notices when something feels off. Most QA tools only check decimal/group separators; we also check symbol position and NBSP requirements per locale.
Language coverage: 35+ locales (prefix/suffix rules, NBSP requirements)
hunspell-spell-check
Hunspell Spell Check (Offline)
MQM: fluency · Spelling · Severity: minor
Offline spell-check using the same Hunspell engine that LibreOffice and Firefox use. Runs in a sibling container so the main backend stays lean. Unlike the LanguageTool API, Hunspell has no rate limits and works without internet.
Source
The patient has tuberculosis.
Bad target
Der Patient hat tuberkilose.
Good target
Der Patient hat Tuberkulose.
Hunspell complements LanguageTool — together they catch more real misspellings, and we can fall back to one when the other doesn't ship a dictionary for a given language. Findings are MINOR by default (dictionary-based checks miss specialised vocabulary on medical / legal / scientific text).
Language coverage: en, uk, ru, de, es (Hunspell dictionaries — more languages coming)
language-sanity
Language Sanity (Wrong-File Detection)
MQM: other · User-defined rule · Severity: info
Compares the language you declared in the language dropdowns to the actual language detected in the segments. If the declared source is `uk` but the detected language is `ru`, you get an INFO finding telling you to double-check the file pair.
Source
Этот текст полностью на русском языке.
Bad target
Цей текст українською.
Good target
(re-upload after picking the correct language)
The most embarrassing kind of QA failure is wrong-file-upload — the reviewer ploughs through dozens of noisy findings before realising the language pair was wrong from the start. We catch this in the first 30 characters of either side.
Language coverage: 75+ languages via Lingua language detection
Run these checks on your translation
Free, browser-based, no signup. 101 languages supported. MQM-compatible JSON export.