This dashboard tracks inter-annotator agreement for the TQE-2026-1 evaluation project. A team of trained evaluators is working toward a shared agreement threshold by annotating the same generative-AI translation quality evaluation projects in Label Studio, using a customized MQM typology. After each project, agreement is measured at five levels: span detection, span boundaries, error category, subcategory, and impact rating; the findings are used to calibrate the team for the next training round. Reports are generated from a Jupyter notebook and linked below. The goal is consistent, deployable annotations that can be used to train evaluators, give feedback to translators, and fine-tune LLM-based quality evaluation systems.
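As a minimal sketch of how categorical agreement at one of these levels might be computed, the snippet below calculates Cohen's kappa for two annotators' error-category labels over the same spans. The metric, label values, and annotator data here are illustrative assumptions, not the project's actual notebook code or taxonomy.

```python
# Sketch: pairwise agreement at the error-category level via Cohen's kappa.
# Labels and data are hypothetical; real reports come from the project notebook.
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items where both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement: chance agreement from each annotator's label distribution.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        (counts_a[label] / n) * (counts_b[label] / n)
        for label in set(labels_a) | set(labels_b)
    )
    return (observed - expected) / (1 - expected)

# Hypothetical error-category labels for the same ten spans
ann1 = ["accuracy", "terminology", "accuracy", "style", "accuracy",
        "terminology", "style", "accuracy", "terminology", "accuracy"]
ann2 = ["accuracy", "terminology", "style", "style", "accuracy",
        "accuracy", "style", "accuracy", "terminology", "accuracy"]

print(round(cohens_kappa(ann1, ann2), 3))  # → 0.683
```

Span detection and boundary agreement would need an overlap-based measure rather than item-level kappa, since annotators may not mark the same spans at all.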
Linguistic Assets
Specifications
Overall Specifications
Domain: Generative AI
Language families and variants: Largely Latin American Spanish, U.S. English
Audience (broadly): Spanish and English speakers in Latin America (including the United States)
Correspondence: Whole document adaptation
Glossary & Error Examples
GAI Glossary
GAI terminology, including regional terminology conflicts that must be navigated in translation
This project evaluates a section of a translation of a Spanish-language academic paper on algorithmic governmentality. The source text covers the technical foundations of large language models and contextualizes how AI is being integrated into digital ecosystems across industries.
Content complexities: The paper requires fluency in two specialized registers simultaneously, critical social theory and technical AI vocabulary, each with established terminology in English-language scholarship. The English text shows some syntactic awkwardness that raises questions both about whether machine translation was used and about the accuracy of the terminology choices.
This project evaluates a translated section of the WIPO Patent Landscape Report on Generative AI, which documents the rapid growth of patent activity following the launch of ChatGPT, covering technological drivers and the expanding range of real-world applications.
Note: A target text was produced internally for the purposes of this project.
Content complexities: High density of technical acronyms (GANs, VAEs, LLMs, MLLMs); proper nouns requiring no translation (model names, company names, product names); ranked list structures with enumerated claims; genre-mixing of legal/IP register with technology journalism register
This text is legal commentary from a Mexican law firm examining how the Federal Law on Copyright (LFDA) fails to address AI-assisted authorship, with two proposed institutional reforms. We evaluated an adapted version of the original Spanish.
Note: We did not use the English translation provided within a PDF version of this content. Instead, a target text was produced internally for the purposes of this project.
Subdomains: Intellectual property, Mexican law, AI and copyright
Text type: Legal commentary
Purpose: Informative
Author perspective: Legal professional
Audience: Government, Policy makers
Content complexities: The text blends legal, academic, and essayistic registers that must remain distinguishable throughout. Rhetorical question sequences function as cumulative discursive units rather than independent sentences. The Dark Forest Theory and Turing Test/TTR are introduced as analogies and then recalled as structural through-lines, requiring consistent terminology across distant passages to preserve their rhetorical weight.