MQM History – MQM (Multidimensional Quality Metrics)

MQM was created as part of the European Union-funded QTLaunchPad and QT21 projects under the direction of the German Research Center for Artificial Intelligence (DFKI). Its structure was based on the LISA QA Model, which was widely implemented in the translation industry, but MQM takes a more modular approach. Researchers at DFKI examined a wide variety of translation evaluation metrics used in the language industry (such as the LISA QA Model, SAE J2450, and the ATA Framework for Error Marking) and various commercial and open-source tools (such as Trados Studio, ApSIC XBench, Okapi CheckMate, and Yamagata QA Distiller) to determine which error types these approaches covered. The goal was to create a system for designing custom metrics based on a comprehensive, shared typology of error types.

Later, TAUS and DFKI, working with LTAC Global, created DQF:MQM, a harmonized error typology based on the first versions of MQM and the TAUS Dynamic Quality Framework (DQF). The DQF error typology became a subset of MQM. DQF:MQM has since been superseded by MQM-Core, which updates and replaces it.

Currently, MQM is widely used in translation tools and research and implemented in quality processes. It is maintained by the MQM Council, which is working to ensure that various standards are consistent with MQM.

The original MQM-Full error typology from the QTLaunchPad project can be accessed on the Wayback Machine.

Depending on the implementation context, implementers might need more or less granularity than is present in the MQM-Core typology. If so, they can adapt the error typology to the level of granularity needed. A leaner set of error types might “roll up” the subtypes for a given dimension and mark all errors related to those subtypes simply under that high-level dimension. Or, if an increased level of granularity is desired, implementers can turn to MQM-Full (the full error repository linked to above) to add the error subtypes that suit their purposes. For instance, instead of identifying all grammatical errors under the error type “grammar”, it could be desirable to identify Gender, Word form, Part-of-speech, Verb tense, and Agreement. When adapting the error typology in either direction, it is important to respect the hierarchical structure of MQM so that results remain comparable across implementations.
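The roll-up described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PARENT` table is a small hand-written fragment (the grammar subtypes are placed directly under “Grammar”, and “Grammar” under a “Linguistic conventions” dimension, purely for the example), and the names `ancestors` and `roll_up` are hypothetical, not part of any MQM tool or specification.

```python
from collections import Counter

# Illustrative fragment of a typology: child -> parent.
# None marks a top-level dimension. This table is an assumption made
# for the example, not the official MQM hierarchy.
PARENT = {
    "Linguistic conventions": None,
    "Grammar": "Linguistic conventions",
    "Gender": "Grammar",
    "Word form": "Grammar",
    "Part-of-speech": "Grammar",
    "Verb tense": "Grammar",
    "Agreement": "Grammar",
}


def ancestors(error_type):
    """Return the path from error_type up to its dimension, inclusive."""
    path = [error_type]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


def roll_up(annotations, keep):
    """Count each annotation under its nearest ancestor found in `keep`."""
    counts = Counter()
    for error_type in annotations:
        for node in ancestors(error_type):
            if node in keep:
                counts[node] += 1
                break
        else:
            # No ancestor is in the leaner typology, so the annotation
            # cannot be mapped without breaking the hierarchy.
            raise ValueError(f"{error_type!r} has no ancestor in {sorted(keep)}")
    return counts


# Annotations made with a fine-grained metric.
fine_grained = ["Gender", "Verb tense", "Agreement", "Gender", "Grammar"]

# The same annotations reported at two coarser levels.
by_error_type = roll_up(fine_grained, keep={"Grammar"})
by_dimension = roll_up(fine_grained, keep={"Linguistic conventions"})
```

Because every subtype is mapped to one of its own ancestors, results annotated at different levels of granularity can still be compared at whatever level they share, which is the practical reason for respecting the hierarchy.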

The “Custom” dimension can be used for values not covered by MQM-Full and its subtypes. For instance, implementers who want to annotate extra-segmental categories such as Cohesion and Coherence explicitly could annotate them here, with the understanding that these aspects can be orthogonal to the standard annotation.