Analyzing Error Types in English-Czech Machine Translation

Bojar, Ondřej

oai:biblio.ufal.mff.cuni.cz:4111830274767570700

Authors: Ondřej Bojar
Publication date: 1 January 2011

Abstract

This paper examines two techniques of manual evaluation that can be used to identify error types of individual machine translation systems. The first technique, “blind post-editing”, has been used in WMT evaluation campaigns since 2009, and manually constructed data of this type are available for various language pairs. The second technique, explicit marking of errors, has also been used in the past. We propose a method for interpreting blind post-editing data at a finer level and compare the results with explicit marking of errors. While the human annotation in either technique is not exactly reproducible (relatively low agreement), both techniques lead to similar observations about the differences between the systems. Specifically, we are able to suggest which errors in MT output are easy and which are hard to correct with no access to the source, a situation experienced by users who do not understand the source language.

info:eu-repo/semantics/article

Last time updated on 12/11/2016

This paper was published in Biblio at Institute of Formal and Applied Linguistics.
