Lucia Specia, Marco Turchi, Nicola Cancedda, Marc Dymetman, Nello Cristianini
Proceedings of the 13th Annual Conference of the EAMT, pages 28-35, Barcelona, May 2009.
We investigate the problem of predicting the quality of sentences produced by machine translation systems when reference translations are not available. We address the problem as a regression task and propose a method that takes into account the contribution of different features. We experiment with this method on translations produced by various MT systems for different language pairs, annotated with quality scores both automatically and manually. Results show that our method yields good estimates and that identifying a reduced set of relevant features plays an important role. The experiments also highlight a number of outstanding features that were consistently selected as the most relevant and that could be used in different ways to improve MT performance or to enhance MT evaluation.
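
As a rough illustration of the general idea (quality estimation as regression over sentence-level features, with a reduced set of relevant features), the sketch below ranks features by their absolute Pearson correlation with the quality scores and fits a one-variable least-squares line on the top-ranked feature. This is a simple stand-in technique, not the method proposed in the paper; the feature names (`source_length`, `lm_score`, `noise`) and all values are hypothetical toy data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(vx * vy)

def rank_features(rows, scores):
    """Return feature names sorted by |correlation| with the scores."""
    names = list(rows[0].keys())
    corr = {name: pearson([r[name] for r in rows], scores) for name in names}
    return sorted(names, key=lambda name: abs(corr[name]), reverse=True)

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx

# Hypothetical toy data: one row of features per translated sentence.
rows = [
    {"source_length": 10, "lm_score": 1, "noise": 3},
    {"source_length": 12, "lm_score": 2, "noise": 1},
    {"source_length": 9, "lm_score": 3, "noise": 4},
    {"source_length": 15, "lm_score": 4, "noise": 1},
    {"source_length": 11, "lm_score": 5, "noise": 5},
]
# Hypothetical quality scores (e.g. human or automatic annotations).
scores = [1.5, 2.0, 2.5, 3.0, 3.5]

ranking = rank_features(rows, scores)
best = ranking[0]
slope, intercept = fit_line([r[best] for r in rows], scores)

def predict(row):
    """Estimate the quality of a new sentence from its best feature only."""
    return slope * row[best] + intercept
```

In this toy setup the scores are constructed to depend only on `lm_score`, so that feature is ranked first; with real data, several features would normally be kept and combined in a multivariate regressor.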