TraMOOC is currently present at the 20th Annual Conference of the European Association for Machine Translation (EAMT 2017), held in Prague, Czech Republic, from the 29th to the 31st of May. The project is represented by a poster entitled "TraMOOC - Translation for Massive Open Online Courses: Recent Developments in Machine Translation" and by two presentations, "Comparing language related issues for NMT and PBMT between German and English" and "Is Neural Machine Translation the New State-of-the-Art?".
The TraMOOC poster presentation gives an overview of the project's aims, objectives, progress, and achievements. In TraMOOC, we have developed machine translation (MT) prototypes for 11 target languages, from English into German, Italian, Portuguese, Dutch, Bulgarian, Greek, Polish, Czech, Croatian, Russian, and Chinese. The translation systems are based on phrase-based statistical MT (SMT) and neural MT (NMT). The latter has achieved state-of-the-art performance in recent evaluation campaigns (Bojar, 2016). The Nematus toolkit (Sennrich, 2017) has been used for training; the translation server is based on the amuNMT toolkit (Junczys-Dowmunt et al., 2016). The translation systems have been adapted to MOOC texts by fine-tuning the model parameters on in-domain training data to maximize translation quality in this domain. A comparative human evaluation of phrase-based SMT and NMT has been completed for four language pairs, comparing educational-domain output from both systems using a variety of metrics. These include automatic evaluation, human rankings of adequacy and fluency, error-type markup, and technical and temporal post-editing effort. The results show a preference for NMT in side-by-side ranking for all language pairs, texts, and segment lengths. In addition, perceived fluency is improved and annotated errors are fewer in the NMT output. However, results are mixed for some error categories. Although far fewer segments required post-editing, document-level post-editing performance was not found to have improved significantly with NMT in this study, suggesting that NMT may not show an enormous improvement over SMT when used in a production scenario. Data and a slightly amended quality evaluation methodology have subsequently been prepared for application to all TraMOOC NMT systems later in 2017.
The TraMOOC-related presentation entitled "Comparing language related issues for NMT and PBMT between German and English" presents an extensive comparison of language-related problems for neural machine translation (NMT) and phrase-based machine translation (PBMT) between German and English. The issues explored relate both to the characteristics of the languages and to the machine translation process; although connected to typical translation error classes, they go beyond them. It is shown that the main advantage of the NMT system lies in better handling of verbs, English noun collocations, German compound words, phrase structure, and articles. In addition, it is shown that the main obstacles for the NMT system are prepositions, the translation of ambiguous English (source) words, and the generation of English (target) continuous tenses. Although in total there are fewer issues for the NMT system than for the PBMT system, many of them are complementary: only about one third of the sentences deal with the same issues, and for about 40% of the sentences the issues are completely different. This means that combination or hybridisation of the NMT and PBMT approaches is a promising direction for improving both types of systems.
The TraMOOC-related presentation entitled "Is Neural Machine Translation the New State-of-the-Art?" discusses neural machine translation (NMT), a new paradigm in the MT field, and compares the quality of NMT systems with that of statistical MT by describing three studies that use automatic and human evaluation methods. The automatic evaluation results presented for NMT are very promising; however, the human evaluations show mixed results. The authors report increases in fluency but inconsistent results for adequacy and post-editing effort. NMT undoubtedly represents a step forward for the MT field, but one that the community should be careful not to oversell.