Kontext MT

  • Contact:

    Sai Arjun Koneru, Jan Niehues

  • Funding:

    Industry project

  • Partner:

    SAP

  • Start date:

    01.03.2022

  • End date:

    28.02.2025

Although existing Machine Translation (MT) systems have achieved impressive performance on many language pairs, several challenges remain unsolved. Domain mismatch between training and test data, limited amounts of labelled in-domain data, specialised terminology, and the translation of conversational content are a few examples of open problems in current MT research. In the SAP-KIT “Kontext-MT” project, we aim to address some of these problems in order to improve software localisation for SAP products.

The core idea of the project is to use additional contextual information to improve Neural MT (NMT) models. The majority of current MT systems rely only on the source sentence to generate a target translation. Such systems ignore information that is necessary to produce an accurate translation. For example, consider the word “driver” as user-interface text that we need to translate for a taxi application (app) and for a software app. In the taxi app, the translation should refer to the person who is driving the vehicle. In the software app, it should refer to a software driver and not to a person. The source sentence alone does not tell us which translation is correct, so additional context is needed.
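One simple way to picture the “driver” example is to attach a context label to the source text before it reaches the translation model. The following is only a minimal sketch under stated assumptions: the `UIString` type, the `app_domain` field, and the `<domain:...>` and `<sep>` markers are hypothetical and chosen for illustration, and the project description does not say how context is actually fed to the NMT models.

```python
from dataclasses import dataclass


@dataclass
class UIString:
    """A user-interface string together with a hypothetical context label."""
    text: str
    app_domain: str  # assumed label, e.g. "taxi" or "software"


def build_model_input(item: UIString, sep: str = " <sep> ") -> str:
    """Prefix the source text with its context label.

    The tag format is an illustrative assumption, not a documented
    input format of any particular MT system.
    """
    return f"<domain:{item.app_domain}>{sep}{item.text}"


# The same source word yields two different model inputs,
# so a context-aware model has a signal to disambiguate on.
taxi_input = build_model_input(UIString("driver", "taxi"))
software_input = build_model_input(UIString("driver", "software"))
```

With the source sentence alone, both inputs would be the identical string “driver”. With the label attached, they differ, which is the minimum a model needs in order to choose between the two senses.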

Furthermore, we will also use other sources of contextual information to improve overall quality. We need to maintain consistency (the same translation for source sentences that mean the same thing), translate in-domain special terms (using information from dictionaries or additional resources) and respect length restrictions (using screenshots so that the generated translation fits the text box while remaining of high quality). By using the kinds of context described above, we hope to improve current MT systems and enable high-quality translations for SAP products.
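The three requirements above (consistency, terminology, length) can be read as checks on a candidate translation. The sketch below is illustrative only and rests on several simplifying assumptions: consistency is approximated by exact match on the source string rather than by meaning, the glossary is a plain dictionary, and the text box limit is given as a character count rather than derived from a screenshot. The function name, its parameters, and the German example strings are hypothetical.

```python
from typing import Dict, List


def check_translation(
    source: str,
    translation: str,
    memory: Dict[str, str],    # earlier source -> translation pairs (assumed)
    glossary: Dict[str, str],  # source term -> required target term (assumed)
    max_chars: int,            # stand-in for the text box size
) -> List[str]:
    """Return a list of constraint violations for one candidate translation."""
    issues: List[str] = []

    # Consistency: an identical source string seen before should be
    # translated the same way (exact match is a simplification).
    previous = memory.get(source)
    if previous is not None and previous != translation:
        issues.append("inconsistent with earlier translation")

    # Terminology: if a glossary term occurs in the source, the required
    # target term should occur in the translation.
    for src_term, tgt_term in glossary.items():
        if src_term.lower() in source.lower() and tgt_term.lower() not in translation.lower():
            issues.append(f"missing term: {tgt_term}")

    # Length: the translation must fit the available space.
    if len(translation) > max_chars:
        issues.append("too long")

    return issues


glossary = {"driver": "Treiber"}
ok = check_translation("Install driver", "Treiber installieren", {}, glossary, 25)
bad = check_translation("Install driver", "Fahrer installieren", {}, glossary, 10)
```

Here `ok` is an empty list, while `bad` reports a missing glossary term and a length violation. A check like this only detects violations after the fact; the project aims to have the MT system take such context into account when generating the translation.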