Publication

Neural Machine Translation for English–Kazakh with Morphological Segmentation and Synthetic Data

Toral Ruiz, A., Edman, L., Spenader, J. & Yeshmagambetova, G., 1-Aug-2019, Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1). Florence, Italy: Association for Computational Linguistics (ACL), Vol. 2, p. 386-392 (7 p.).

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic; peer-review

This paper presents the systems submitted by the University of Groningen to the English-Kazakh language pair (both translation directions) for the WMT 2019 news translation task. We explore the potential benefits of (i) morphological segmentation (both unsupervised and rule-based), given the agglutinative nature of Kazakh, (ii) data from two additional languages (Turkish and Russian), given the scarcity of English-Kazakh data and (iii) synthetic data, both for the source and for the target language. Our best submissions ranked second for Kazakh-English and third for English-Kazakh in terms of the BLEU automatic evaluation metric.
Original language: English
Title of host publication: Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
Place of publication: Florence, Italy
Publisher: Association for Computational Linguistics (ACL)
Pages: 386-392
Number of pages: 7
Volume: 2
Publication status: Published - 1-Aug-2019

ID: 95754758