Publication

MoNoise: A Multi-lingual and Easy-to-use Lexical Normalization Tool

van der Goot, R., 28-Jul-2019.

Research output: Contribution to conference › Paper › Academic

Documents

  • MoNoise: A Multi-lingual and Easy-to-use Lexical Normalization Tool

    Final publisher's version, 194 KB, PDF-document

  • Rob van der Goot
In this paper, we introduce and demonstrate the online demo as well as the command line interface of a lexical normalization system (MoNoise) for a variety of languages. We further improve this model by using features from the original word for every normalization candidate. For comparison with future work, we propose the bundling of seven datasets in six languages to form a new benchmark, together with a novel evaluation metric which is particularly suitable for cross-dataset comparisons. MoNoise reaches new state-of-the-art performance on six out of seven of these datasets. Furthermore, we allow the user to tune the ‘aggressiveness’ of the normalization, and show how the model can be made more efficient with only a small loss in performance. The online demo can be found at http://www.robvandergoot.com/monoise and the corresponding code at https://bitbucket.org/robvanderg/monoise/
Original language: English
Publication status: Published - 28-Jul-2019
Event: 57th Annual Meeting of the Association for Computational Linguistics (ACL) - Florence, Italy
Duration: 28-Jul-2019 to 2-Aug-2019

Conference

Conference: 57th Annual Meeting of the Association for Computational Linguistics (ACL)
Country: Italy
City: Florence
Period: 28/07/2019 to 02/08/2019

ID: 86484010