Language Resources and Tools for the Automatic Analysis of Gender and Hate Speech in Italian: A Contribution of the Italian Computational Linguistics Community.

Caselli, T., Novielli, N., Patti, V. & Rosso, P., 2018.

Research output: Contribution to conference, Abstract, Academic

The Italian Computational Linguistics (CL) community is involved in the development of automatic tools for detecting and monitoring online hate speech and gender-related issues. This interest is reflected in the creation of annotated resources and the organisation of evaluation exercises for different languages. In particular, the EVALITA 2018 campaign, a biennial event of the Associazione Italiana di Linguistica Computazionale (AILC) for the evaluation of automatic systems for Italian, features three tasks with a specific focus on these aspects:

● GXG: Cross-Genre Gender Prediction (F. Dell'Orletta, M. Nissim) - This task investigates the (presumed) differences in writing based on gender by considering different textual genres. The cross-genre aspect plays a central role in investigating gender issues, because it looks for features that go beyond well-known lexical stereotypes. This is the first time the task has been organised for Italian.
● HaSpeeDe: Hate Speech Detection (C. Bosco, F. Dell'Orletta, M. Sanguinetti, F. Poletto, M. Stranisci, M. Tesconi) - Social media messages (Facebook posts and tweets) are at the core of this task, which aims at identifying whether a message carries hateful content or not. The challenging aspect is the cross-platform setting, which makes it possible to investigate whether hateful content is conveyed in different ways depending on the medium used.
● AMI: Automatic Misogyny Identification (M. Anzovino, E. Fersini, P. Rosso) - Misogyny is a special case of hate speech. AMI targets misogyny in tweets for the first time. The task is not limited to a binary classification, i.e., whether a message has misogynistic content or not; it also challenges systems to classify both the misogynistic behaviour and its target. The misogynistic behaviour is divided into five classes: stereotype and objectification; dominance; derailing; sexual harassment and threats of violence; discredit. The target is either active, if the offensive message refers to a specific target, or passive, if it is addressed to many potential receivers. The task allows for cross-lingual comparison with English and Spanish (the same task has been organised in the context of IberEval 2018).
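
As an illustration only, the AMI label scheme described above (a binary decision, then one of five behaviour classes and an active or passive target) could be represented as follows. This is a minimal sketch, not the task's official data format: the names AmiLabel, Behaviour and Target are hypothetical, and the rule that only misogynistic messages carry a behaviour class and a target is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Behaviour(Enum):
    """The five behaviour classes named in the AMI task description."""
    STEREOTYPE_OBJECTIFICATION = "stereotype and objectification"
    DOMINANCE = "dominance"
    DERAILING = "derailing"
    SEXUAL_HARASSMENT_THREATS = "sexual harassment and threats of violence"
    DISCREDIT = "discredit"


class Target(Enum):
    ACTIVE = "active"    # the message refers to a specific target
    PASSIVE = "passive"  # the message is addressed to many potential receivers


@dataclass(frozen=True)
class AmiLabel:
    """Hypothetical container for one annotated message."""
    misogynous: bool
    behaviour: Optional[Behaviour] = None
    target: Optional[Target] = None

    def __post_init__(self) -> None:
        # Assumption: the fine-grained labels exist only for misogynistic messages.
        has_both = self.behaviour is not None and self.target is not None
        has_none = self.behaviour is None and self.target is None
        if self.misogynous and not has_both:
            raise ValueError("misogynistic messages need a behaviour and a target")
        if not self.misogynous and not has_none:
            raise ValueError("non-misogynistic messages carry no behaviour or target")


# Usage: one misogynistic message aimed at a specific person, one neutral message.
example = AmiLabel(True, Behaviour.DOMINANCE, Target.ACTIVE)
neutral = AmiLabel(False)
```

Keeping the binary decision separate from the two fine-grained fields mirrors the description of the task as going beyond binary classification.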

The EVALITA 2018 datasets are already freely available. They can be re-used to find empirical evidence for theoretical frameworks, or enriched with new information (multi-layered annotations).

In our contribution, we will illustrate these tasks, with the aim of providing a setting for discussion on how to integrate them into a more theoretical linguistic context on the one hand, and into applications on the other. We will also present actual results on the systems' performance, and place the EVALITA tasks in the international context by relating them to a new shared task that will be organised at SemEval 2019 on Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter (HatEval, C. Bosco, E. Fersini, V. Patti, P. Rosso).

As organisers and chairs of EVALITA 2018, and as representatives of the Italian CL community, we consider continuous interaction and collaboration with related communities and institutions to be crucial, and we are looking forward to being part of this event.

Original language: English
Publication status: Published - 2018
Event: LInguaggio, parità di Genere e parole d'odio / Language, Gender and HaTe Speech - University of Venice, Venice, Italy
Duration: 18-Oct-2018 to 19-Oct-2018


Conference: LInguaggio, parità di Genere e parole d'odio / Language, Gender and HaTe Speech
Abbreviated title: LIGHTS 2018

ID: 112792667