Exploring toxic lexicon similarity methods with the DRG framework on the toxic style transfer task

Detta är en Master-uppsats från KTH/Skolan för elektroteknik och datavetenskap (EECS)

Sammanfattning: The topic of this thesis is the detoxification of language in social networks with a particular focus on style transfer techniques that combine deep learning and linguistic resources. In today’s digital landscape, social networks are rife with communication that can often be toxic, either intentionally or unintentionally. Given the pervasiveness of social media and the potential for toxic language to perpetuate negativity and polarization, this study addresses the problem of toxic language and its transformation into more neutral expressions. The importance of this issue is underscored by the need to promote non-toxic communication in the social networks that are an integral part of modern society. The complexity of natural language and the subtleties of what constitutes toxicity make this a challenging problem worthy of study. To address this problem, this research proposes two models, LexiconGST and MultiLexiconGST, developed based on the Delete&Generate framework. These models integrate linguistic resources into the detoxification system to guide deep learning techniques. Experimental results show that the proposed models perform commendably in the detoxification task compared to stateof-the-art methods. The integration of linguistic resources with deep learning techniques is confirmed to improve the performance of detoxification systems. Finally, this research has implications for social media platforms and online communities, which can now implement more effective moderation tools to promote non-toxic communication. It also opens lines of further research to generalize our proposed method to other text styles.

  HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)