A lightweight deep learning architecture for text embedding: Comparison between the usage of Transformers and Mixers for textual embedding

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Author: Corentin Royer; [2023]

Keywords: Deep Learning; Entity Retrieval; Mixer; Transformer

Abstract: Text embedding is a widely used method for comparing pieces of text by mapping them to a compact vector space. One such application is deduplication, which consists of finding textual records that refer to the same underlying idea in order to merge them or delete one of them. The current state of the art in this domain uses the Transformer architecture trained on a large corpus of text. In this work, we evaluate the performance of a recently proposed architecture: the Mixer. It offers two key advantages: its parameter count scales linearly with the context window, and it is built from simple MLP blocks that benefit from hardware acceleration. We found a 26% increase in performance when using the Mixer compared to a Transformer of similar size.
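To make the architectural contrast concrete, below is a minimal sketch of an MLP-Mixer block applied to token embeddings, written in PyTorch. It is not the thesis's implementation: the layer sizes (`seq_len`, `dim`, hidden widths) and the mean-pooling readout are illustrative assumptions, but the structure shows why the parameter count of the token-mixing MLP grows linearly with the context window and why the block reduces to plain matrix multiplications.

```python
# Minimal sketch of an MLP-Mixer block for text embedding (assumed sizes, not the thesis's).
import torch
import torch.nn as nn

class MixerBlock(nn.Module):
    def __init__(self, seq_len: int, dim: int, token_hidden: int, channel_hidden: int):
        super().__init__()
        # Token-mixing MLP: acts across the sequence axis, so its weight count
        # grows linearly with the context window (seq_len).
        self.token_norm = nn.LayerNorm(dim)
        self.token_mlp = nn.Sequential(
            nn.Linear(seq_len, token_hidden), nn.GELU(), nn.Linear(token_hidden, seq_len)
        )
        # Channel-mixing MLP: acts across the embedding dimension of each token.
        self.channel_norm = nn.LayerNorm(dim)
        self.channel_mlp = nn.Sequential(
            nn.Linear(dim, channel_hidden), nn.GELU(), nn.Linear(channel_hidden, dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        y = self.token_norm(x).transpose(1, 2)           # (batch, dim, seq_len)
        x = x + self.token_mlp(y).transpose(1, 2)        # mix information across tokens
        x = x + self.channel_mlp(self.channel_norm(x))   # mix information within each token
        return x

# Illustrative usage: pool the mixed token representations into one text vector,
# which could then be compared (e.g. by cosine similarity) for deduplication.
block = MixerBlock(seq_len=128, dim=256, token_hidden=256, channel_hidden=1024)
tokens = torch.randn(4, 128, 256)                        # stand-in for embedded tokens
text_vectors = block(tokens).mean(dim=1)                 # (4, 256) compact text embeddings
```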
