Re-ranking search results with KB-BERT

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Author: Bjarki Viðar Kristjánsson; [2022]

Keywords: Natural language processing; Information retrieval; BERT; KB-BERT; Search evaluation

Abstract: This master's thesis investigates whether a Swedish BERT model can improve a BM25 search by re-ranking the top search results. We compared a standard BM25 search algorithm with a two-stage algorithm: a BM25 search followed by re-ranking of the top 10 results with a BERT model. The BERT model used is KB-BERT, a publicly available neural network model built by the National Library of Sweden. We fine-tuned this model for the specific task of evaluating the relevance of search results. To compare the algorithms, we automatically generated a new Swedish search evaluation dataset from Wikipedia text. The dataset is a standalone product and can be used to evaluate other search algorithms on Swedish text in the future. In the comparison, the BERT re-ranking algorithm produced a slightly better ranking than BM25 alone. These results align with similar studies using an English BERT and an English search evaluation dataset.
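
The two-stage pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The whitespace tokenizer, the BM25 parameters (k1 = 1.5, b = 0.75), the particular IDF variant, and the names `bm25_scores` and `search_and_rerank` are all assumptions made for illustration. The `relevance` argument is a hypothetical stand-in for the fine-tuned KB-BERT scorer: any function that maps a (query, document) pair to a score can be plugged in.

```python
import math
from collections import Counter
from typing import Callable, List


def tokenize(text: str) -> List[str]:
    # Assumption: naive lowercase whitespace tokenization.
    return text.lower().split()


def bm25_scores(query: str, docs: List[str],
                k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score every document against the query with one common BM25 variant."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for t in tokenized:
        tf = Counter(t)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(t) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def search_and_rerank(query: str, docs: List[str],
                      relevance: Callable[[str, str], float],
                      top_k: int = 10) -> List[int]:
    """Return document indices: BM25 order, with the top_k re-ranked."""
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    head, tail = ranked[:top_k], ranked[top_k:]
    # `relevance` is a hypothetical placeholder for the fine-tuned model.
    # sorted() is stable, so ties keep their BM25 order.
    head = sorted(head, key=lambda i: relevance(query, docs[i]), reverse=True)
    return head + tail
```

Only the top `top_k` candidates are passed to the second-stage scorer, so the comparatively expensive model runs on a handful of documents per query while BM25 handles the full collection.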

