Automatic thesaurus construction with latent semantic indexing

This is a Master's thesis from Högskolan i Borås/Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan

Abstract: This thesis examines how thesauri constructed with latent semantic indexing (LSI) perform when used for query expansion. Synonymy is a well-known problem in information retrieval, and one solution is to use a thesaurus. In this thesis, thesauri are created automatically in order to find statistically related words, not only synonyms. LSI is a method that uses singular value decomposition (SVD) to reduce the dimensions of a matrix and find latent relationships between words. We constructed nine thesauri and used them for query expansion in a Swedish database, GP_HDINF. Precision and recall were used to evaluate the performance of the thesauri. The results did not show improved retrieval effectiveness when the thesauri were used for query expansion, although there were some interesting patterns in how the thesauri performed. Notably, whenever recall for a topic improved, precision also improved or was unchanged.
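
The general idea described in the abstract (a truncated SVD of a term-document matrix, with nearby term vectors used as thesaurus entries for query expansion) can be illustrated with a minimal sketch. This is not the thesis's implementation: the toy corpus, the raw-count weighting, the choice of k = 2, cosine similarity as the relatedness measure, and all function names are assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical toy corpus; the thesis used the Swedish database GP_HDINF,
# which is not reproduced here.
TOY_DOCS = [
    "car engine wheel",
    "automobile engine wheel",
    "bank money loan",
    "bank money interest",
]

def build_term_document_matrix(docs):
    """Return (vocab, A) where A[i, j] counts term i in document j."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            A[index[w], j] += 1.0
    return vocab, A

def lsi_term_vectors(A, k):
    """Project terms into a k-dimensional latent space via truncated SVD."""
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    # Keep the k largest singular values; scale term coordinates by them.
    return U[:, :k] * s[:k]

def related_terms(term, vocab, vectors, top_n=3):
    """Rank other terms by cosine similarity to `term` in the latent space."""
    i = vocab.index(term)
    norms = np.linalg.norm(vectors, axis=1)
    norms[norms == 0] = 1.0  # avoid division by zero for empty vectors
    unit = vectors / norms[:, None]
    sims = unit @ unit[i]
    order = np.argsort(-sims)
    return [(vocab[j], float(sims[j])) for j in order if j != i][:top_n]

def expand_query(query, vocab, vectors, top_n=2):
    """Append the top related terms of each known query word to the query."""
    expanded = list(query.split())
    for w in query.split():
        if w in vocab:
            for t, _ in related_terms(w, vocab, vectors, top_n):
                if t not in expanded:
                    expanded.append(t)
    return expanded
```

In this sketch, calling `build_term_document_matrix(TOY_DOCS)`, then `lsi_term_vectors(A, k=2)`, and finally `expand_query("car", vocab, vectors)` returns the original query word followed by terms that co-occur with it in the latent space. Such terms are statistically related rather than strictly synonymous, which matches the abstract's description of what the automatically built thesauri contain.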
