Using clickstream data as implicit feedback in information retrieval systems

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Abstract: This Master's thesis project investigates whether Wikipedia's clickstream data can be used to improve the retrieval performance of information retrieval systems. The project is conducted under the assumption that a traversal between two articles connects them with regard to content. To extract useful terms from the clickstream data, the data needed to be structured so that, given a Wikipedia article, it is possible to find all of its incoming or outgoing article traversals. The project settled on using the clickstream data in an automatic query expansion approach. Two expansion methods were investigated: one expands the query with full article titles so that their context is preserved, and the other expands it with individual terms from the article titles. The structure of the data and the two proposed methods were evaluated using a set of queries and relevance judgments. The results of the evaluation show that the method that expands with individual terms performed better than the full article title method, and that the individual term method increased the mean average precision (MAP) by 11.24%. The expansion method was evaluated on two different query collections, and it was found to improve the results only where the average recall of the original queries is low. The thesis concludes that the clickstream can be used to improve the retrieval performance of an information retrieval system.
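
The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it describes: indexing traversals so that incoming and outgoing neighbours of an article can be looked up, and expanding a query either with full titles or with individual title terms. The sample rows, the function names, the ranking by traversal count, and the assumption that a query has already been matched to an article title are all hypothetical and not taken from the thesis.

```python
from collections import defaultdict

# Hypothetical (source, target, count) traversals; not real clickstream data.
SAMPLE_ROWS = [
    ("Information_retrieval", "Query_expansion", 120),
    ("Information_retrieval", "Relevance_feedback", 80),
    ("Search_engine", "Information_retrieval", 200),
    ("Query_expansion", "Information_retrieval", 30),
]


def build_index(rows):
    """Return (outgoing, incoming) maps: article -> {neighbour: count}."""
    outgoing = defaultdict(dict)
    incoming = defaultdict(dict)
    for source, target, count in rows:
        outgoing[source][target] = count
        incoming[target][source] = count
    return outgoing, incoming


def top_neighbours(index, article, k):
    """Up to k neighbours of an article, most traversed first (ties by name)."""
    neighbours = index.get(article, {})
    return sorted(neighbours, key=lambda t: (-neighbours[t], t))[:k]


def expand_with_titles(query, titles):
    """Full-title expansion: append each title as a quoted phrase."""
    phrases = ['"' + t.replace("_", " ") + '"' for t in titles]
    return " ".join([query] + phrases)


def expand_with_terms(query, titles):
    """Individual-term expansion: append new lowercase title terms once each."""
    seen = set(query.lower().split())
    terms = []
    for title in titles:
        for term in title.replace("_", " ").lower().split():
            if term not in seen:
                seen.add(term)
                terms.append(term)
    return " ".join([query] + terms)


outgoing, incoming = build_index(SAMPLE_ROWS)
titles = top_neighbours(outgoing, "Information_retrieval", 2)
title_query = expand_with_titles("information retrieval", titles)
term_query = expand_with_terms("information retrieval", titles)
```

Keeping two maps trades memory for constant-time lookup in both directions. The quoted phrases in the title variant stand in for whatever phrase syntax the target search engine supports.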
