Smart Search Engine : A Design and Test of Intelligent Search of News with Classification

Detta är en Kandidat-uppsats från Högskolan Dalarna/Institutionen för information och teknik

Författare: Chaoyang Li; Ke Liu; [2021]

Nyckelord: Smart search; precision; recall rate; NLTK; inverted index; BM25;

Sammanfattning: Background Google, Bing, and Baidu are the most commonly used search engines in the world. They also have some problems. For example, when searching for Jaguar, most of the search  results are cars, not animals. This is the problem of polysemy. Search engines always provide the most popular but not the most correct results. Aim We want to design and implement a search function and explore whether the method of classified news can improve the precision of users searching for news. Method In this research, we collect data by using a web crawler. We use a web crawler to crawl    the data of news in BBC news. Then we use NLTK, inverted index to do data pre-processing, and use BM25 to do data processing. Results Compare to the normal search function, our  function has a lower recall rate and a higher precision. Conclusions This search function can improve the precision when people search for news. Implications This search function can be used not only to search news but to search everything. It has a great future in search engines. It can be combined with machine learning to analyze users' search habits to search and classify more accurately.

  HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)