Search: "BM25"

Showing results 1 - 5 of 12 theses containing the word BM25.

  1. Optimizing Search Engine Field Weights with Limited Data : Offline exploration of optimal field weight combinations through regression analysis

    Master's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Author: Zino Kader; [2023]
    Keywords: Information retrieval; Search engines; BM25 (Best Match 25); Regression analysis; Parameter estimation; Learning to rank

    Abstract: Modern search engines, particularly those utilizing the BM25 ranking algorithm, offer a multitude of tunable parameters designed to refine search results. Among these parameters, the weight of each searchable field plays a crucial role in enhancing search outcomes.

  2. Synthetic data generation for domain adaptation of a retriever-reader Question Answering system for the Telecom domain : Comparing dense embeddings with BM25 for Open Domain Question Answering

    Master's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Author: Filip Döringer Kana; [2023]
    Keywords: Natural Language Processing; Transformers; Deep Learning; Question Answering; Data Generation

    Abstract: Having computer systems capable of answering questions has been a goal within Natural Language Processing research for many years. Machine Learning systems have recently become increasingly proficient at this task, with large language models obtaining state-of-the-art performance.

  3. Re-ranking search results with KB-BERT

    Master's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Author: Bjarki Viðar Kristjánsson; [2022]
    Keywords: Natural language processing; Information retrieval; BERT; KB-BERT; Search evaluation

    Abstract: This master's thesis aims to determine whether a Swedish BERT model can improve a BM25 search by re-ranking the top search results. We compared a standard BM25 search algorithm with a more complex algorithm composed of a BM25 search followed by re-ranking of the top 10 results by a BERT model.

  4. Duplicate detection of multimodal and domain-specific trouble reports when having few samples : An evaluation of models using natural language processing, machine learning, and Siamese networks pre-trained on automatically labeled data

    Master's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Author: Viktor Karlstrand; [2022]
    Keywords: Duplicate detection; Bug reports; Trouble reports; Natural language processing; Information retrieval; Machine learning; Siamese neural network; Transformers; Automated data labeling; Shapley values

    Abstract: Trouble and bug reports are essential in software maintenance and for identifying faults, a challenging and time-consuming task. In cases when the fault and reports are similar or identical to previous and already resolved ones, the effort can be reduced significantly, making the prospect of automatically detecting duplicates very compelling.

  5. Integrating Telecommunications-Specific Language Models into a Trouble Report Retrieval Approach

    Master's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Author: Nathan Bosch; [2022]
    Keywords: information retrieval; neural ranking; trouble reports; log analysis; natural language processing

    Abstract: In the development of large telecommunications systems, it is imperative to identify, report, analyze and, thereafter, resolve both software and hardware faults. This resolution process often relies on written trouble reports (TRs) that contain information about the observed fault and, after analysis, information about why the fault occurred and the decision to resolve the fault.