Search: "Entity Retrieval"

Showing results 1 - 5 of 7 theses containing the words Entity Retrieval.

  1. A lightweight deep learning architecture for text embedding : Comparison between the usage of Transformers and Mixers for textual embedding

    Master's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Author: Corentin Royer; [2023]
    Keywords: Deep Learning; Entity Retrieval; Mixer; Transformer;

    Abstract: Text embedding is a widely used method for comparing pieces of text together by mapping them to a compact vector space. One such application is deduplication which consists in finding textual records that refer to the same underlying idea in order to merge them or delete one of them.

  2. Text and Speech Alignment Methods for Speech Translation Corpora Creation : Augmenting English LibriVox Recordings with Italian Textual Translations

    Master's thesis, Uppsala University/Department of Linguistics and Philology

    Author: Giuseppe Della Corte; [2020]
    Keywords: speech translation; parallel corpora; bilingual sentence alignment; sentence embeddings; cosine similarity; forced alignment; text collection; corpora creation; audio signal processing;

    Abstract: The recent uprise of end-to-end speech translation models requires a new generation of parallel corpora, composed of a large amount of source language speech utterances aligned with their target language textual translations. We hereby show a pipeline and a set of methods to collect hundreds of hours of English audio-book recordings and align them with their Italian textual translations, using exclusively public domain resources gathered semi-automatically from the web.

  3. Anemone: a Visual Semantic Graph

    Master's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Author: Joan Ficapal Vila; [2019]
    Keywords: Neo4j; Topic Modelling; Semantic Graph; Latent Dirichlet Allocation (LDA); NER; Sentence Reformulation;

    Abstract: Semantic graphs have been used for optimizing various natural language processing tasks as well as augmenting search and information retrieval tasks. In most cases these semantic graphs have been constructed through supervised machine learning methodologies that depend on manually curated ontologies such as Wikipedia or similar.

  4. Weighting Edit Distance to Improve Spelling Correction in Music Entity Search

    Master's thesis, KTH/School of Computer Science and Communication (CSC)

    Author: Axel Samuelsson; [2017]
    Keywords: Spelling correction; edit distance; search; music; spotify; trie; Damerau; Levenshtein;

    Abstract: This master’s thesis project undertook investigation of whether the extant Damerau-Levenshtein edit distance measurement between two strings could be made more useful for detecting and adjusting misspellings in a search query. The idea was to use the knowledge that many users type their queries using the QWERTY keyboard layout, and weighting the edit distance in a manner that makes it cheaper to correct misspellings caused by confusion of nearer keys.

  5. Undersökande studie inom Information Extraction : Konsten att Klassificera

    Bachelor's thesis, KTH/School of Computer Science and Communication (CSC)

    Authors: Erik Torstensson; Fredrik Carls; [2016]
    Keywords: Information Extraction; Named Entity Recognition; Java; Industrial Management;

    Abstract: This thesis is an exploratory study in Information Extraction. The main purpose is to create and evaluate methods within Information Extraction and to investigate how they can help improve the scientific result of classifying text elements.