Search: "Byte Pair Encoding"

Found 3 theses containing the words Byte Pair Encoding.

  1. Incremental Re-tokenization in BPE-trained SentencePiece Models

    Bachelor's thesis, Umeå universitet/Institutionen för datavetenskap

    Author: Simon Hellsten; [2024]
    Keywords: BPE; Byte Pair Encoding; SentencePiece; NLP; Natural Language Processing; Tokenization; Re-tokenization

    Abstract: This bachelor's thesis in Computer Science explores the efficiency of an incremental re-tokenization algorithm in the context of BPE-trained SentencePiece models used in natural language processing. The thesis begins by underscoring the critical role of tokenization in NLP, particularly highlighting the complexities introduced by modifications in tokenized text.

  2. Question answering on introductory Java programming concepts using the Transformer

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Lukas Szerszen; [2021]
    Keywords: (none listed)

    Abstract: AI applications for education could help students learn in their introductory programming courses. Many applications for education try to simulate a human tutoring session that engages the student in a dialogue. During the session, they can ask questions and have them answered while working through an exercise.

  3. Bidirectional LSTM-CNNs-CRF Models for POS Tagging

    Master's thesis, Uppsala universitet/Institutionen för lingvistik och filologi

    Author: Hao Tang; [2018]
    Keywords: bidirectional LSTM; part of speech; CNNs; CRF; byte pair encoding (BPE)

    Abstract: In order to achieve state-of-the-art performance for part-of-speech (POS) tagging, the traditional systems require a significant amount of hand-crafted features and data pre-processing. In this thesis, we present a discriminative word embedding, character embedding and byte pair encoding (BPE) hybrid neural network architecture to implement a true end-to-end system without feature engineering and data pre-processing.