Search: "corpora creation"

Showing results 1 - 5 of 6 theses containing the words corpora creation.

  1. IŻ SWÓJ JĘZYK MAJĄ! An exploration of the computational methods for identifying language variation in Polish

    Master's thesis, Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori

    Author: Maria Irena Szawerna; [2023-06-19]
    Keywords: language variation; Polish; diachronic linguistics; part-of-speech tagging; lemmatization; corpus linguistics;

    Abstract: Computational approaches to language variation continue to contribute in a relevant way to various fields, including Natural Language Processing (NLP) and linguistics. Being able to accommodate variation within natural language increases the robustness of NLP models and their usefulness in real-life applications; simultaneously, detecting and describing variation and trends that govern it is one of the main goals of sociolinguistics and historical linguistics, meaning that some of the advances in NLP can contribute to these fields as well.

  2. Exploring Human-Robot Interaction Through Explainable AI Poetry Generation

    Master's thesis, Mälardalens högskola / Akademin för innovation, design och teknik

    Author: Philippe Strineholm; [2021]
    Keywords: explainable AI poetry generation human robot interaction HRI;

    Abstract: As the field of Artificial Intelligence continues to evolve into a tool of societal impact, a need of breaking its initial boundaries as a computer science discipline arises to also include different humanistic fields. The work presented in this thesis revolves around the role that explainable artificial intelligence has in human-robot interaction through the study of poetry generators.

  3. Lingvistisk analys av den ryska politiska anekdoten under 2000-talet: genreförändring? [Linguistic analysis of the Russian political anecdote in the 2000s: a change of genre?]

    Bachelor's thesis, Stockholms universitet / Slaviska språk

    Author: Jana Hilding; [2020]
    Keywords: anecdote; humour; joke; linguistics; political; post-Soviet; Russian; anekdot; humor; lingvistik; postsovjetisk; politisk; ryska; skämt; vits;

    Abstract: An analysis was made of the personal and non-personal post-Soviet political anecdotes. It is argued that post-Soviet political anecdotes have been created during 2000–2019. The starting point is that the Russian word анекдот carries particular, cultural information and does not have a systematic equivalent in Swedish or English, i.e.

  4. A comparative study of the grammatical gender systems of languages by means of analysing word embeddings

    Master's thesis, Uppsala universitet / Institutionen för lingvistik och filologi

    Author: Hartger Veeman; [2020]
    Keywords: word embeddings; grammatical gender; computational linguistics; language representations;

    Abstract: The creation of word embeddings is one of the key breakthroughs in natural language processing. Word embeddings allow for words to be represented semantically, opening the way to many new deep learning methods.

  5. Text and Speech Alignment Methods for Speech Translation Corpora Creation: Augmenting English LibriVox Recordings with Italian Textual Translations

    Master's thesis, Uppsala universitet / Institutionen för lingvistik och filologi

    Author: Giuseppe Della Corte; [2020]
    Keywords: speech translation; parallel corpora; bilingual sentence alignment; sentence embeddings; cosine similarity; forced alignment; text collection; corpora creation; audio signal processing;

    Abstract: The recent uprise of end-to-end speech translation models requires a new generation of parallel corpora, composed of a large amount of source language speech utterances aligned with their target language textual translations. We hereby show a pipeline and a set of methods to collect hundreds of hours of English audio-book recordings and align them with their Italian textual translations, using exclusively public domain resources gathered semi-automatically from the web.