Search: "document clustering"

Showing results 1 - 5 of 34 theses containing the words document clustering.

  1. Help Document Recommendation System

    Master's thesis, Malmö universitet/Fakulteten för teknik och samhälle (TS)

    Authors: Keerthi Vijay Kumar; Pinky Mary Stanly; [2023]
    Keywords: Document similarity; Recommender systems; content-based filtering; collaborative filtering; Term Frequency-Inverse Document Frequency (TF-IDF); Bidirectional Encoder Representation from Transformers (BERT); Non-Negative Matrix Factorisation (NMF); cosine similarity; K-means clustering;

    Abstract: Help documents are important in an organization to use the technology applications licensed from a vendor. Customers and internal employees frequently use and interact with the help documents section to use the applications and know about the new features and developments in them.

  2. Descriptive Labeling of Document Clusters

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Adam Österberg; [2022]
    Keywords: Natural Language Processing; Wikipedia; Topic Modeling; Labeling;

    Abstract: Labeling is the process of giving a set of data a descriptive name. This thesis dealt with documents with no additional information and aimed at clustering them using topic modeling and labeling them using Wikipedia as a second source. Labeling documents is a new field with many potential solutions.

  3. Anomaly Detection in Log Files Using Machine Learning Techniques

    One-year master's thesis (Magister), Blekinge Tekniska Högskola/Fakulteten för datavetenskaper

    Author: Lakshmi Geethanjali Mandagondi; [2021]
    Keywords: Anomaly Detection; Log Files; Machine Learning; Clustering; Outlier Detection;

    Abstract: Context: Log files are produced in most larger computer systems today which contain highly valuable information about the behavior of the system and thus they are consulted fairly often in order to analyze behavioral aspects of the system. Because of the very high number of log entries produced in some systems, it is however extremely difficult to seek out relevant information in these files.

  4. Semantic Topic Modeling and Trend Analysis

    Master's thesis, Linköpings universitet/Statistik och maskininlärning

    Author: Jasleen Kaur Mann; [2021]
    Keywords: NLP; unsupervised topic modelling; trend analysis; LDA; BERT; Sentence-BERT; TF-IDF; transformer based language models; document clustering;

    Abstract: This thesis focuses on finding an end-to-end unsupervised solution to solve a two-step problem of extracting semantically meaningful topics and trend analysis of these topics from a large temporal text corpus. To achieve this, the focus is on using the latest developments in Natural Language Processing (NLP) related to pre-trained language models like Google’s Bidirectional Encoder Representations from Transformers (BERT) and other BERT based models.

  5. Automated error matching system using machine learning and data clustering: Evaluating unsupervised learning methods for categorizing error types, capturing bugs, and detecting outliers.

    Master's thesis, Linköpings universitet/Programvara och system

    Authors: Jonatan Bjurenfalk; August Johnson; [2021]
    Keywords: Unsupervised learning; machine learning; clustering; DBSCAN; HDBSCAN; X-Means; outlier detection; error log clustering;

    Abstract: For large and complex software systems, it is a time-consuming process to manually inspect error logs produced from the test suites of such systems. Whether it is for identifying abnormal faults or finding bugs, it is a process that limits development progress and requires experience.