Search: "multi-modal text"

Showing results 1 - 5 of 13 theses containing the words multi-modal text.

  1. Multi-modal Models for Product Similarity: Comparative evaluation of unimodal and multi-modal architectures for product similarity prediction and product retrieval

    Master's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Author: Christos Frantzolas; [2023]
    Keywords: Computer Vision; Natural Language Processing; Representation Learning; Metric Learning; Multimodal Retrieval

    Abstract: With the rapid growth of e-commerce, enabling effective product recommendation systems and improving product search for shoppers plays a crucial role in driving customer satisfaction. Traditional product retrieval approaches have mainly relied on unimodal models focusing on text data.

  2. Artificial Neural Networks and Inductive Biases for Multi-Instance Multi-Modal Tabular Data: A Case Study for Default Probability Estimation in Small-to-Medium Enterprise Lending

    Master's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Author: Gustav Röhss; [2022]

    Abstract: The success of artificial neural networks in homogeneous data domains such as images, textual data, and audio and other signals has had considerable impact on Machine Learning and science in general. The domain of heterogeneous tabular data, while arguably much more common, remains much less explored with regard to artificial neural networks and deep learning.

  3. A Study of Accumulation Times in Translation from Event Streams to Video for the Purpose of Lip Reading

    Bachelor's thesis, KTH/Computer Science

    Authors: Didrik Munther; David Puustinen; [2022]
    Keywords: Lip-Reading; ANN; Event-based Cameras

    Abstract: Visually extracting textual context from lips consists of pattern matching which results in a frequent use of machine learning approaches for the task of classification. Previous research has consisted of mostly audiovisual (multi-modal) approaches and conventional cameras.

  4. VL Tasks: Which Models Suit?: Investigate Different Models for Swedish Image-Text Relation Task

    Master's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Author: Meinan Gou; [2022]
    Keywords: BERT; Visual-Language; Language Understanding; Object Detection; Multimodality

    Abstract: In common sense, modality measures the number of areas a model covers. Multi-modal or cross-modal models can handle two or more areas simultaneously. Some common cross-models include Vision-Language models, Speech-Language models, and Vision-Speech models.

  5. Product Matching through Multimodal Image and Text Combined Similarity Matching

    Master's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Author: E Soon Ko; [2021]
    Keywords: Multimodal Machine Learning; Product Matching; Similarity Matching; Supervised Learning; Unsupervised Learning; Siamese network

    Abstract: Product matching in e-commerce is an area that faces more and more challenges with growth in the e-commerce marketplace as well as variation in the quality of data available online for each product. Product matching for e-commerce provides competitive possibilities for vendors and flexibility for customers by identifying identical products from different sources.