Search: "multimodal input"

Showing results 1 - 5 of 26 theses containing the words multimodal input.

  1. Where to Fuse

    Master's thesis, Lunds universitet/Matematisk statistik

    Author: Lukas Petersson; [2024]
    Keywords: Technology and Engineering;

    Abstract: This thesis investigates fusion techniques in multimodal transformer models, focusing on enhancing the capabilities of large language models in understanding not just text, but also other modalities like images, audio, and sensor data. The study compares late fusion (concatenating modality tokens after separate encoding) and early fusion (concatenating before encoding) techniques, examining their respective advantages and disadvantages.

  2. Automated Interpretation of Lung Ultrasound for COVID-19 and Tuberculosis diagnosis

    Master's thesis, Lunds universitet/Matematik LTH

    Author: Chloé Soormally; [2023]
    Keywords: Tuberculosis; COVID-19; Lung Ultrasound; Computer-aided detection (CAD); Deep learning; Technology and Engineering;

    Abstract: BACKGROUND. Early and accurate detection of infectious respiratory diseases like COVID-19 and tuberculosis (TB) plays a crucial role in effective management and the reduction of preventable mortality.

  3. Real-time visual feedback of emotional expression in singing

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Xuehua Fu; [2023]
    Keywords: Sound and Music Computing; Music Visualization; Cross-modal Feedback; Mapping; Music Expressivity; Multimodal Interaction;

    Abstract: The thesis project concerns the development and evaluation of a real-time music visualization system aimed at creating a multi-modal perceptual experience of music emotions. The purpose of the project is to provide singers with real-time visual feedback on their singing, to enhance their expression of emotions in the music.

  4. Text-Driven Fashion Image Manipulation with GANs: A case study in full-body human image manipulation in fashion

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: Reza Dadfar; [2023]
    Keywords: Multimodal fashion image editing; Generative adversarial network inversion; Text-driven image manipulation; TD-GEM;

    Abstract: Language-based fashion image editing has promising applications in design, sustainability, and art. However, it is considered a challenging problem in computer vision and graphics. The diversity of human poses and the complexity of clothing shapes and textures make the editing problem difficult.

  5. Playstyle Generation with Multimodal Generative Adversarial Imitation Learning: Style-reward from Human Demonstration for Playtesting Agents

    Master's thesis, KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author: William Ahlberg; [2023]
    Keywords: Imitation Learning; Reinforcement Learning; Game-testing;

    Abstract: Playtesting plays a crucial role in video game production. The presence of gameplay issues and faulty design choices can be of great detriment to the overall player experience.