Search: "förlustfunktioner" (loss functions)

Showing results 1 - 5 of 8 theses containing the word förlustfunktioner.

  1. Modulating Depth Map Features to Estimate 3D Human Pose via Multi-Task Variational Autoencoders

    Master's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Author: Kobe Moerman; [2023]
    Keywords: 3D pose estimation; Joint landmarks; Variational autoencoder; Multi-task model; Loss discrimination; Latent-space modulation; Depth map; 3D-positionsuppskattning; Gemensamma landmärken; Variationell autoencoder; Multitask-modell; Förlustdiskriminering; Latent-space-modulering; Djupkarta;

    Abstract: Human pose estimation (HPE) constitutes a fundamental problem within the domain of computer vision, finding applications in diverse fields like motion analysis and human-computer interaction. This paper introduces innovative methodologies aimed at enhancing the accuracy and robustness of 3D joint estimation.

  2. Image Colorization Based on Deep Learning

    Master's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Author: Tao Deng; [2023]
    Keywords: Image colorization; Deep Learning; Convolutional Neural Network; Generative Adversarial Network; Färgläggning av bilder; djupinlärning; Konvolutionella Neurala Nätverk; Generativa Adversariella Nätverk;

    Abstract: With the development of artificial intelligence, there is a clear trend to combine computer technology with traditional industries. In recent years, with the development of digital media technology, many methods for coloring gray-scale images have been proposed.

  3. A real-time Multi-modal fusion model for visible and infrared images : A light-weight and real-time CNN-based fusion model for visible and infrared images in surveillance

    Master's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Author: Jin Wanqi; [2023]
    Keywords: Image fusion; deep learning; surveillance; CNN; real time; Bildfusion; djupinlärning; övervakning; CNN; realtid;

    Abstract: Infrared images could highlight the semantic areas like pedestrians and be robust to luminance changes, while visible images provide abundant background details and good visual effects. Multi-modal image fusion for surveillance application aims to generate an informative fused image from two source images in real time, so as to facilitate surveillance observatory or object detection tasks.

  4. Attribute Embedding for Variational Auto-Encoders : Regularization derived from triplet loss

    Master's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Author: Anton E. L. Dahlin; [2022]
    Keywords: Variational Auto-Encoder; Triplet Loss; Contrastive Loss; Generative Models; Metric Learning; Latent Space; Attribute Manipulation; Variationsautokodare; Triplettförlust; Kontrastiv Förlust; Generativa Modeller; Metrisk Inlärning; Latent Utrymme; Attributmanipulation;

    Abstract: Techniques for imposing a structure on the latent space of neural networks have seen much development in recent years. Clustering techniques used for classification have been used to great success, and with this work we hope to bridge the gap between contrastive losses and Generative models.

  5. VL Tasks: Which Models Suit? : Investigate Different Models for Swedish Image-Text Relation Task

    Master's thesis, KTH/School of Electrical Engineering and Computer Science (EECS)

    Author: Meinan Gou; [2022]
    Keywords: BERT; Visual-Language; Language Understanding; Object Detection; Multimodality; BERT; Visual-Language; Språkförståelse; Objektdetektion; Multimodalitet;

    Abstract: In common sense, modality measures the number of areas a model covers. Multi-modal or cross-modal models can handle two or more areas simultaneously. Some common cross-models include Vision-Language models, Speech-Language models, and Vision-Speech models.