Visualization and analysis of object states using diffusion models and PyTorch

This is a Bachelor's thesis from Mälardalen University/School of Innovation, Design and Engineering

Author: Christopher Nyberg; [2024]

Abstract: Artificial Intelligence (AI) is a rapidly growing field in modern technology. As the applications of AI expand, the ability to accurately analyze and predict the condition of objects with machine learning models has profound implications across numerous industries. This thesis introduces a pipeline that combines object recognition with generative image modeling to evaluate and visualize object conditions at different stages. The primary goal is to show how a diffusion model can be used to forecast object conditions by generating new images through image-to-image generation.

The first stage of the project consists of a PyTorch model for object detection, assisted by a GPT-based vision model for condition assessment. This is followed by image generation with the diffusion model Stable Diffusion. The project aims to show how these methods can be combined and applied to different types of objects, both to analyze current states and to visualize potential future states. Practical applications are broad, ranging from predicting wear in industrial components to assisting in the restoration of art.

The main goal has been to demonstrate the current potential of state-of-the-art image generation techniques. The results are promising and show the potential of diffusion models for visually altering object states. Our trained LoRA models are capable of generating objects at different stages and with different anomalies, and the PyTorch model is capable of recognizing objects in images.

The hazelnut LoRA model achieves 100% accuracy when generating the “crack” anomaly and 40% accuracy when generating the “hole” anomaly. The metal nut LoRA model performs worse, with 20% accuracy when generating the “scratch” anomaly and 0% accuracy when generating the “discolored” anomaly. The PyTorch model achieves 100% validation accuracy on its image classification task, mainly due to its simplicity. A basic pipeline that connects the models is also built. The models show promising potential for improvement, and this proof of concept could be scaled into a full-fledged prediction and visualization tool capable of significant contributions to various fields.
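
The abstract describes three stages connected in sequence: object recognition, condition assessment, and image-to-image generation. The thesis code is not reproduced on this page, so the following is only a minimal sketch of how such stages might be chained. Every name in it (`run_pipeline`, `PipelineResult`, the three stand-in functions, and the prompt template) is a hypothetical placeholder and not the author's implementation. In a real setup the stand-ins would be replaced by a PyTorch classifier, a GPT-based vision model, and a Stable Diffusion image-to-image call with a LoRA loaded.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class PipelineResult:
    """Collects the output of each stage for one input image."""
    object_label: str
    current_condition: str
    prompt: str
    generated_image: Any


def run_pipeline(
    image: Any,
    classify: Callable[[Any], str],
    assess_condition: Callable[[Any, str], str],
    generate: Callable[[Any, str], Any],
    target_condition: str,
) -> PipelineResult:
    """Chain the three stages; each stage is passed in as a callable."""
    # Stage 1: object recognition (a PyTorch model in the thesis).
    label = classify(image)
    # Stage 2: condition assessment (a GPT-based vision model in the thesis).
    condition = assess_condition(image, label)
    # Stage 3: image-to-image generation of the target state.
    # The prompt template is an assumption made for this sketch.
    prompt = f"a photo of a {label} with a {target_condition}"
    generated = generate(image, prompt)
    return PipelineResult(label, condition, prompt, generated)


# Hypothetical stand-ins so the sketch runs without any model weights.
def fake_classify(image: Any) -> str:
    return "hazelnut"


def fake_assess(image: Any, label: str) -> str:
    return "good"


def fake_generate(image: Any, prompt: str) -> dict:
    # A real implementation would return a generated image instead.
    return {"source": image, "prompt": prompt}


result = run_pipeline(
    image="input.png",  # placeholder; a real pipeline would pass image data
    classify=fake_classify,
    assess_condition=fake_assess,
    generate=fake_generate,
    target_condition="crack",
)
print(result)
```

Passing the stages in as callables keeps the orchestration independent of any particular model library, so each stage can be swapped or tested in isolation.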
