Deep convolutional neural network for attention decoding in multi-channel EEG with conditional variational autoencoder for data augmentation

This is a thesis for a professional degree (advanced level) from Lund University / Department of Automatic Control

Author: M Asjid Tanveer; [2023]

Keywords: Technology and Engineering;

Abstract:

Objectives: This project aims to develop a deep learning-based attention decoding system that can distinguish noise alone from speech in noise, and identify the direction of attended speech, from brain activity recorded with electroencephalography (EEG). Two deep convolutional neural network (DCNN) models are designed: (1) a DCNN that classifies incoming sound segments as noise or as speech in background noise, and (2) a DCNN that identifies the direction (left vs. right) of the attended speech. In addition, two conditional variational autoencoders (CVAEs) are trained to generate artificial data for data augmentation, with the goal of improving the performance of the final models by learning a latent space of the training data and generating new samples for each class.

Design: The proposed methods are tested on a data set of 32 participants who performed an auditory attention task. Participants were instructed to attend to one of two talkers in front of them and to ignore the other talker and the background noise behind them, while high-density EEG was recorded. The EEG data comprises 66 channels in total, all of which are used in this study.

Main Results: The DCNN models achieved accuracies (ACC) of 69.9% and 84.9%, and area under the curve (AUC) scores of 77.5% and 92.3%, on the two tasks, respectively. With augmented data from the CVAE models, performance improved to ACCs of 70.5% and 86.6% and AUCs of 78.3% and 93.6%. The EEG time window was 1 second, enabling the models to operate in real-time settings. The CVAE models generated class-conditional data effectively, with samples drawn from the latent space of the test data showing promising results.

Conclusion: The findings of this study demonstrate that the proposed DCNN models can accurately detect the direction of attended speech and differentiate noise from speech in noise, even with a time window of just 1 second of multi-channel EEG data. Moreover, the results highlight the CVAE as a valuable tool for data augmentation, generating synthetic data that closely approximates the latent-space structure of the training data. This suggests that CVAEs have the potential to improve the performance of deep learning models in EEG-based attention decoding tasks.
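
To illustrate the classifier side of this setup, the following is a minimal sketch of a DCNN for binary classification of 1-second, 66-channel EEG segments, written in PyTorch. The layer sizes, kernel widths, and the assumed 64 Hz sampling rate are illustrative only; the abstract does not specify the thesis architecture.

    # Minimal sketch of a DCNN for binary classification of 1-second,
    # 66-channel EEG segments. Layer sizes and the 64 Hz sampling rate
    # are illustrative assumptions, not the thesis architecture.
    import torch
    import torch.nn as nn

    class EEGDCNN(nn.Module):
        def __init__(self, n_channels=66, n_samples=64):
            super().__init__()
            self.features = nn.Sequential(
                # temporal convolutions over each 1-second window
                nn.Conv1d(n_channels, 32, kernel_size=7, padding=3),
                nn.BatchNorm1d(32),
                nn.ReLU(),
                nn.MaxPool1d(2),
                nn.Conv1d(32, 64, kernel_size=5, padding=2),
                nn.BatchNorm1d(64),
                nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),
            )
            # single logit: noise vs. speech-in-noise, or left vs. right
            self.classifier = nn.Linear(64, 1)

        def forward(self, x):            # x: (batch, 66, n_samples)
            z = self.features(x).squeeze(-1)
            return self.classifier(z)    # raw logit; apply sigmoid for probability

    model = EEGDCNN()
    logits = model(torch.randn(8, 66, 64))  # 8 dummy 1-second EEG segments

The same binary head serves either task, trained separately with a sigmoid and binary cross-entropy loss; only the labels differ between the two models.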
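The augmentation side can be sketched in the same style. Below is a minimal conditional VAE in which both the encoder and the decoder are conditioned on a one-hot class label, so that sampling from the latent prior with a fixed label yields synthetic segments for that class. The latent size, hidden width, flattened input, and two-class conditioning are assumptions for illustration; the abstract does not give these details.

    # Minimal sketch of a conditional VAE for per-class EEG augmentation.
    # Latent size, hidden width, and the one-hot two-class conditioning
    # are illustrative assumptions; the abstract does not specify them.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CVAE(nn.Module):
        def __init__(self, x_dim=66 * 64, n_classes=2, z_dim=32, h_dim=512):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(x_dim + n_classes, h_dim), nn.ReLU())
            self.mu = nn.Linear(h_dim, z_dim)
            self.logvar = nn.Linear(h_dim, z_dim)
            self.dec = nn.Sequential(
                nn.Linear(z_dim + n_classes, h_dim), nn.ReLU(), nn.Linear(h_dim, x_dim)
            )

        def forward(self, x, y):  # x: (B, x_dim) flattened EEG, y: (B, n_classes) one-hot
            h = self.enc(torch.cat([x, y], dim=1))
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
            return self.dec(torch.cat([z, y], dim=1)), mu, logvar

    def loss_fn(x_hat, x, mu, logvar):
        # reconstruction error plus KL divergence to the standard normal prior
        rec = F.mse_loss(x_hat, x, reduction="sum")
        kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return rec + kld

    # Augmentation: sample latent vectors and decode them for a chosen class.
    cvae = CVAE()
    y = F.one_hot(torch.zeros(16, dtype=torch.long), 2).float()  # 16 samples, class 0
    fake = cvae.dec(torch.cat([torch.randn(16, 32), y], dim=1))  # synthetic segments

Decoded samples would then be reshaped to (66, 64) and mixed into the training set of the corresponding DCNN, which is the augmentation strategy the abstract credits for the reported ACC and AUC gains.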
