Unlearn with Your Contribution: A Machine Unlearning Framework in Federated Learning

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Abstract: Recent years have seen remarkable advances in machine learning, and with them growing concerns about data privacy. Machine learning learns functions from data, and attacks on the learned model can leak information about that data. In addition, malicious actors may poison the input data to manipulate the model. The ability to unlearn specific data samples on demand has therefore become critically important. Federated Learning (FL) has emerged as a powerful approach to the privacy challenge: multiple participants, or clients, collaborate to train a single global machine learning model without sharing their training data. Machine unlearning is particularly pertinent in FL, especially when clients are not fully trustworthy. This paper investigates how effectively machine unlearning problems can be solved within the FL framework. The central research question is: How can we effectively unlearn the entire dataset of one or more clients after FL training is completed, while maintaining privacy and without access to the data? To address this challenge, we introduce the concept of "contribution", which quantifies how much each client contributes to the training of the global FL model. In our implementation, an Encoder-Decoder model on the server side disentangles these contributions as the FL process progresses. Notably, no existing work uses a similar concept or similar models. Extensive experiments on the MNIST and FashionMNIST datasets demonstrate that the proposed approach successfully solves the unlearning task in FL. Remarkably, it achieves results comparable to retraining from scratch without requiring the participation of the client whose data is to be unlearned. Additional ablation studies indicate that the proposed model is sensitive to specific structural hyperparameters.
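To make the setting concrete, the following is a minimal toy sketch of federated averaging and of the "retraining from scratch" baseline that the abstract uses as a point of comparison. It is not the Encoder-Decoder contribution method of the thesis, whose details are not given here. The one-parameter linear model, the function names (`local_update`, `fed_avg`, `retrain_without`), the learning rate, and the synthetic client data are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: each client privately holds (x, y) pairs.
Client = List[Tuple[float, float]]


def local_update(w: float, data: Client, lr: float = 0.05, steps: int = 5) -> float:
    """Run a few gradient-descent steps on y ~ w * x using only this client's data."""
    for _ in range(steps):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def fed_avg(clients: Sequence[Client], rounds: int = 50) -> float:
    """Train one global weight; the server only ever sees client weights, not data."""
    w_global = 0.0
    total = sum(len(c) for c in clients)
    for _ in range(rounds):
        local_ws = [local_update(w_global, c) for c in clients]
        # Average the local weights, weighted by each client's dataset size.
        w_global = sum(len(c) * w for c, w in zip(clients, local_ws)) / total
    return w_global


def retrain_without(clients: Sequence[Client], forget: int, rounds: int = 50) -> float:
    """Baseline unlearning: rerun the whole FL training without the forgotten client."""
    remaining = [c for i, c in enumerate(clients) if i != forget]
    return fed_avg(remaining, rounds)


# Two honest clients follow y = 2x; the third (assumed poisoned) follows y = -2x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(1.0, 2.0), (2.0, 4.0)],
    [(1.0, -2.0), (2.0, -4.0)],
]
w_all = fed_avg(clients)
w_retrained = retrain_without(clients, forget=2)
```

In this toy setup the poisoned client pulls the global weight away from the honest slope, and retraining without it recovers that slope. The cost of the baseline is a full rerun of training with the remaining clients, which is the expense an unlearning method that matches its results would aim to avoid.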
