Using XAI Tools to Detect Harmful Bias in ML Models

This is a Bachelor's thesis from Umeå University/Department of Computing Science

Abstract: In the past decade, machine learning (ML) models have become far more powerful and are increasingly being used in many important contexts. At the same time, ML models have become more complex and harder to understand on their own, which has necessitated an interest in explainable AI (XAI), a field concerned with ensuring that ML and other AI systems can be understood by human users and practitioners. One aspect of XAI is the development of "explainers", tools that take a more complex system (here: an ML model) and generate a simpler but sufficiently accurate model of this system, either globally or locally, to yield insight into the behaviour of the original system. As ML models have become more complex and prevalent, concerns that they may embody and perpetuate harmful social biases have also risen, with XAI being one proposed tool for bias detection. This paper investigates the ability of two explainers, LIME and SHAP, which explain the predictions of potentially more complex models by way of locally faithful linear models, to detect harmful social bias (here in the form of the influence of the racial makeup of a neighbourhood on property values). It does so in a simple experiment involving two kinds of ML models, linear regression and an ensemble method, trained on the well-known Boston housing dataset. The results show that LIME and SHAP appear to be helpful in bias detection, but they also reveal an instance where the explanations do not quite reflect the workings of the model, even though they still yield accurate insight into the predictions the model makes.
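The idea of a "locally faithful linear model" can be illustrated with a minimal sketch. This is not the implementation used by LIME or SHAP, nor the thesis's experiment; it only assumes a hypothetical black-box function `black_box`, samples points around an instance of interest, weights them by proximity, and fits a weighted linear model whose coefficients serve as the local explanation. All names, the kernel, and the parameter values are illustrative assumptions.

```python
import math
import random


def black_box(x):
    # Hypothetical nonlinear model standing in for a trained regressor.
    return 3.0 * x[0] + x[1] ** 2


def solve(a, b):
    # Gaussian elimination with partial pivoting for a small linear system.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def local_linear_surrogate(model, x0, n_samples=2000, scale=0.1,
                           kernel_width=0.25, seed=0):
    """Fit a proximity-weighted linear model to `model` around `x0`.

    Returns (intercept, coefficients); coefficients are per-feature
    local slopes, expressed relative to the instance x0.
    """
    rng = random.Random(seed)
    d = len(x0)
    # Accumulate the weighted normal equations (Z^T W Z) beta = Z^T W y,
    # where each row of Z is [1, z - x0].
    ztwz = [[0.0] * (d + 1) for _ in range(d + 1)]
    ztwy = [0.0] * (d + 1)
    for _ in range(n_samples):
        delta = [rng.gauss(0.0, scale) for _ in range(d)]
        z = [x0[i] + delta[i] for i in range(d)]
        dist2 = sum(v * v for v in delta)
        w = math.exp(-dist2 / kernel_width ** 2)  # closer samples count more
        row = [1.0] + delta
        y = model(z)
        for i in range(d + 1):
            ztwy[i] += w * row[i] * y
            for j in range(d + 1):
                ztwz[i][j] += w * row[i] * row[j]
    beta = solve(ztwz, ztwy)
    return beta[0], beta[1:]


intercept, coefs = local_linear_surrogate(black_box, [1.0, 2.0])
print(intercept, coefs)
```

Around the point (1, 2), the fitted coefficients should approximate the local slopes of the toy function (roughly 3 for the first feature and 4 for the second). In a bias-detection setting, a large coefficient on a sensitive feature would be the kind of signal an analyst inspects.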
