Auditing the Moral Bounds of AI Systems: An Implementation of the Glass-box's Observation Stage

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Abstract: Automated and assisted decision-making has become prevalent across a myriad of domains, often including sensitive and critical tasks where guarantees regarding the ethical operation of artificially intelligent systems become essential. Various parties have developed guidelines establishing general ethical requirements these systems should comply with, but the translation from moral values to norms, and then into precise system requirements, is not trivial. The Glass-box framework is an approach meant to address the challenge of auditing autonomous systems' adherence to ethical values. It offers a two-stage process: an interpretation stage, where ethical values are translated into system requirements; and an observation stage, where the adherence of an autonomous system to the desired values is tested using only the system's inputs and outputs. The Glass-box approach allows for great flexibility in implementation, and its disregard for the inner mechanisms of the observed systems enables its application over a wide range of contexts; however, its concrete practical implementation can be challenging. Prior work has addressed the formalisation of the Glass-box, covering the logical implementation of the reasoning involved in both the interpretation and observation stages. Yet implementing the testing mechanisms required to translate input-output pairs into logical statements within the observation stage remains uncharted territory. This thesis presents an implementation of the Glass-box's observation stage, also considering its further extension so that it not only audits the system under observation but also intervenes in it when adherence to the relevant moral bounds is not achieved.
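As a rough illustration of the observation stage described above, the following sketch maps a system's input-output pairs to truth values of propositional atoms and then evaluates a norm over those atoms. The atom names, the input encoding, and the norm formula are all illustrative assumptions, not the thesis's actual encoding:

```python
# Hedged sketch of the observation stage: translating input-output pairs
# into propositional statements and checking a norm against them.
# All names below (e.g. "protected", the atoms, the norm) are assumptions
# made for illustration only.

def observe(pairs):
    """Map (input, output) pairs to truth values of propositional atoms."""
    return {
        # Atom: the system flagged at least one protected-group input.
        "flagged_protected": any(x["protected"] and y == 1 for x, y in pairs),
        # Atom: the system flagged at least one unprotected-group input.
        "flagged_unprotected": any(not x["protected"] and y == 1 for x, y in pairs),
    }

def norm_holds(atoms):
    """Example norm (crude parity-style proposition): if any protected
    input is flagged, some unprotected input must be flagged too."""
    return (not atoms["flagged_protected"]) or atoms["flagged_unprotected"]

pairs = [
    ({"protected": True}, 1),
    ({"protected": False}, 1),
    ({"protected": False}, 0),
]
print(norm_holds(observe(pairs)))  # True: the norm is satisfied here
```

In a full implementation, the truth values of such atoms would be established statistically rather than by direct enumeration, and the norm would be derived from the interpretation stage.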
By making use of Bayesian generalized linear models, propositional logic, and formal argumentation, an implementation capable of handling a relevant class of scenarios in the audit of autonomous systems is presented, showcasing the generality-granularity trade-offs, the challenges of translating input-output pairs into logical statements, and the extension of the Glass-box approach to handle intervention via human-on-the-loop approaches. The implementation is validated through the case study of auditing a binary classifier's adherence to the value of fairness in the context of predicting criminal recidivism. The necessary loss of generality of the Glass-box framework to allow for its practical implementation is discussed, and directions for future work are proposed.
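The fairness case study mentioned above can be gestured at with a minimal Bayesian check: compare a binary classifier's posterior positive-prediction rates across two groups and flag a violation when they diverge. This uses a simple Beta-Binomial posterior as a stand-in for the thesis's Bayesian generalized linear models; the priors, group encoding, and tolerance are assumptions for illustration:

```python
# Hedged sketch: Bayesian demographic-parity check for a binary classifier.
# A Beta-Binomial model is used here as a simplification of the Bayesian
# GLM approach; all parameter choices are illustrative assumptions.

def posterior_mean_rate(positives, total, alpha=1.0, beta=1.0):
    """Posterior mean of the positive-prediction rate under a
    Beta(alpha, beta) prior with a Binomial likelihood."""
    return (positives + alpha) / (total + alpha + beta)

def parity_violated(group_a, group_b, tolerance=0.1):
    """Flag a violation when posterior mean rates differ by more than the
    tolerance; a human-on-the-loop could then intervene in the system."""
    rate_a = posterior_mean_rate(*group_a)
    rate_b = posterior_mean_rate(*group_b)
    return abs(rate_a - rate_b) > tolerance

# (positive predictions, total predictions) observed per group
print(parity_violated((45, 100), (30, 100)))  # posterior rates ~0.45 vs ~0.30
```

The Boolean result of such a check is the kind of logical statement the observation stage feeds into the formal-argumentation machinery.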
