Keywords:
Artificial intelligence
Computer science
Electrical engineering
Abstract:
Machine learning is entering every aspect of our lives, including high-stakes applications that directly affect people's lives, such as hiring, education, lending, and healthcare. While machine learning models are undoubtedly good at learning the patterns present in data, blindly learning all patterns can have unintended consequences, such as propagating biases with respect to gender, race, etc. These biases can adversely affect people's lives and may also violate anti-discrimination laws, e.g., Title VII of the US Civil Rights Act. When it comes to resolving legal disputes, or even informing policies and interventions, merely identifying bias and disparity in a model's decisions is not always sufficient. We need to dig deeper and identify and explain the sources of disparity. For example, disparities in hiring that can be explained by an occupational necessity may be exempt by law, e.g., code-writing skills for hiring a software engineer for a safety-critical application. However, disparity arising from an aptitude test may not be exempt, as in the landmark court case Griggs v. Duke Power (1971). This leads to a question that bridges the fields of fairness, explainability, and law: How can we identify and explain the sources of disparity in machine learning models, e.g., did the disparity arise solely due to the critical occupational necessities? In this dissertation, I propose a systematic measure of "non-exempt disparity," i.e., the disparity that cannot be accounted for by the occupational necessities. To arrive at this measure, I adopt a rigorous axiomatic approach that brings together concepts from information theory, in particular an emerging body of work called Partial Information Decomposition, with Pearl's causality. This dissertation also examines an extension of this technique to quantifying the contribution of each individual feature to the observed disparity, a novel form of explainability. Lastly,
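To make the role of Partial Information Decomposition concrete, the following is a minimal sketch in standard bivariate PID notation; the symbols below (a model output \hat{Y}, a protected attribute Z, and exempt critical features X_c) are illustrative choices, not necessarily the dissertation's own notation. PID splits the joint mutual information into unique, redundant, and synergistic parts:

\[
I\big((Z, X_c); \hat{Y}\big)
= \mathrm{Uni}(Z : \hat{Y} \mid X_c)
+ \mathrm{Uni}(X_c : \hat{Y} \mid Z)
+ \mathrm{Red}(Z, X_c : \hat{Y})
+ \mathrm{Syn}(Z, X_c : \hat{Y}),
\]

with the consistency condition

\[
I(Z; \hat{Y}) = \mathrm{Uni}(Z : \hat{Y} \mid X_c) + \mathrm{Red}(Z, X_c : \hat{Y}).
\]

On this reading, the unique information that Z carries about \hat{Y} beyond X_c isolates the dependence on the protected attribute that the critical features cannot explain, which is the kind of information-theoretic ingredient on which a measure of non-exempt disparity can be built.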