TY - GEN
T1 - Learning from Outcomes
T2 - 60th IEEE Annual Symposium on Foundations of Computer Science (FOCS)
AU - Dwork, Cynthia
AU - Kim, Michael P.
AU - Reingold, Omer
AU - Rothblum, Guy N.
AU - Yona, Gal
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/11
Y1 - 2019/11
N2 - Many selection procedures involve ordering candidates according to their qualifications. For example, a university might order applicants according to a perceived probability of graduation within four years, and then select the top 1000 applicants. In this work, we address the problem of ranking members of a population according to their 'probability' of success, based on a training set of historical binary outcome data (e.g., graduated in four years or not). We show how to obtain rankings that satisfy a number of desirable accuracy and fairness criteria, despite the coarseness of the training data. As the task of ranking is global (the rank of every individual depends not only on their own qualifications, but also on every other individuals' qualifications) ranking is more subtle and vulnerable to manipulation than standard prediction tasks. Towards mitigating unfair discrimination caused by inaccuracies in rankings, we develop two parallel definitions of evidence-based rankings. The first definition relies on a semantic notion of domination-compatibility: if the training data suggest that members of a set S are more qualified (on average) than the members of T, then a ranking that favors T over S (i.e. where T dominates S) is blatantly inconsistent with the evidence, and likely to be discriminatory. The definition asks for domination-compatibility, not just for a pair of sets, but rather for every pair of sets from a rich collection C of subpopulations. The second definition aims at precluding even more general forms of discrimination; this notion of evidence-consistency requires that the ranking must be justified on the basis of consistency with the expectations for every set in the collection C. Somewhat surprisingly, while evidence-consistency is a strictly stronger notion than domination-compatibility when the collection C is predefined, the two notions are equivalent when the collection C may depend on the ranking in question.
AB - Many selection procedures involve ordering candidates according to their qualifications. For example, a university might order applicants according to a perceived probability of graduation within four years, and then select the top 1000 applicants. In this work, we address the problem of ranking members of a population according to their 'probability' of success, based on a training set of historical binary outcome data (e.g., graduated in four years or not). We show how to obtain rankings that satisfy a number of desirable accuracy and fairness criteria, despite the coarseness of the training data. As the task of ranking is global (the rank of every individual depends not only on their own qualifications, but also on every other individuals' qualifications) ranking is more subtle and vulnerable to manipulation than standard prediction tasks. Towards mitigating unfair discrimination caused by inaccuracies in rankings, we develop two parallel definitions of evidence-based rankings. The first definition relies on a semantic notion of domination-compatibility: if the training data suggest that members of a set S are more qualified (on average) than the members of T, then a ranking that favors T over S (i.e. where T dominates S) is blatantly inconsistent with the evidence, and likely to be discriminatory. The definition asks for domination-compatibility, not just for a pair of sets, but rather for every pair of sets from a rich collection C of subpopulations. The second definition aims at precluding even more general forms of discrimination; this notion of evidence-consistency requires that the ranking must be justified on the basis of consistency with the expectations for every set in the collection C. Somewhat surprisingly, while evidence-consistency is a strictly stronger notion than domination-compatibility when the collection C is predefined, the two notions are equivalent when the collection C may depend on the ranking in question.
UR - http://www.scopus.com/inward/record.url?scp=85078426707&partnerID=8YFLogxK
U2 - 10.1109/FOCS.2019.00016
DO - 10.1109/FOCS.2019.00016
M3 - منشور من مؤتمر
SN - 9781728149530
T3 - Annual IEEE Symposium on Foundations of Computer Science
SP - 106
EP - 125
BT - Proceedings - 2019 IEEE 60th Annual Symposium on Foundations of Computer Science, FOCS 2019
PB - IEEE Computer Society
Y2 - 9 November 2019 through 12 November 2019
ER -