AI Ethics: Algorithm and Data Biases

  • Posted on: 31 October 2018
  • By: Juho Vaiste

Algorithm and data biases can cause discrimination and unfair decisions. The transparency of algorithmic decision-making is an essential area of ethical AI. Algorithms and data can easily contain biases that originate in human decision-making, and these biases are difficult to detect, perceive and fix (Campolo, Sanfilippo, Whittaker & Crawford, 2017).

There are many well-known examples of problematic machine learning projects and experiments. Microsoft released its chatbot Tay in March 2016, but the experiment went wrong within hours. The chatbot, operating on Twitter, turned out to be easily manipulated through interaction, and other Twitter users were able to teach Tay to produce racist and sexually charged messages.

Google, in turn, used a machine learning solution to categorize and tag photos in its photo app. The solution started to make unintended mistakes, labeling black people as gorillas (Barr, 2015). As of 2018, it seems that Google has not fixed the underlying problem but has simply removed the tag "gorilla" from use (Wired, 2018).

Both systems were shut down as soon as possible. These unfortunate mistakes were embarrassing for their owners, but they also taught the world a valuable lesson about the potential problems of machine learning and deep learning solutions. Algorithms and datasets are used for decision-making in recruiting, loan and mortgage applications, and legal processes. All humans should be treated equally, but the historical data or the built algorithm can contain biases that discriminate against some sectors of the population.
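To make the mechanism concrete, here is a minimal Python sketch, with purely synthetic data and hypothetical variable names not drawn from any real recruiting system, of how a classifier trained on historically biased hiring decisions simply learns to repeat the same gap:

```python
# Minimal illustrative sketch: a model trained on historically biased hiring
# decisions reproduces that bias. The synthetic data and names are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

group = rng.integers(0, 2, size=n)      # protected attribute (0 or 1)
skill = rng.normal(0.0, 1.0, size=n)    # job-relevant skill score

# Historical decisions: equally skilled applicants from group 1 were hired
# less often -- this is the bias hidden in the archive.
p_hired = 1.0 / (1.0 + np.exp(-(skill - 0.8 * group)))
hired = rng.random(n) < p_hired

# Train on the historical labels, then inspect the model's own selection rates.
X = np.column_stack([skill, group])
model = LogisticRegression().fit(X, hired)
pred = model.predict(X)

for g in (0, 1):
    print(f"group {g}: selection rate {pred[group == g].mean():.2f}")
```

Note that even if the protected attribute were dropped from the features, other correlated variables could still act as proxies for it, so simply removing the column is rarely enough.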

In the United States, there are examples where automated decision-making tools and systems are used in remand processes to decide whether a defendant has to wait for trial in prison. Analyses of these decision-making processes have shown signs of racial discrimination, with black defendants receiving higher risk scores (Angwin et al., 2016; Corbett-Davies et al., 2017).
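As a rough illustration of the kind of check behind those analyses (not the actual ProPublica methodology or data; the arrays below are hypothetical placeholders), one can compare false positive rates, i.e. how often people who did not reoffend were labelled high risk, across groups:

```python
# Illustrative sketch: compare false positive rates of a risk tool across
# groups. The arrays are hypothetical placeholders, not real case data.
import numpy as np

def false_positive_rate(high_risk, reoffended):
    """Share of people labelled high risk among those who did NOT reoffend."""
    did_not_reoffend = ~reoffended
    return high_risk[did_not_reoffend].mean()

high_risk  = np.array([1, 1, 0, 1, 0, 0, 1, 0], dtype=bool)  # tool's prediction
reoffended = np.array([0, 1, 0, 0, 0, 1, 1, 0], dtype=bool)  # observed outcome
group      = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

for g in ("A", "B"):
    mask = group == g
    print(f"group {g}: false positive rate "
          f"{false_positive_rate(high_risk[mask], reoffended[mask]):.2f}")
```

A large gap between the two rates is the kind of signal Angwin et al. (2016) reported when comparing black and white defendants.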

As a solution for algorithm and data biases, the AI Now Institute and the national AI group of France have proposed adopting algorithmic impact assessments (AI Now, 2018) or discrimination impact assessments (France, 2018). An impact assessment is a process and framework that helps policymakers predict the consequences of a decision; the concept is best known from the field of environmental decision-making (Adelle & Weiland, 2012). Now impact assessments are becoming part of the AI ethics toolbox: the UK government is proposing data protection impact assessments (the UK, 2018) and the IEEE is introducing privacy impact assessments (IEEE, 2018).

AI Now's framework is designed for public agencies and is based on four key steps in the assessment process: 1) define the system; 2) notify communities and people about the systems and their possible impacts; 3) increase the internal capacity of public agencies to assess fairness, justice, due process and disparate impact; 4) give researchers and auditors access to the systems once they have been deployed (AI Now, 2018).
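Purely as an illustration of how an agency might track these four steps internally (the field names and structure below are a hypothetical sketch of my own, not part of the published framework), the process could be represented as a simple checklist in code:

```python
# Hypothetical sketch: tracking the four assessment steps as structured data.
from dataclasses import dataclass, field

@dataclass
class AssessmentStep:
    description: str
    completed: bool = False
    notes: list[str] = field(default_factory=list)

algorithmic_impact_assessment = {
    "define_system": AssessmentStep("Define the automated decision system and its scope"),
    "notify_public": AssessmentStep("Notify affected communities about the system and its impacts"),
    "build_capacity": AssessmentStep("Build internal capacity to assess fairness, due process and disparate impact"),
    "external_review": AssessmentStep("Give researchers and auditors access to the deployed system"),
}

# Example: mark the first step done and list what remains.
algorithmic_impact_assessment["define_system"].completed = True
remaining = [name for name, step in algorithmic_impact_assessment.items() if not step.completed]
print("Remaining steps:", remaining)
```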

Our understanding of algorithm and data biases is growing and developing. The topic is under active research; for example, the problem was considered at the most recent NIPS (Neural Information Processing Systems) conference.
