A deeper look at audits

Process-oriented Audits

The Ethical Matrix

Learning Objective: At the end of the session, students should know how to use the Ethical Matrix to audit a system and understand its strengths and limitations, as illustrated through the HireVue audit report.

  • HireVue Audit Report, https://www.hirevue.com/resources/template/orcaa-report

  • Cathy O’Neil and Hanna Gunn. 2020. Near-Term Artificial Intelligence and the Ethical Matrix. In Ethics of Artificial Intelligence, S. Matthew Liao (ed.). Oxford University Press. https://doi.org/10.1093/oso/9780190905033.003.0009
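
At its core, O’Neil and Gunn’s Ethical Matrix is a stakeholder-by-concern grid that auditors fill in collaboratively. Below is a minimal sketch of that structure in Python; the stakeholders, concerns, and cell entries are hypothetical illustrations loosely inspired by a hiring-algorithm audit, not taken from the HireVue report itself.

```python
# A minimal sketch of an Ethical Matrix as a stakeholder-by-concern grid.
# Stakeholders, concerns, and cell entries are hypothetical illustrations.
import pandas as pd

stakeholders = ["Job applicants", "Hiring managers", "Vendor", "Regulators"]
concerns = ["Fairness", "Transparency", "Privacy", "Accuracy"]

# Each cell records how much a stakeholder cares about a concern and why;
# in practice, auditors fill these in through interviews and workshops.
matrix = pd.DataFrame(data="", index=stakeholders, columns=concerns)
matrix.loc["Job applicants", "Fairness"] = "High: disparate error rates cost jobs"
matrix.loc["Regulators", "Transparency"] = "High: explainability requirements"
matrix.loc["Vendor", "Accuracy"] = "High: product validity claims"

print(matrix.to_string())
```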

ORI Foresight Into AI Ethics

Learning Objective: At the end of the session, students should understand the components of the FAIE toolkit, what outcome they can expect (as illustrated in the BC Safety Authority report), and what components, information, and people are required to conduct the process.

  • BC Safety Authority - using ORI FAIE Process, https://openroboethics.org/wp-content/uploads/2019/07/generation-r-report-for-technical-safety-bc.pdf

  • ORI Foresight into AI Ethics toolkit, https://openroboethics.org/wp-content/uploads/2021/07/ORI-Foresight-into-Artificial-Intelligence-Ethics-Launch-V1.pdf

AI Impact Assessment

Learning Objective: At the end of the session, students should understand what the algorithmic impact assessment tool is, how it relates to the Directive on Automated Decision-Making, and be encouraged to reflect on the validity, thoughtfulness, and utility of the AIA assessments that have already been completed.
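
The AIA is a questionnaire whose scored answers map to an impact level (I-IV) that determines obligations under the Directive on Automated Decision-Making. Below is a minimal sketch of that scoring idea; the questions, weights, and thresholds are hypothetical placeholders, not the official AIA scoring.

```python
# A minimal sketch of the scoring idea behind the Algorithmic Impact Assessment:
# weighted questionnaire answers sum to a score that maps to an impact level.
# Questions, weights, and thresholds below are hypothetical placeholders.
answers = {
    "decision_affects_rights": 4,    # hypothetical weighted answer
    "decision_is_reversible": 1,     # hypothetical weighted answer
    "uses_personal_information": 3,  # hypothetical weighted answer
}
raw_score, max_score = sum(answers.values()), 12  # hypothetical maximum

fraction = raw_score / max_score
thresholds = [(0.25, "I"), (0.50, "II"), (0.75, "III"), (1.00, "IV")]  # hypothetical cut-offs
level = next(lvl for cut, lvl in thresholds if fraction <= cut)
print(f"score {raw_score}/{max_score} -> impact level {level}")
```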

Deepfake Audit

Learning Objective: At the end of the session, students should know what FakeFinder is, understand what an audit report on a novel system such as FakeFinder looks like, and see how different AI ethics assessment frameworks/tools can be mixed and matched.

  • “AI Assurance Audit of FakeFinder, an Open-Source Deepfake Detection Tool.” IQT Labs, 2021. https://assets.iqt.org/pdfs/IQTLabs_AiA_FakeFinderAudit_DISTRO__1_.pdf/web/viewer.html

  • FakeFinder, https://github.com/IQTLabs/FakeFinder

Algorithmic Audits

Dissecting the COMPAS case study

Learning Objective: At the end of the session, students should be familiar with the machine bias issues ProPublica raised, how real human lives have been impacted according to Angwin et al., and how the bias investigations were conducted differently by the ProPublica and Northpointe teams.

  • Jeff Larson, Surya Mattu, Lauren Kirchner, and Julia Angwin, ‘How We Analyzed the COMPAS Recidivism Algorithm’, ProPublica. https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm

  • Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner, ‘Machine Bias’, ProPublica. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

  • GitHub repo of ProPublica's COMPAS analysis, https://github.com/propublica/compas-analysis

  • Northpointe's official rebuttal to ProPublica -- Dieterich et al., COMPAS Risk Scales: Demonstrating Accuracy Equity and Predictive Parity, https://go.volarisgroup.com/rs/430-MBX-989/images/ProPublica_Commentary_Final_070616.pdf
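
The methodology article above explains how ProPublica compared error rates across racial groups. Below is a minimal sketch of that disaggregated error-rate computation, assuming the compas-scores-two-years.csv file from the repo above, its column names (`race`, `decile_score`, `two_year_recid`), and a "high risk" cut-off at decile scores above 4, roughly following ProPublica's grouping.

```python
# A minimal sketch of ProPublica-style disaggregated error rates.
# Assumes compas-scores-two-years.csv from the propublica/compas-analysis repo,
# with columns `race`, `decile_score`, and `two_year_recid`.
import pandas as pd

df = pd.read_csv("compas-scores-two-years.csv")
df["high_risk"] = df["decile_score"] > 4  # medium/high scores treated as "high risk"

for race, g in df.groupby("race"):
    no_recid = g[g["two_year_recid"] == 0]
    recid = g[g["two_year_recid"] == 1]
    fpr = no_recid["high_risk"].mean()  # labeled high risk, did not reoffend
    fnr = (~recid["high_risk"]).mean()  # labeled low risk, did reoffend
    print(f"{race:25s} n={len(g):5d}  FPR={fpr:.2f}  FNR={fnr:.2f}")
```

The ProPublica/Northpointe dispute hinges on which of these quantities should be equal across groups: ProPublica focused on FPR/FNR gaps, while Northpointe's rebuttal emphasized predictive parity of the scores.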

Discovering algorithmic bias in existing systems

Learning Objective: At the end of the session, students should understand how bias and discrimination issues can be discovered and empirically demonstrated, and what they would need to do to conduct algorithmic fairness investigations of their own projects in the future.

  • Buolamwini, Joy, and Timnit Gebru. “Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification.” In Proceedings of the 1st Conference on Fairness, Accountability and Transparency, 77–91. PMLR, 2018. https://proceedings.mlr.press/v81/buolamwini18a.html

  • Obermeyer, Ziad, Brian Powers, Christine Vogeli, and Sendhil Mullainathan. “Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations.” Science 366, no. 6464 (October 25, 2019): 447–53. https://doi.org/10.1126/science.aax2342
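
Both papers rest on the same basic move: disaggregate an overall performance metric by (intersections of) demographic groups. Below is a minimal sketch of that evaluation; the column names and rows are hypothetical placeholders, not data from either study.

```python
# A minimal sketch of a Gender Shades-style disaggregated evaluation:
# accuracy is computed per intersectional subgroup rather than overall.
# Column names and rows are hypothetical placeholders.
import pandas as pd

df = pd.DataFrame({
    "gender":    ["female", "female", "male", "male", "female", "male"],
    "skin_type": ["darker", "lighter", "darker", "lighter", "darker", "darker"],
    "y_true":    [1, 0, 1, 1, 1, 0],
    "y_pred":    [0, 0, 1, 1, 0, 0],
})

# Overall accuracy hides subgroup gaps ...
print("overall:", (df["y_true"] == df["y_pred"]).mean())

# ... which disaggregating by intersectional subgroup reveals.
df["correct"] = df["y_true"] == df["y_pred"]
print(df.groupby(["gender", "skin_type"])["correct"].mean())
```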

Fairness metrics and tools

Learning Objective: At the end of the session, students should have an understanding of what fairness toolkits exist, where to find them, and what the practical limitations of using them in the real world are.

  • “Exploring How Machine Learning Practitioners (Try To) Use Fairness Toolkits.” In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’22). https://dl.acm.org/doi/10.1145/3531146.3533113

  • Hasan, Ali, Shea Brown, Jovana Davidovic, Benjamin Lange, and Mitt Regan. “Algorithmic Bias and Risk Assessments: Lessons from Practice.” Digital Society 1, no. 2 (August 19, 2022): 14. https://doi.org/10.1007/s44206-022-00017-z.
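
As one concrete example of the kind of toolkit the FAccT ’22 paper studies practitioners using, Fairlearn's MetricFrame computes per-group metrics and between-group gaps. A minimal sketch with hypothetical labels and groups:

```python
# A minimal sketch of Fairlearn's MetricFrame on hypothetical data:
# it evaluates metrics per sensitive group and reports between-group gaps.
from fairlearn.metrics import MetricFrame
from sklearn.metrics import accuracy_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
group  = ["A", "A", "A", "A", "B", "B", "B", "B"]

mf = MetricFrame(
    metrics={"accuracy": accuracy_score, "recall": recall_score},
    y_true=y_true,
    y_pred=y_pred,
    sensitive_features=group,
)
print(mf.by_group)      # per-group metric values
print(mf.difference())  # largest between-group gap per metric
```

The Hasan et al. paper is a useful counterweight here: getting numbers like these out of a toolkit is the easy part; deciding which metric matters for a given deployment is where practice gets hard.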

Understanding the nuanced impact of presumably fair models

Learning Objective: At the end of the session, students should understand what we miss when we look at fairness through a purely algorithmic/statistical lens -- which is what many of the other articles in this lecture do.

  • Sorelle A. Friedler, Carlos Scheidegger, and Suresh Venkatasubramanian. 2021. The (Im)Possibility of Fairness: Different Value Systems Require Different Mechanisms for Fair Decision Making. Commun. ACM 64, 4 (2021), 136–143., https://dl.acm.org/doi/10.1145/3433949

  • M. Jorgensen, H. Richert, E. Black, N. Criado, and J. Such, ‘Not So Fair: The Impact of Presumably Fair Machine Learning Models’, in Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society (AIES ’23). New York, NY, USA: Association for Computing Machinery, Aug. 2023, pp. 297–311. https://doi.org/10.1145/3600211.3604699
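
One way to see the tension Friedler et al. analyze is a small numeric example: if two groups have different base rates, a classifier with identical error rates across groups (equalized odds) cannot also have equal selection rates (demographic parity). A minimal illustration with hypothetical numbers:

```python
# A minimal numeric illustration of conflicting fairness criteria.
# Base rates and error rates below are hypothetical.

base_rate = {"A": 0.5, "B": 0.2}  # fraction of truly positive individuals per group
tpr, fpr = 0.8, 0.1               # identical error rates for both groups (equalized odds)

for g, p in base_rate.items():
    # Selection rate implied by equalized odds with these base rates:
    selection_rate = tpr * p + fpr * (1 - p)
    print(f"group {g}: selection rate = {selection_rate:.2f}")
# Equal TPR/FPR plus unequal base rates forces unequal selection rates,
# so demographic parity necessarily fails.
```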

Data Quality

Learning Objective: At the end of the session, students should have a working knowledge of how social-minded measures of data quality relate to the algorithmic bias issue, what fairness regulatory standards exist, and whether strategies to augment or change data quality lead to 'fairer' results.

  • Evaggelia Pitoura. 2020. Social-minded Measures of Data Quality: Fairness, Diversity, and Lack of Bias. Journal of Data and Information Quality 12, 3: 12:1-12:8. https://doi.org/10.1145/3404193

  • Ioannis Pastaltzidis, Nikolaos Dimitriou, Katherine Quezada-Tavarez, Stergios Aidinlis, Thomas Marquenie, Agata Gurzawska, and Dimitrios Tzovaras. 2022. Data augmentation for fairness-aware machine learning: Preventing algorithmic bias in law enforcement systems. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT '22). Association for Computing Machinery, New York, NY, USA, 2302–2314. https://doi.org/10.1145/3531146.3534644
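
Below is a minimal sketch of the rebalancing intuition behind fairness-aware data augmentation -- here plain oversampling of the underrepresented group, which is much simpler than the synthetic-data strategies in Pastaltzidis et al.; the data is hypothetical.

```python
# A minimal sketch of rebalancing a dataset by oversampling an
# underrepresented group. Data is hypothetical.
import pandas as pd

df = pd.DataFrame({
    "group": ["A"] * 90 + ["B"] * 10,
    "label": [0, 1] * 45 + [0, 1] * 5,
})

# Resample every group up to the size of the largest one.
target = df["group"].value_counts().max()
balanced = pd.concat(
    [g.sample(target, replace=True, random_state=0) for _, g in df.groupby("group")],
    ignore_index=True,
)
print(df["group"].value_counts().to_dict())        # e.g. {'A': 90, 'B': 10}
print(balanced["group"].value_counts().to_dict())  # e.g. {'A': 90, 'B': 90}
```

Whether such rebalancing actually yields fairer downstream models is exactly the empirical question the readings interrogate.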

Dataset Cartography

Learning Objective: At the end of the session, students should understand how data quality can be measured, what dataset cartography is, and how it can be used to assess the quality of training datasets.

  • S. Swayamdipta et al., ‘Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics’. arXiv, Oct. 15, 2020. https://doi.org/10.48550/arXiv.2009.10795

  • Dataset Cartography GitHub repo, https://github.com/allenai/cartography
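
The data maps in Swayamdipta et al. are built from two per-example statistics logged during training: confidence (the mean probability the model assigns to the gold label across epochs) and variability (its standard deviation). A minimal sketch, with hypothetical logged probabilities standing in for real training dynamics:

```python
# A minimal sketch of the data-map statistics from Swayamdipta et al.
# gold_probs[i, e] = P(gold label of example i) at the end of epoch e.
# The values are hypothetical stand-ins for logged training dynamics.
import numpy as np

gold_probs = np.array([
    [0.90, 0.95, 0.97, 0.99],  # easy-to-learn: high confidence, low variability
    [0.20, 0.50, 0.80, 0.40],  # ambiguous: high variability
    [0.10, 0.05, 0.10, 0.08],  # hard-to-learn: low confidence, low variability
])

confidence = gold_probs.mean(axis=1)
variability = gold_probs.std(axis=1)
for i, (c, v) in enumerate(zip(confidence, variability)):
    print(f"example {i}: confidence={c:.2f} variability={v:.2f}")
```

Plotting every training example in this confidence-variability plane is what produces the "map"; hard-to-learn regions often flag labeling errors.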

Empirical evaluation of data quality

Learning Objective: At the end of the session, students should understand the components of data quality metrics, how data quality can be affected by data collection, what kinds of augmentation techniques exist, and how benchmark datasets are used to empirically investigate dataset quality.

  • Leo L. Pipino, Yang W. Lee, and Richard Y. Wang. 2002. Data quality assessment. Commun. ACM 45, 4 (April 2002), 211–218. https://doi.org/10.1145/505248.506010

  • Rachel Hong, Tadayoshi Kohno, and Jamie Morgenstern. 2023. Evaluation of targeted dataset collection on racial equity in face recognition. In Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, 531–541. https://doi.org/10.1145/3600211.3604662
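
Pipino et al.'s "simple ratio" metrics measure quality as desirable outcomes divided by total outcomes. A minimal sketch for completeness and a syntactic accuracy rule, on hypothetical records:

```python
# A minimal sketch of "simple ratio" data quality metrics in the spirit of
# Pipino et al. The toy records and validity rule are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "age":   [34, None, 29, 151, 42],
    "email": ["a@x.com", "b@x.com", None, "d@x.com", "e@x.com"],
})

# Completeness: fraction of non-missing cells.
completeness = 1 - df.isna().sum().sum() / df.size

# Syntactic accuracy for `age`: fraction of present values passing a sanity rule.
valid_age = df["age"].between(0, 120)
accuracy_age = valid_age.sum() / df["age"].notna().sum()

print(f"completeness={completeness:.2f}, age accuracy={accuracy_age:.2f}")
```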

Dataset imbalance problems

Learning Objective: At the end of the session, students should understand what the dataset imbalance problem is, how it affects ML systems, and whether/how balancing the data solves the problem.

  • Brownlee, J. A gentle introduction to imbalanced classification, https://machinelearningmastery.com/what-is-imbalanced-classification/

  • V. Cherepanova et al., ‘A Deep Dive into Dataset Imbalance and Bias in Face Identification’, in Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, in AIES ’23. New York, NY, USA: Association for Computing Machinery, Aug. 2023, pp. 229–247. https://doi.org/10.1145/3600211.3604691
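
Below is a minimal sketch of the kind of experiment Brownlee's introduction motivates: on a synthetic imbalanced dataset, compare minority-class recall with and without class weighting, one of several standard remedies.

```python
# A minimal sketch of class weighting on an imbalanced problem.
# The dataset is synthetic; numbers will vary with the random seed.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# 95%/5% class split: a plain classifier tends to ignore the rare class.
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

for cw in (None, "balanced"):
    clf = LogisticRegression(class_weight=cw).fit(X_tr, y_tr)
    rec = recall_score(y_te, clf.predict(X_te))  # recall on the rare class
    print(f"class_weight={cw}: minority recall={rec:.2f}")
```

The Cherepanova et al. paper complicates this picture: in face identification, naive balancing does not straightforwardly remove the bias.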

Problems with benchmark datasets

Learning Objective: At the end of this session, students should be able to think critically about how benchmark datasets are used in the ML community, and how evaluation of models must match the needs of real-world deployment.

  • Tsipras, D., Santurkar, S., Engstrom, L., Ilyas, A., & Madry, A. (2020). From ImageNet to Image Classification: Contextualizing Progress on Benchmarks (arXiv:2005.11295). arXiv. https://doi.org/10.48550/arXiv.2005.11295

  • Su Lin Blodgett, Gilsinia Lopez, Alexandra Olteanu, Robert Sim, and Hanna Wallach. 2021. Stereotyping Norwegian Salmon: An Inventory of Pitfalls in Fairness Benchmark Datasets. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 1004–1015. https://doi.org/10.18653/v1/2021.acl-long.81
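
A minimal sketch of one pitfall Tsipras et al. document: ImageNet images often contain several valid objects, so strict top-1 accuracy against a single gold label can penalize predictions a human would accept. The labels below are hypothetical.

```python
# A minimal illustration of single-label vs multi-label evaluation.
# Predictions and labels are hypothetical.
predictions   = ["dog", "keyboard", "car"]
single_labels = ["dog", "desk", "truck"]                                  # one gold label each
multi_labels  = [{"dog", "ball"}, {"desk", "keyboard", "mouse"}, {"truck"}]  # all valid objects

top1  = sum(p == s for p, s in zip(predictions, single_labels)) / len(predictions)
multi = sum(p in m for p, m in zip(predictions, multi_labels)) / len(predictions)
print(f"accuracy vs single label: {top1:.2f}")     # 0.33
print(f"accuracy vs multi-label set: {multi:.2f}")  # 0.67
```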