A deeper look at audits

Process-oriented Audits

The Ethical Matrix

Learning Objective: At the end of the session, students should know how to use the Ethical Matrix to audit a system and understand its strengths and limitations, as illustrated through the HireVue audit report.

  • HireVue Audit Report, https://www.hirevue.com/resources/template/orcaa-report

  • Cathy O’Neil and Hanna Gunn. 2020. Near-Term Artificial Intelligence and the Ethical Matrix. In Ethics of Artificial Intelligence, S. Matthew Liao (ed.). Oxford University Press. https://doi.org/10.1093/oso/9780190905033.003.0009
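
At its core, O’Neil and Gunn’s Ethical Matrix is a stakeholder-by-concern grid that auditors fill in collaboratively. Below is a minimal sketch of that structure in Python; the stakeholders, concerns, and cell entries are hypothetical illustrations loosely inspired by a hiring-algorithm audit, not taken from the HireVue report itself.

```python
# A minimal sketch of an Ethical Matrix as a stakeholder-by-concern grid.
# Stakeholders, concerns, and cell entries are hypothetical illustrations.
import pandas as pd

stakeholders = ["Job applicants", "Hiring managers", "Vendor", "Regulators"]
concerns = ["Fairness", "Transparency", "Privacy", "Accuracy"]

# Each cell records how much a stakeholder cares about a concern and why;
# in practice, auditors fill these in through interviews and workshops.
matrix = pd.DataFrame(data="", index=stakeholders, columns=concerns)
matrix.loc["Job applicants", "Fairness"] = "High: disparate error rates cost jobs"
matrix.loc["Regulators", "Transparency"] = "High: explainability requirements"
matrix.loc["Vendor", "Accuracy"] = "High: product validity claims"

print(matrix.to_string())
```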

ORI Foresight Into AI Ethics

Learning Objective: At the end of the session, students should understand the components of the FAIE toolkit, what outcome they can expect (as illustrated in the BC Safety Authority report), and what components, information, and people are required to conduct the process.

  • BC Safety Authority - using ORI FAIE Process, https://openroboethics.org/wp-content/uploads/2019/07/generation-r-report-for-technical-safety-bc.pdf

  • ORI Foresight into AI Ethics toolkit, https://openroboethics.org/wp-content/uploads/2021/07/ORI-Foresight-into-Artificial-Intelligence-Ethics-Launch-V1.pdf

AI Impact Assessment

Learning Objective: At the end of the session, students should understand what the algorithmic impact assessment tool is, how it relates to the Directive on Automated Decision-Making, and be encouraged to reflect on the validity, thoughtfulness, and utility of the AIA assessments that have already been completed.
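
The AIA is a questionnaire whose scored answers map to an impact level (I-IV) that determines obligations under the Directive on Automated Decision-Making. Below is a minimal sketch of that scoring idea; the questions, weights, and thresholds are hypothetical placeholders, not the official AIA scoring.

```python
# A minimal sketch of the scoring idea behind the Algorithmic Impact Assessment:
# weighted questionnaire answers sum to a score that maps to an impact level.
# Questions, weights, and thresholds below are hypothetical placeholders.
answers = {
    "decision_affects_rights": 4,    # hypothetical weighted answer
    "decision_is_reversible": 1,     # hypothetical weighted answer
    "uses_personal_information": 3,  # hypothetical weighted answer
}
raw_score, max_score = sum(answers.values()), 12  # hypothetical maximum

fraction = raw_score / max_score
thresholds = [(0.25, "I"), (0.50, "II"), (0.75, "III"), (1.00, "IV")]  # hypothetical cut-offs
level = next(lvl for cut, lvl in thresholds if fraction <= cut)
print(f"score {raw_score}/{max_score} -> impact level {level}")
```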

Deepfake Audit

Learning Objective: At the end of the session, students should know what FakeFinder is, understand what an audit report on a novel system such as FakeFinder looks like, and see how different AI ethics assessment frameworks/tools can be mixed and matched.

  • “AI Assurance Audit of FakeFinder, an Open-Source Deepfake Detection Tool.” IQT Labs, 2021. https://assets.iqt.org/pdfs/IQTLabs_AiA_FakeFinderAudit_DISTRO__1_.pdf/web/viewer.html

  • FakeFinder, https://github.com/IQTLabs/FakeFinder

Algorithmic Audits

Dissecting the COMPAS case study

Learning Objective: At the end of the session, students should be familiar with the machine bias issues ProPublica raised, how real human lives have been impacted according to Angwin et al., and how the bias investigations were conducted differently by the ProPublica and Northpointe teams.

  • Jeff Larson, Surya Mattu, Lauren Kirchner, and Julia Angwin, ‘How We Analyzed the COMPAS Recidivism Algorithm’, ProPublica. https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm

  • Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner, ‘Machine Bias’, ProPublica. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

  • GitHub repo of ProPublica's COMPAS analysis, https://github.com/propublica/compas-analysis

  • Northpointe's official rebuttal to ProPublica -- Dieterich et al., COMPAS Risk Scales: Demonstrating Accuracy Equity and Predictive Parity, https://go.volarisgroup.com/rs/430-MBX-989/images/ProPublica_Commentary_Final_070616.pdf
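
The methodology article above explains how ProPublica compared error rates across racial groups. Below is a minimal sketch of that disaggregated error-rate computation, assuming the compas-scores-two-years.csv file from the repo above, its column names (`race`, `decile_score`, `two_year_recid`), and a "high risk" cut-off at decile scores above 4, roughly following ProPublica's grouping.

```python
# A minimal sketch of ProPublica-style disaggregated error rates.
# Assumes compas-scores-two-years.csv from the propublica/compas-analysis repo,
# with columns `race`, `decile_score`, and `two_year_recid`.
import pandas as pd

df = pd.read_csv("compas-scores-two-years.csv")
df["high_risk"] = df["decile_score"] > 4  # medium/high scores treated as "high risk"

for race, g in df.groupby("race"):
    no_recid = g[g["two_year_recid"] == 0]
    recid = g[g["two_year_recid"] == 1]
    fpr = no_recid["high_risk"].mean()  # labeled high risk, did not reoffend
    fnr = (~recid["high_risk"]).mean()  # labeled low risk, did reoffend
    print(f"{race:25s} n={len(g):5d}  FPR={fpr:.2f}  FNR={fnr:.2f}")
```

The ProPublica/Northpointe dispute hinges on which of these quantities should be equal across groups: ProPublica focused on FPR/FNR gaps, while Northpointe's rebuttal emphasized predictive parity of the scores.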

Discovering algorithmic bias in existing systems

Learning Objective: At the end of the session, students should understand how bias and discrimination issues can be discovered and empirically demonstrated, and what they would need to do to conduct algorithmic fairness investigations of their own projects in the future.

  • Buolamwini, Joy, and Timnit Gebru. “Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification.” In Proceedings of the 1st Conference on Fairness, Accountability and Transparency, 77–91. PMLR, 2018. https://proceedings.mlr.press/v81/buolamwini18a.html

  • Obermeyer, Ziad, Brian Powers, Christine Vogeli, and Sendhil Mullainathan. “Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations.” Science 366, no. 6464 (October 25, 2019): 447–53. https://doi.org/10.1126/science.aax2342
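
Both papers rest on the same basic move: disaggregate an overall performance metric by (intersections of) demographic groups. Below is a minimal sketch of that evaluation; the column names and rows are hypothetical placeholders, not data from either study.

```python
# A minimal sketch of a Gender Shades-style disaggregated evaluation:
# accuracy is computed per intersectional subgroup rather than overall.
# Column names and rows are hypothetical placeholders.
import pandas as pd

df = pd.DataFrame({
    "gender":    ["female", "female", "male", "male", "female", "male"],
    "skin_type": ["darker", "lighter", "darker", "lighter", "darker", "darker"],
    "y_true":    [1, 0, 1, 1, 1, 0],
    "y_pred":    [0, 0, 1, 1, 0, 0],
})

# Overall accuracy hides subgroup gaps ...
print("overall:", (df["y_true"] == df["y_pred"]).mean())

# ... which disaggregating by intersectional subgroup reveals.
df["correct"] = df["y_true"] == df["y_pred"]
print(df.groupby(["gender", "skin_type"])["correct"].mean())
```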

Fairness metrics and tools

Learning Objective: At the end of the session, students should have an understanding of what fairness toolkits exist, where to find them, and what the practical limitations of using them in the real world are.

  • “Exploring How Machine Learning Practitioners (Try To) Use Fairness Toolkits.” In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’22). https://dl.acm.org/doi/10.1145/3531146.3533113

  • Hasan, Ali, Shea Brown, Jovana Davidovic, Benjamin Lange, and Mitt Regan. “Algorithmic Bias and Risk Assessments: Lessons from Practice.” Digital Society 1, no. 2 (August 19, 2022): 14. https://doi.org/10.1007/s44206-022-00017-z.
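
As one concrete example of the kind of toolkit the FAccT ’22 paper studies practitioners using, Fairlearn's MetricFrame computes per-group metrics and between-group gaps. A minimal sketch with hypothetical labels and groups:

```python
# A minimal sketch of Fairlearn's MetricFrame on hypothetical data:
# it evaluates metrics per sensitive group and reports between-group gaps.
from fairlearn.metrics import MetricFrame
from sklearn.metrics import accuracy_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
group  = ["A", "A", "A", "A", "B", "B", "B", "B"]

mf = MetricFrame(
    metrics={"accuracy": accuracy_score, "recall": recall_score},
    y_true=y_true,
    y_pred=y_pred,
    sensitive_features=group,
)
print(mf.by_group)      # per-group metric values
print(mf.difference())  # largest between-group gap per metric
```

The Hasan et al. paper is a useful counterweight here: getting numbers like these out of a toolkit is the easy part; deciding which metric matters for a given deployment is where practice gets hard.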

Understanding the nuanced impact of presumably fair models

Learning Objective: At the end of the session, students should understand what we miss when we look at fairness through a purely algorithmic/statistical lens -- which is what many of the other articles in this lecture do.

  • Sorelle A. Friedler, Carlos Scheidegger, and Suresh Venkatasubramanian. 2021. The (Im)Possibility of Fairness: Different Value Systems Require Different Mechanisms for Fair Decision Making. Commun. ACM 64, 4 (2021), 136–143., https://dl.acm.org/doi/10.1145/3433949

  • M. Jorgensen, H. Richert, E. Black, N. Criado, and J. Such, ‘Not So Fair: The Impact of Presumably Fair Machine Learning Models’, in Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society (AIES ’23). New York, NY, USA: Association for Computing Machinery, Aug. 2023, pp. 297–311. https://doi.org/10.1145/3600211.3604699
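
One way to see the tension Friedler et al. analyze is a small numeric example: if two groups have different base rates, a classifier with identical error rates across groups (equalized odds) cannot also have equal selection rates (demographic parity). A minimal illustration with hypothetical numbers:

```python
# A minimal numeric illustration of conflicting fairness criteria.
# Base rates and error rates below are hypothetical.

base_rate = {"A": 0.5, "B": 0.2}  # fraction of truly positive individuals per group
tpr, fpr = 0.8, 0.1               # identical error rates for both groups (equalized odds)

for g, p in base_rate.items():
    # Selection rate implied by equalized odds with these base rates:
    selection_rate = tpr * p + fpr * (1 - p)
    print(f"group {g}: selection rate = {selection_rate:.2f}")
# Equal TPR/FPR plus unequal base rates forces unequal selection rates,
# so demographic parity necessarily fails.
```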

Data Quality

Learning Objective: At the end of the session, students should have a working knowledge of how social-minded measures of data quality relate to the algorithmic bias issue, what fairness regulatory standards exist, and whether strategies to augment or change data quality lead to 'fairer' results.

  • Evaggelia Pitoura. 2020. Social-minded Measures of Data Quality: Fairness, Diversity, and Lack of Bias. Journal of Data and Information Quality 12, 3: 12:1-12:8. https://doi.org/10.1145/3404193

  • Ioannis Pastaltzidis, Nikolaos Dimitriou, Katherine Quezada-Tavarez, Stergios Aidinlis, Thomas Marquenie, Agata Gurzawska, and Dimitrios Tzovaras. 2022. Data augmentation for fairness-aware machine learning: Preventing algorithmic bias in law enforcement systems. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT '22). Association for Computing Machinery, New York, NY, USA, 2302–2314. https://doi.org/10.1145/3531146.3534644
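
Below is a minimal sketch of the rebalancing intuition behind fairness-aware data augmentation -- here plain oversampling of the underrepresented group, which is much simpler than the synthetic-data strategies in Pastaltzidis et al.; the data is hypothetical.

```python
# A minimal sketch of rebalancing a dataset by oversampling an
# underrepresented group. Data is hypothetical.
import pandas as pd

df = pd.DataFrame({
    "group": ["A"] * 90 + ["B"] * 10,
    "label": [0, 1] * 45 + [0, 1] * 5,
})

# Resample every group up to the size of the largest one.
target = df["group"].value_counts().max()
balanced = pd.concat(
    [g.sample(target, replace=True, random_state=0) for _, g in df.groupby("group")],
    ignore_index=True,
)
print(df["group"].value_counts().to_dict())        # e.g. {'A': 90, 'B': 10}
print(balanced["group"].value_counts().to_dict())  # e.g. {'A': 90, 'B': 90}
```

Whether such rebalancing actually yields fairer downstream models is exactly the empirical question the readings interrogate.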

Dataset Cartography

Learning Objective: At the end of the session, students should understand how data quality can be measured, what dataset cartography is, and how it can be used to assess the quality of training datasets.

  • S. Swayamdipta et al., ‘Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics’. arXiv, Oct. 15, 2020. https://doi.org/10.48550/arXiv.2009.10795

  • Dataset Cartography GitHub repo, https://github.com/allenai/cartography
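
The data maps in Swayamdipta et al. are built from two per-example statistics logged during training: confidence (the mean probability the model assigns to the gold label across epochs) and variability (its standard deviation). A minimal sketch, with hypothetical logged probabilities standing in for real training dynamics:

```python
# A minimal sketch of the data-map statistics from Swayamdipta et al.
# gold_probs[i, e] = P(gold label of example i) at the end of epoch e.
# The values are hypothetical stand-ins for logged training dynamics.
import numpy as np

gold_probs = np.array([
    [0.90, 0.95, 0.97, 0.99],  # easy-to-learn: high confidence, low variability
    [0.20, 0.50, 0.80, 0.40],  # ambiguous: high variability
    [0.10, 0.05, 0.10, 0.08],  # hard-to-learn: low confidence, low variability
])

confidence = gold_probs.mean(axis=1)
variability = gold_probs.std(axis=1)
for i, (c, v) in enumerate(zip(confidence, variability)):
    print(f"example {i}: confidence={c:.2f} variability={v:.2f}")
```

Plotting every training example in this confidence-variability plane is what produces the "map"; hard-to-learn regions often flag labeling errors.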

Empirical evaluation of data quality

Learning Objective: At the end of the session, students should understand the components of data quality metrics, how data quality can be affected by data collection, what kinds of augmentation techniques exist, and how benchmark datasets are used to empirically investigate dataset quality.

  • Leo L. Pipino, Yang W. Lee, and Richard Y. Wang. 2002. Data quality assessment. Commun. ACM 45, 4 (April 2002), 211–218. https://doi.org/10.1145/505248.506010

  • Rachel Hong, Tadayoshi Kohno, and Jamie Morgenstern. 2023. Evaluation of targeted dataset collection on racial equity in face recognition. In Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, 531–541. https://doi.org/10.1145/3600211.3604662
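
Pipino et al.'s "simple ratio" metrics measure quality as desirable outcomes divided by total outcomes. A minimal sketch for completeness and a syntactic accuracy rule, on hypothetical records:

```python
# A minimal sketch of "simple ratio" data quality metrics in the spirit of
# Pipino et al. The toy records and validity rule are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "age":   [34, None, 29, 151, 42],
    "email": ["a@x.com", "b@x.com", None, "d@x.com", "e@x.com"],
})

# Completeness: fraction of non-missing cells.
completeness = 1 - df.isna().sum().sum() / df.size

# Syntactic accuracy for `age`: fraction of present values passing a sanity rule.
valid_age = df["age"].between(0, 120)
accuracy_age = valid_age.sum() / df["age"].notna().sum()

print(f"completeness={completeness:.2f}, age accuracy={accuracy_age:.2f}")
```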

Dataset imbalance problems

Learning Objective: At the end of the session, students should understand what the dataset imbalance problem is, how it affects ML systems, and whether/how balancing the data solves the problem.

  • Brownlee, J. A gentle introduction to imbalanced classification, https://machinelearningmastery.com/what-is-imbalanced-classification/

  • V. Cherepanova et al., ‘A Deep Dive into Dataset Imbalance and Bias in Face Identification’, in Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, in AIES ’23. New York, NY, USA: Association for Computing Machinery, Aug. 2023, pp. 229–247. https://doi.org/10.1145/3600211.3604691
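
Below is a minimal sketch of the kind of experiment Brownlee's introduction motivates: on a synthetic imbalanced dataset, compare minority-class recall with and without class weighting, one of several standard remedies.

```python
# A minimal sketch of class weighting on an imbalanced problem.
# The dataset is synthetic; numbers will vary with the random seed.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# 95%/5% class split: a plain classifier tends to ignore the rare class.
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

for cw in (None, "balanced"):
    clf = LogisticRegression(class_weight=cw).fit(X_tr, y_tr)
    rec = recall_score(y_te, clf.predict(X_te))  # recall on the rare class
    print(f"class_weight={cw}: minority recall={rec:.2f}")
```

The Cherepanova et al. paper complicates this picture: in face identification, naive balancing does not straightforwardly remove the bias.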

Problems with benchmark datasets

Learning Objective: At the end of this session, students should be able to think critically about how benchmark datasets are used in the ML community, and how evaluation of models must match the needs of real-world deployment.

  • Tsipras, D., Santurkar, S., Engstrom, L., Ilyas, A., & Madry, A. (2020). From ImageNet to Image Classification: Contextualizing Progress on Benchmarks (arXiv:2005.11295). arXiv. https://doi.org/10.48550/arXiv.2005.11295

  • Su Lin Blodgett, Gilsinia Lopez, Alexandra Olteanu, Robert Sim, and Hanna Wallach. 2021. Stereotyping Norwegian Salmon: An Inventory of Pitfalls in Fairness Benchmark Datasets. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 1004–1015. https://doi.org/10.18653/v1/2021.acl-long.81
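
A minimal sketch of one pitfall Tsipras et al. document: ImageNet images often contain several valid objects, so strict top-1 accuracy against a single gold label can penalize predictions a human would accept. The labels below are hypothetical.

```python
# A minimal illustration of single-label vs multi-label evaluation.
# Predictions and labels are hypothetical.
predictions   = ["dog", "keyboard", "car"]
single_labels = ["dog", "desk", "truck"]                                  # one gold label each
multi_labels  = [{"dog", "ball"}, {"desk", "keyboard", "mouse"}, {"truck"}]  # all valid objects

top1  = sum(p == s for p, s in zip(predictions, single_labels)) / len(predictions)
multi = sum(p in m for p, m in zip(predictions, multi_labels)) / len(predictions)
print(f"accuracy vs single label: {top1:.2f}")     # 0.33
print(f"accuracy vs multi-label set: {multi:.2f}")  # 0.67
```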