2024 Research Review
AI Robustness (AIR)
Modern data analytic methods and tools, including Artificial Intelligence (AI) and Machine Learning (ML) models, depend on correlations; however, such approaches fail to account for confounding in the data, which prevents accurate modeling of cause and effect and often leads to bias. Edge cases, data and concept drift, and emerging phenomena undermine the significance of the correlations that AI relies on. New test and evaluation methods are therefore needed for ongoing evaluation. The Carnegie Mellon University Software Engineering Institute (CMU SEI) has developed a new AI Robustness (AIR) tool that allows users to gauge AI and ML classifier performance with data-based confidence.
The SEI AIR tool offers a precedent-setting capability to improve the correctness of AI classifications and predictions, increasing confidence in the use of AI in development, testing, and operations decision making.
Linda Parker Gates
Initiative Lead, Software Acquisition Pathways
The AIR tool uses state-of-the-art algorithms and techniques to:
- Build a causal graph (Step 1) that includes the treatment variable (X), the outcome variable (Y), and intermediate variables (M), as well as other variables in the dataset.
- Determine adjustment sets Z1 and Z2 (Step 2): selected ancestors of X and M, respectively, that do not lie on any direct path from X to Y, such that conditioning on either set removes the confounding effects that other variables have on X and Y.
- Finally, estimate the causal effect that changing X has on Y (Step 3) by calculating the average risk difference and the associated 95% confidence interval for each adjustment set. AIR then compares the AI classifier's predictions to these two confidence intervals to determine whether the classifier is significantly biased and should be retrained.
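The estimation step above can be sketched in miniature. The following is a simplified, hypothetical illustration (not the AIR tool's implementation): it simulates binary data with a single confounder Z, estimates the adjusted average risk difference of X on Y via back-door stratification on Z, bootstraps a 95% confidence interval, and flags a classifier whose implied risk difference falls outside that interval. All variable names, simulation parameters, and the classifier value are invented for the example.

```python
import random

def simulate(n, seed=0):
    """Simulate binary (X, Y, Z) data where Z confounds both X and Y.
    The true causal risk difference of X on Y is 0.2 by construction
    (hypothetical numbers chosen for illustration only)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z = int(rng.random() < 0.5)
        x = int(rng.random() < (0.7 if z else 0.3))   # Z influences X
        p_y = 0.2 + 0.2 * x + 0.3 * z                 # X raises P(Y=1) by 0.2
        y = int(rng.random() < p_y)
        data.append((x, y, z))
    return data

def adjusted_risk_difference(data):
    """Back-door adjustment: stratify on Z, then average
    P(Y=1 | X=1, Z=z) - P(Y=1 | X=0, Z=z) weighted by P(Z=z)."""
    rd, n = 0.0, len(data)
    for z in (0, 1):
        stratum = [(x, y) for x, y, zz in data if zz == z]
        if not stratum:
            continue
        pz = len(stratum) / n
        for x_val, sign in ((1, +1), (0, -1)):
            ys = [y for x, y in stratum if x == x_val]
            if ys:
                rd += sign * pz * (sum(ys) / len(ys))
    return rd

def bootstrap_ci(data, stat, reps=200, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for a statistic."""
    rng = random.Random(seed)
    stats = sorted(stat([rng.choice(data) for _ in data])
                   for _ in range(reps))
    return stats[int(alpha / 2 * reps)], stats[int((1 - alpha / 2) * reps) - 1]

data = simulate(5000)
rd = adjusted_risk_difference(data)
lo, hi = bootstrap_ci(data, adjusted_risk_difference)
print(f"adjusted risk difference = {rd:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")

# A classifier whose implied risk difference falls outside the CI
# would be flagged as significantly biased (value is hypothetical).
classifier_rd = 0.45
print("flag for retraining:", not (lo <= classifier_rd <= hi))
```

Stratifying on the confounder before differencing is what distinguishes the causal estimate from a naive correlation: computing P(Y=1 | X=1) - P(Y=1 | X=0) directly on this data would overstate the effect, because Z raises both X and Y.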
This project is sponsored and funded by the Office of the Under Secretary of Defense for Research and Engineering, OUSD(R&E), to transition the AIR tool to users across the DoD to help identify when and where their AI classifiers are giving biased predictions. We are looking for DoD collaborators to use and provide feedback on our technology. As a participant, your AI and subject-matter experts will work with our team to identify known causal relationships and build out an initial causal graph. Contact us at info@sei.cmu.edu to learn more.
Figure 1: Steps in the AIR Tool Analysis Process. Results and interpretations given by the AIR tool are based on output from all three steps.
In Context: This FY2024-26 Project
- is a collaborative effort among researchers from the SEI and faculty members at Carnegie Mellon University, building on knowledge gained from previous SEI projects, including the FY22 Maturation of Determining the Limits of AI Robustness (MDLAR) research project
- leverages SEI expertise and experience in necessary data science/programming, mathematical statistics, causal discovery, and causal inference
- aligns with the CMU SEI technical objective to modernize software engineering and acquisition (codify AI engineering practices; improve designed-in trustworthiness) and addresses the SEI’s primary enduring challenges of trustworthiness, capability, and timeliness
- aligns with the OUSD(R&E) critical technology priority of building trusted AI and autonomy
Principal Investigator
Linda Parker Gates
Initiative Lead, Software Acquisition Pathways
SEI Collaborators
Dr. Michael Konrad
Principal Researcher
Dr. Nicholas Testa
Senior Data Scientist
Melissa Ludwick
Member of the Technical Staff
CMU Collaborators
Dr. Joe Ramsey
Special Faculty and Director of Research Computing
Edward H. Kennedy
Associate Professor, Department of Statistics and Data Science
External Collaborators
Dr. Elias Bareinboim
Associate Professor of Computer Science and Director of the Causal Artificial Intelligence Lab
Columbia University
Have a Question?
Reach out to us at info@sei.cmu.edu.
Mentioned in this Article
Can You Rely on Your AI? Applying the AIR Tool - Software Engineering Institute (SEI) Webcast Series - Apple Podcasts: https://podcasts.apple.com/us/podcast/can-you-rely-on-your-ai-applying-the-air-tool-to/id924045987?i=1000657439121
Measuring AI Accuracy with the AI Robustness (AIR) Tool Blog: https://insights.sei.cmu.edu/blog/measuring-ai-accuracy-with-the-ai-robustness-air-tool/
Center for Causal Discovery Summer Short Course: https://www.youtube.com/watch?v=9yEYZURoE3Y
J. Pearl, M. Glymour, N. Jewell, “Causal Inference in Statistics: A Primer.” Wiley. ISBN: 978-1-119-18684-7. March 2016.
Hoffman, K. (2020). An Illustrated Guide to TMLE, Part I: Introduction and Motivation. https://www.khstats.com/blog/tmle/tutorial