<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>SEI Blog | Artificial Intelligence Engineering</title><link href="http://sei.cmu.edu/feeds/topic/artificial-intelligence-engineering/atom/?utm_source=blog&amp;utm_medium=rss" rel="alternate"/><link href="http://sei.cmu.edu/feeds/topic/artificial-intelligence-engineering/atom/?utm_source=blog&amp;utm_medium=rss" rel="self"/><id>http://sei.cmu.edu/feeds/topic/artificial-intelligence-engineering/atom/?utm_source=blog&amp;utm_medium=rss</id><updated>2025-09-09T00:00:00-04:00</updated><subtitle>Updates on changes and additions to the SEI Blog for posts matching Artificial Intelligence Engineering</subtitle><entry><title>My AI System Works…But Is It Safe to Use?</title><link href="https://www.sei.cmu.edu/blog/my-ai-system-worksbut-is-it-safe-to-use/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" rel="alternate"/><published>2025-09-09T00:00:00-04:00</published><updated>2025-09-09T00:00:00-04:00</updated><author><name>David Schulker, Matt Walsh, Emil Mathew</name></author><id>https://www.sei.cmu.edu/blog/my-ai-system-worksbut-is-it-safe-to-use/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">This blog post introduces System Theoretic Process Analysis (STPA), a hazard analysis technique uniquely suited to dealing with the complexity of AI systems.</summary></entry><entry><title>Artificial Intelligence in National Security: Acquisition and Integration</title><link href="https://www.sei.cmu.edu/blog/artificial-intelligence-in-national-security-acquisition-and-integration/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" rel="alternate"/><published>2025-08-05T00:00:00-04:00</published><updated>2025-08-05T00:00:00-04:00</updated><author><name>Paige Rishel, Carol Smith, Brigid O'Hearn, Rita 
Creel</name></author><id>https://www.sei.cmu.edu/blog/artificial-intelligence-in-national-security-acquisition-and-integration/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">This blog post highlights practitioner insights from a recent AI Acquisition workshop, including challenges in differentiating AI systems, guidance on when to use AI, and matching AI tools to mission needs.</summary></entry><entry><title>Amplifying AI Readiness in the DoD Workforce</title><link href="https://www.sei.cmu.edu/blog/amplifying-ai-readiness-in-the-dod-workforce/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" rel="alternate"/><published>2025-06-23T00:00:00-04:00</published><updated>2025-06-23T00:00:00-04:00</updated><author><name>Eric Keylor, Robert Beveridge, Jonathan Frederick</name></author><id>https://www.sei.cmu.edu/blog/amplifying-ai-readiness-in-the-dod-workforce/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">The SEI recently partnered with the Department of the Air Force Chief Data and AI Office to develop a strategy to identify and assess hidden workforce talent for data and AI work roles.</summary></entry><entry><title>Out of Distribution Detection: Knowing When AI Doesn't Know</title><link href="https://www.sei.cmu.edu/blog/out-of-distribution-detection-knowing-when-ai-doesnt-know/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" rel="alternate"/><published>2025-06-09T00:00:00-04:00</published><updated>2025-06-09T00:00:00-04:00</updated><author><name>Eric Heim, Cole Frank</name></author><id>https://www.sei.cmu.edu/blog/out-of-distribution-detection-knowing-when-ai-doesnt-know/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">How do we know when an AI system is operating outside its intended knowledge boundaries?</summary></entry><entry><title>10 Things Organizations Should Know About AI Workforce 
Development</title><link href="https://www.sei.cmu.edu/blog/10-things-organizations-should-know-about-ai-workforce-development/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" rel="alternate"/><published>2025-04-28T00:00:00-04:00</published><updated>2025-04-28T00:00:00-04:00</updated><author><name>Jonathan Frederick, Dominic Ross, Eric Keylor, Cole Frank, Intae Nam</name></author><id>https://www.sei.cmu.edu/blog/10-things-organizations-should-know-about-ai-workforce-development/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">This post outlines 10 recommendations developed in response to work with our mission partners in the Department of Defense.</summary></entry><entry><title>DataOps: Towards More Reliable Machine Learning Systems</title><link href="https://www.sei.cmu.edu/blog/dataops-towards-more-reliable-machine-learning-systems/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" rel="alternate"/><published>2025-04-21T00:00:00-04:00</published><updated>2025-04-21T00:00:00-04:00</updated><author><name>Daniel DeCapria</name></author><id>https://www.sei.cmu.edu/blog/dataops-towards-more-reliable-machine-learning-systems/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">Decisions based on ML models can have significant consequences, and managing the raw material—data—in ML systems is a challenge. 
This post explains DataOps, an area that focuses on the management and optimization of data throughout its lifecycle.</summary><category term="Artificial Intelligence Engineering"/><category term="Machine Learning"/></entry><entry><title>Evaluating LLMs for Text Summarization: An Introduction</title><link href="https://www.sei.cmu.edu/blog/evaluating-llms-for-text-summarization-introduction/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" rel="alternate"/><published>2025-04-07T00:00:00-04:00</published><updated>2025-04-07T00:00:00-04:00</updated><author><name>Shannon Gallagher, Swati Rallapalli, Tyler Brooks</name></author><id>https://www.sei.cmu.edu/blog/evaluating-llms-for-text-summarization-introduction/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">Deploying LLMs without human supervision and evaluation can lead to significant errors. This post outlines the fundamentals of LLM evaluation for text summarization in high-stakes applications.</summary><category term="Machine Learning"/></entry><entry><title>The Essential Role of AISIRT in Flaw and Vulnerability Management</title><link href="https://www.sei.cmu.edu/blog/the-essential-role-of-aisirt-in-flaw-and-vulnerability-management/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" rel="alternate"/><published>2025-03-26T00:00:00-04:00</published><updated>2025-03-26T00:00:00-04:00</updated><author><name>Lauren McIlvenny, Vijay Sarvepalli</name></author><id>https://www.sei.cmu.edu/blog/the-essential-role-of-aisirt-in-flaw-and-vulnerability-management/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">The SEI established the first Artificial Intelligence Security Incident Response Team (AISIRT) in 2023. 
This post discusses the role of AISIRT in coordinating the response to flaws and vulnerabilities in AI systems.</summary><category term="CERT/CC Vulnerabilities"/><category term="Cybersecurity"/><category term="AISIRT"/></entry><entry><title>Enhancing Machine Learning Assurance with Portend</title><link href="https://www.sei.cmu.edu/blog/enhancing-machine-learning-assurance-with-portend/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" rel="alternate"/><published>2025-03-24T00:00:00-04:00</published><updated>2025-03-24T00:00:00-04:00</updated><author><name>Jeffrey Hansen, Sebastián Echeverría, Lena Pons, Gabriel Moreno, Grace Lewis, Lihan Zhan</name></author><id>https://www.sei.cmu.edu/blog/enhancing-machine-learning-assurance-with-portend/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">This post introduces Portend, a new open source toolset that simulates data drift in machine learning models and identifies the proper metrics to detect drift in production environments.</summary><category term="Software Assurance"/><category term="Machine Learning"/></entry><entry><title>Introducing MLTE: A Systems Approach to Machine Learning Test and Evaluation</title><link href="https://www.sei.cmu.edu/blog/introducing-mlte-systems-approach-to-machine-learning-test-and-evaluation/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" rel="alternate"/><published>2025-02-17T00:00:00-05:00</published><updated>2025-02-17T00:00:00-05:00</updated><author><name>Alex Derr, Sebastián Echeverría, Katherine Maffey, Grace Lewis</name></author><id>https://www.sei.cmu.edu/blog/introducing-mlte-systems-approach-to-machine-learning-test-and-evaluation/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">Machine learning systems are notoriously difficult to test. 
This post introduces Machine Learning Test and Evaluation (MLTE), a new process and tool to mitigate this problem and create safer, more reliable systems.</summary><category term="Testing"/><category term="Machine Learning"/></entry><entry><title>The Myth of Machine Learning Non-Reproducibility and Randomness for Acquisitions and Testing, Evaluation, Verification, and Validation</title><link href="https://www.sei.cmu.edu/blog/the-myth-of-machine-learning-reproducibility-and-randomness-for-acquisitions-and-testing-evaluation-verification-and-validation/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" rel="alternate"/><published>2025-01-13T00:00:00-05:00</published><updated>2025-01-13T00:00:00-05:00</updated><author><name>Andrew Mellinger, Daniel Justice, Marissa Connor, Shannon Gallagher, Tyler Brooks</name></author><id>https://www.sei.cmu.edu/blog/the-myth-of-machine-learning-reproducibility-and-randomness-for-acquisitions-and-testing-evaluation-verification-and-validation/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">A reproducibility challenge faces machine learning (ML) systems today. 
This post explores configurations that increase reproducibility and offers recommendations for addressing these challenges.</summary><category term="Acquisition Transformation"/><category term="Testing"/><category term="Machine Learning"/><category term="Verification"/></entry><entry><title>Beyond Capable: Accuracy, Calibration, and Robustness in Large Language Models</title><link href="https://www.sei.cmu.edu/blog/beyond-capable-accuracy-calibration-and-robustness-in-large-language-models/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" rel="alternate"/><published>2024-12-03T00:00:00-05:00</published><updated>2024-12-03T00:00:00-05:00</updated><author><name>Matt Walsh, David Schulker, Shing-hon Lau</name></author><id>https://www.sei.cmu.edu/blog/beyond-capable-accuracy-calibration-and-robustness-in-large-language-models/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">For any organization seeking to responsibly harness the potential of large language models, we present a holistic approach to LLM evaluation that goes beyond accuracy.</summary></entry><entry><title>GenAI for Code Review of C++ and Java</title><link href="https://www.sei.cmu.edu/blog/genai-for-code-review-of-c-and-java/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" rel="alternate"/><published>2024-11-18T00:00:00-05:00</published><updated>2024-11-18T00:00:00-05:00</updated><author><name>David Schulker</name></author><id>https://www.sei.cmu.edu/blog/genai-for-code-review-of-c-and-java/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">Would ChatGPT-3.5 and ChatGPT-4o correctly identify errors in noncompliant code and correctly recognize compliant code as error-free?</summary></entry><entry><title>Introduction to MLOps: Bridging Machine Learning and Operations</title><link 
href="https://www.sei.cmu.edu/blog/introduction-to-mlops-bridging-machine-learning-and-operations/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" rel="alternate"/><published>2024-11-04T00:00:00-05:00</published><updated>2024-11-04T00:00:00-05:00</updated><author><name>Daniel DeCapria</name></author><id>https://www.sei.cmu.edu/blog/introduction-to-mlops-bridging-machine-learning-and-operations/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">Machine learning operations (MLOps) has emerged as a critical discipline in artificial intelligence and data science. This post introduces MLOps and its applications.</summary><category term="Artificial Intelligence Engineering"/><category term="Machine Learning"/><category term="Edge Computing"/></entry><entry><title>Measuring AI Accuracy with the AI Robustness (AIR) Tool</title><link href="https://www.sei.cmu.edu/blog/measuring-ai-accuracy-with-the-ai-robustness-air-tool/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" rel="alternate"/><published>2024-09-30T00:00:00-04:00</published><updated>2024-09-30T00:00:00-04:00</updated><author><name>Michael Konrad, Nicholas Testa, Linda Parker Gates, Crisanne Nolan, David Shepard, Julie Cohen, Andrew Mellinger, Suzanne Miller, Melissa Ludwick</name></author><id>https://www.sei.cmu.edu/blog/measuring-ai-accuracy-with-the-ai-robustness-air-tool/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">Understanding your artificial intelligence (AI) system’s predictions can be challenging. 
In this post, SEI researchers discuss a new tool to help improve AI classifier performance.</summary><category term="Machine Learning"/><category term="Artificial Intelligence"/></entry><entry><title>Weaknesses and Vulnerabilities in Modern AI: AI Risk, Cyber Risk, and Planning for Test and Evaluation</title><link href="https://www.sei.cmu.edu/blog/weaknesses-and-vulnerabilities-in-modern-ai-ai-risk-cyber-risk-and-planning-for-test-and-evaluation/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" rel="alternate"/><published>2024-08-12T00:00:00-04:00</published><updated>2024-08-12T00:00:00-04:00</updated><author><name>Bill Scherlis</name></author><id>https://www.sei.cmu.edu/blog/weaknesses-and-vulnerabilities-in-modern-ai-ai-risk-cyber-risk-and-planning-for-test-and-evaluation/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">Modern AI systems pose consequential, poorly understood risks. This blog post explores strategies for framing test and evaluation practices based on a holistic approach to AI risk.</summary></entry><entry><title>Weaknesses and Vulnerabilities in Modern AI: Integrity, Confidentiality, and Governance</title><link href="https://www.sei.cmu.edu/blog/weaknesses-and-vulnerabilities-in-modern-ai-integrity-confidentiality-and-governance/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" rel="alternate"/><published>2024-08-05T00:00:00-04:00</published><updated>2024-08-05T00:00:00-04:00</updated><author><name>Bill Scherlis</name></author><id>https://www.sei.cmu.edu/blog/weaknesses-and-vulnerabilities-in-modern-ai-integrity-confidentiality-and-governance/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">In the rush to develop AI, it is easy to overlook factors that increase risk. 
This post explores AI risk through the lens of confidentiality, governance, and integrity.</summary></entry><entry><title>Weaknesses and Vulnerabilities in Modern AI: Why Security and Safety Are so Challenging</title><link href="https://www.sei.cmu.edu/blog/weaknesses-and-vulnerabilities-in-modern-ai-why-security-and-safety-are-so-challenging/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" rel="alternate"/><published>2024-07-29T00:00:00-04:00</published><updated>2024-07-29T00:00:00-04:00</updated><author><name>Bill Scherlis</name></author><id>https://www.sei.cmu.edu/blog/weaknesses-and-vulnerabilities-in-modern-ai-why-security-and-safety-are-so-challenging/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">This post explores concepts of security and safety for neural-network-based AI, including ML and generative AI, as well as AI-specific challenges in developing safe and secure systems.</summary></entry><entry><title>Auditing Bias in Large Language Models</title><link href="https://www.sei.cmu.edu/blog/auditing-bias-in-large-language-models/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" rel="alternate"/><published>2024-07-22T00:00:00-04:00</published><updated>2024-07-22T00:00:00-04:00</updated><author><name>Katherine-Marie Robinson, Violet Turri</name></author><id>https://www.sei.cmu.edu/blog/auditing-bias-in-large-language-models/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">This post discusses recent research that uses a role-playing scenario to audit ChatGPT, an approach that opens new possibilities for revealing unwanted biases.</summary><category term="Artificial Intelligence"/></entry><entry><title>Cost-Effective AI Infrastructure: 5 Lessons Learned</title><link href="https://www.sei.cmu.edu/blog/cost-effective-ai-infrastructure-5-lessons-learned/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates" 
rel="alternate"/><published>2024-05-13T00:00:00-04:00</published><updated>2024-05-13T00:00:00-04:00</updated><author><name>William Nichols, Bryan Brown</name></author><id>https://www.sei.cmu.edu/blog/cost-effective-ai-infrastructure-5-lessons-learned/?utm_source=blog&amp;utm_medium=rss&amp;utm_campaign=my_site_updates</id><summary type="html">This post details the challenges and state of the art of cost-effective AI infrastructure, along with five lessons learned from standing up an LLM.</summary></entry></feed>