Collaboration Conversation on Safety Analysis and Fault Detection Isolation and Recovery Synthesis (SAFIR)

The operational complexity of cyber-physical systems (CPS) forces new autonomous features into day-to-day systems, such as vehicles and factories, a phenomenon termed Increasingly Autonomous CPS (IA-CPS) [Alves 2018]. IA-CPS have a complex architecture that weaves together hardware, software, AI-enabled functions or decision-making processes, and human operators. They are time sensitive and substitute human actions with high-frequency real-time algorithms. In such systems, combinations of faults and their timed propagation can cause fatal incidents, such as those involving autonomous cars. In these particular cases, the safety mechanisms were either too inefficient to prevent a fault or actually caused the incident.

This situation raises concerns for future DoD programs: these systems must not only detect failures and recover once, but also reconfigure themselves multiple times, without human intervention, as they adapt to different situations.

The DoD’s AI vision requires advances in safety analysis and in fault detection, isolation, and recovery (FDIR) synthesis, together referred to as SAFIR, to (1) model and analyze dynamic reconfiguration and fault propagation due to fault sequences, and (2) enforce safe reconfiguration. For these two concerns, SAFIR will improve architecture-led safety assessment processes by delivering new tool-supported analysis and code generation capabilities to designers.

More specifically, we will

improve the state of practice in safety engineering in a model-based software engineering (MBSE) context by considering the timed propagation of failures in architectural descriptions based on the Architecture Analysis & Design Language (AADL) and by improving AADL reconfiguration mechanisms to align with Dynamic Fault Tree (DFT) operators, and deliver an implementation of these operators
apply DFT analysis to evaluate the effectiveness of existing FDIR policies, synthesize FDIR policies by processing DFT simulation traces, and enrich architectural descriptions with specific fault detection and reconfiguration mechanisms
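
To illustrate why fault sequences matter in DFT analysis, the following minimal Python sketch evaluates a small dynamic fault tree over timed failure events. It is not SAFIR's implementation and does not use AADL. The function names, the example system (a primary unit, a backup unit, and a switch that activates the backup), and the failure times are all hypothetical. Each gate maps the failure times of its inputs to its own failure time. The priority-AND gate is the operator that depends on ordering, and treating simultaneous failures as "in order" is an assumption of this sketch.

```python
import math

NEVER = math.inf  # failure time of an event that never occurs


def or_gate(*times):
    """Static OR: the gate fails as soon as any input fails."""
    return min(times)


def and_gate(*times):
    """Static AND: the gate fails once every input has failed."""
    return max(times)


def pand_gate(*times):
    """Priority-AND: the gate fails only if all inputs fail in
    left-to-right order (ties count as in order, by assumption)."""
    in_order = all(a <= b for a, b in zip(times, times[1:]))
    return max(times) if in_order else NEVER


def system_failure_time(primary, switch, backup):
    """Hypothetical system: a switch hands over from primary to backup.

    - If the switch fails before the primary, the handover is impossible,
      so the system fails when the primary fails (priority-AND).
    - Otherwise the system fails once both primary and backup have failed.
    """
    no_handover = pand_gate(switch, primary)
    both_units_lost = and_gate(primary, backup)
    return or_gate(no_handover, both_units_lost)


# Same three faults, different order of occurrence:
print(system_failure_time(primary=10, switch=4, backup=30))   # 10
print(system_failure_time(primary=10, switch=20, backup=30))  # 30
```

In the first trace the switch fails before the primary, so the system is lost at time 10. In the second trace the same faults occur in a different order and the backup keeps the system alive until time 30. A static fault tree, which ignores ordering, cannot distinguish these two traces.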

SAFIR addresses safety analysis of time-sensitive CPS in both its theoretical and practical dimensions, and it contributes to the SEI's line of research on artificial intelligence and autonomy. At the end of the first year, SAFIR has established the theoretical foundation to perform safety evaluations in the context of time-dependent failure conditions.

In Context

This FY2021-23 Project

builds on SEI expertise in MBSE, safety analysis, and AADL, and extends past contributions from Integrated Safety and Security Engineering (ISSE) and TwinOps
aligns with the CMU SEI technical objective to bring capabilities through software that make new missions possible or improve the likelihood of success of existing ones and to be trustworthy in construction and implementations
also aligns with the CMU SEI technical objective to be resilient in the face of operational uncertainties, including known and yet-unseen adversary capabilities

Mentioned in this Article

[Alves 2018]

Alves, E. E.; Bhatt, D.; Hall, B.; Driscoll, K.; Murugesan, A.; & Rushby, J. Considerations in Assuring Safety of Increasingly Autonomous Systems. Technical Report NASA/CR-2018-220080, NF1676L-30426, NASA Air Transportation and Safety. 2018.

Research Review 2021