Automated Code Repair

Vast Amounts of Code Have Many Security Vulnerabilities

CERT Division Source Code Analysis Laboratory (SCALe) reviews of software from the U.S. Department of Defense (DoD) and other sources show that most software contains many vulnerabilities. Most security flaws are caused by simple coding errors. Static analysis tools, typically used late in the development process, produce a huge number of diagnostics. Even after excluding false positives, the volume of true positives can overwhelm the abilities of development teams to fix the code. Consequently, the team eliminates only a small percentage of the vulnerabilities. Meanwhile, the existing installed codebases in the DoD now consist of billions of lines of C code that contain an unknown number of security vulnerabilities.

Most analyzers provide basic diagnostics but do not provide automated fixes or code modifications. Integrated development environments (IDEs), such as Eclipse, offer some automated code modification. Some IDEs fix code that has specific compilation errors, such as Quick Fixes in Eclipse. While IDEs provide some refactoring options, they are not intended to change the behavior of the code; instead they improve some aspect of the design.

Existing techniques for addressing security problems in code often require programmers to add more information—such as annotations and attributes—that can then be post-processed. These techniques are effective when developing new code, but they have the same practical limitations that manually address thousands of diagnostics in existing programs. We need a better way to fix existing code.

Our Solution: Automated Tools Look for Vulnerabilities and Fix Them

Our experience examining code shows that many security-relevant bugs follow common patterns that tools can automatically detect. There are corresponding patterns for repairing these bugs that tools can perform using automatic program transformation. We are developing automated source-code transformation tools to remediate vulnerabilities in code that are caused by violations of rules in the CERT Secure Coding Standards.

These tools convert noncompliant code into code that complies with the CERT standards. They reduce vulnerabilities without the need for developers to manually review thousands of diagnostics produced by static analysis tools. Sometimes our tools repair a bug completely automatically. In other cases, it prompts developers for more information when a little manual intervention can result in an effective repair.

We based our automated repair work on three premises:

Many security bugs follow common patterns.
By recognizing a pattern, a tool can make a reasonable guess about the developer's intention. We call this the inferred specification.
A tool can repair the code to satisfy the inferred specification.

For example, malloc is a function that allocates a chunk of memory and returns a pointer to it. One common pattern of security bugs is a memory allocation such as “p = malloc(n * sizeof(T)),” where n is attacker-controlled. If n is too large, integer overflow occurs, and too little memory gets allocated, setting the stage for a buffer overflow. The inferred specification in the malloc case would be “Try to allocate enough memory to hold n objects of type T.” The tool inserts code to check whether overflow occurs and to simulate malloc returning NULL due to insufficient memory if overflow does occur.

To develop our automated code repair tool, we extended Rose, a framework for source code transformation. Our goal is to reduce the number of rule violations that require manual inspection by two orders of magnitude—from thousands to tens. At this scope, a development team can mitigate all unhandled violations. Automated code repair reduces a system’s attack surface and improves its ability to withstand cyber attacks while sustaining critical functions.

Software and Tools

SCALe Collection

August 2018

The CERT Division's Source Code Analysis Laboratory (SCALe) offers conformance testing of C and Java language software systems against the CERT C Secure Coding Standard and the CERT Oracle Secure Coding Standard for...

read

Source Code Analysis Laboratory (SCALe)

March 2012

In this report, the authors describe the CERT Program's Source Code Analysis Laboratory (SCALe), a conformance test against secure coding...

read

Software Engineering Institute