Finding vulnerabilities in code is part of the constant security game between attackers and defenders. An attacker only needs to find one opening to be successful, while a defender needs to search for and plug all or at least most of the holes in a system. Thus, a defender needs more effective tools than the attacker to come out ahead.
A common technique used to find vulnerabilities is fuzzing, where randomly generated inputs are sent to a system to expose mistakes in the code. However, blind fuzzing is not very efficient at finding issues: without some idea of where to search, most tests are wasted effort. The defender has an advantage here. With access to the source code and background information about the software under test, the defender can guide fuzzing with static analysis and symbolic execution of the code under investigation. Crashes and bugs can then be mapped back to the source code for easier fixing.
The Excite project at Intel uses a combination of symbolic execution, fuzzing, and concrete testing to achieve just that: finding vulnerabilities in sensitive code. By combining symbolic and concrete techniques, Excite achieves better performance and better results than either technique alone. Excite is a powerful tool for excavating BIOS security vulnerabilities in an automated fashion. It has been in development since 2015, and the project has been presented publicly at venues such as the USENIX Workshop on Offensive Technologies (WOOT) 2015 and ZeroNights 2016.
Excite combines dynamic selective symbolic execution with guided fuzzing for test case generation. It uses the Wind River* Simics* virtual platform to dump platform-dependent data and code, and to replay tests while checking for security issues and measuring coverage to guide the next set of tests. Excite operates at the intersection of three technologies: symbolic execution, fuzzing, and virtual platforms.
Target: System Management Mode
The current target for the Excite project is analysis of the System Management Interrupt (SMI) handlers in System Management Mode (SMM), as implemented in Unified Extensible Firmware Interface (UEFI) BIOS. Attacks on BIOS have increased in recent years, and Intel is stepping up BIOS security with coding guidelines, secure design guidelines, code reviews, and static code analysis. Excite is one more technology for securing the BIOS, by automatically generating tests for bug hunting.
SMM is the most privileged state of execution in an Intel processor (often described as “Ring -2”, where the OS is at Ring 0 and user applications at Ring 3), and as such it is a perfect place for a rootkit to hide. The operating system of the machine does not know when SMM is running, and cannot detect or prevent the execution of SMM code. Thus, securing SMM is critical to the security of the platform overall.
The code and data used by SMM are stored in System Management RAM (SMRAM). SMRAM is a part of the system RAM that is dedicated to SMM usage and protected by mechanisms in the processor. SMM is entered via SMIs, which are triggered by platform-specific events.
When an SMI happens, a Communications Buffer (comm buffer) is used to pass parameters from the outside. The comm buffer is stored in regular RAM and must be assumed to be potentially under the control of adversaries. Thus, SMI handlers must be very careful to check and validate the information in a comm buffer in order not to be misled into doing things that would help an attacker.
SMI handlers can access any memory in the machine, and as such, they have great power to cause trouble. A UEFI BIOS sets up tables that define the memory that SMI handlers should access, and the memory they should not.
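To make the validation idea concrete, here is a minimal Python sketch of the kind of range check a handler needs before trusting a buffer address that came from outside. It is an illustration only, not code from Excite or from any UEFI implementation: the `Region` class, the SMRAM and allowed-region addresses, and the `buffer_is_safe` helper are all made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Half-open address range [base, base + size)."""
    base: int
    size: int

    @property
    def end(self) -> int:
        return self.base + self.size

    def contains(self, base: int, size: int) -> bool:
        """True if [base, base + size) lies entirely inside this region."""
        return self.base <= base and base + size <= self.end

    def overlaps(self, base: int, size: int) -> bool:
        """True if [base, base + size) shares at least one byte with this region."""
        return base < self.end and self.base < base + size


# Hypothetical memory layout, chosen only for illustration.
SMRAM = Region(base=0x7F00_0000, size=0x0100_0000)
ALLOWED = [Region(base=0x7000_0000, size=0x0010_0000)]

ADDRESS_LIMIT = 1 << 64  # 64-bit address space


def buffer_is_safe(base: int, size: int) -> bool:
    """Accept a buffer only if it is non-empty, does not wrap around,
    stays clear of SMRAM, and fits entirely inside one allowed region."""
    if size <= 0 or base < 0 or base + size > ADDRESS_LIMIT:
        return False
    if SMRAM.overlaps(base, size):
        return False
    return any(region.contains(base, size) for region in ALLOWED)
```

Note that the check looks at the whole range, not just the start address: a buffer that begins in an allowed region but runs past its end is rejected as well.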
SMM is part of a UEFI BIOS, and as such, it is not a static component. Rather, during the boot, the BIOS will dynamically load SMM drivers and their associated SMI handlers into SMRAM. Once setup is complete, SMRAM is locked by setting lock bits in the processor.
Applying Excite to SMM
The current testing in Excite aims to catch two particularly nefarious types of issues in UEFI SMI handlers: calls outside of SMRAM and accesses outside the allowed memory regions. To do this, the Excite tool set is a combination of several tools and techniques, tied together into a flow that looks like this:
The UEFI BIOS build is standard; no special build flags or variants are needed to enable the use of Excite. Once the BIOS has been built, it is loaded into a Simics virtual platform and booted. Because Simics can simulate real Intel platforms (like the MinnowBoard), UEFI code for real platforms can be used.
Just after the SMM drivers have initialized but before SMRAM is locked, Simics dumps an image of SMRAM to be used in the symbolic execution. The advantage of this approach is that the dump contains the initialized state of the SMM modules, which eliminates the need to develop a complex model of SMM, as would be required if the C source code were used.
The next step is to generate test harnesses. In this stage, Excite scans SMRAM to find all the module registrations and all the handlers. For each handler, a test harness used to invoke the CRETE symbolic execution engine is created. A test harness maps the SMRAM into the application space where CRETE works. CRETE works on the binary directly – no knowledge of source is needed. The entire comm buffer used as input for the SMI is marked as symbolic, and serves as the starting point for the symbolic execution.
CRETE will explore the behavior of each SMI handler and generate test cases for each path it finds. It can easily provide tens of thousands of test cases for a single handler. Each generated test case is a concrete set of data for the comm buffer contents.
The generated test cases are then executed on Simics. As part of running the test cases, code coverage is collected, and any illegal memory accesses or calls are detected.
Symbolic Analysis and Test Generation
Symbolic execution is a powerful technique to systematically explore the paths (possibly all of them) through a software program. Instead of using concrete inputs, symbolic execution executes a program with symbolic inputs. During execution, a symbolic execution engine accumulates a set of constraints on the symbolic inputs. When it encounters a branch that depends on symbolic values, it forks two new sets of constraints: one in which the branch condition is true, and one in which it is false. Upon reaching the end of a program path, the engine sends the constraints to a constraint solver, which generates concrete inputs that will follow this computation path. The process continues until all paths are explored or a termination condition (e.g. timeout) set by the user is reached.
Excite uses CRETE as the symbolic execution engine. CRETE is an open-source project developed at Portland State University. The Excite harness calls the CRETE-provided primitives, such as crete_make_symbolic(var, size, name), to mark function inputs or specific memory regions for symbolic execution. CRETE then explores the memory snapshots from the entry points and generates a test case for each program path it explores. The test cases include concrete values for the inputs to the function.
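The fork-and-solve loop can be shown with a toy example in Python. This is a sketch of the general idea only and has nothing to do with how CRETE is implemented: the "program" is a hand-written tree of branches over a two-byte input, and the "constraint solver" is a brute-force search standing in for a real solver. All names (`PROGRAM`, `explore`, `solve`, `generate_tests`) are invented for this illustration.

```python
from itertools import product

# A toy "program" as nested branch nodes over a 2-byte input.
# A node is (description, predicate, true_subtree, false_subtree);
# a leaf is a plain string naming where the path ends.
PROGRAM = (
    "buf[0] == 0x42", lambda buf: buf[0] == 0x42,
    ("buf[1] > 0x80", lambda buf: buf[1] > 0x80,
     "leaf: deep path", "leaf: shallow path"),
    "leaf: rejected",
)


def explore(node, constraints=()):
    """Yield (leaf, constraints) for every path, forking at each branch."""
    if isinstance(node, str):
        yield node, constraints
        return
    text, predicate, if_true, if_false = node
    # Fork: one constraint set where the condition holds, one where it does not.
    yield from explore(if_true, constraints + ((text, predicate, True),))
    yield from explore(if_false, constraints + ((text, predicate, False),))


def solve(constraints, width=2):
    """Stand-in for a constraint solver: brute-force search over all inputs."""
    for candidate in product(range(256), repeat=width):
        if all(pred(candidate) == wanted for _, pred, wanted in constraints):
            return bytes(candidate)
    return None  # no input satisfies the constraints (infeasible path)


def generate_tests(program):
    """Return one concrete input per feasible path, keyed by the leaf reached."""
    tests = {}
    for leaf, constraints in explore(program):
        concrete = solve(constraints)
        if concrete is not None:
            tests[leaf] = concrete
    return tests
```

Calling `generate_tests(PROGRAM)` yields three inputs, one per path. A random fuzzer would have only a 1-in-256 chance per attempt of guessing the first byte needed to get past the outer check, which is exactly the kind of gate that symbolic execution gets through directly.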
The exploration in CRETE will sometimes reveal inputs that would cause the SMI handler to crash. These are marked for future explorations, along with any issues found in the execution on Simics.
How Excite Uses Simics Virtual Platforms
The use of a virtual platform is a prerequisite for the Excite flow. It would not be possible to do this using hardware, since we need the ability to inspect SMRAM at a precise point as well as the ability to jump to arbitrary locations in memory in order to run test cases. Hardware does not give us the tools we need.
As discussed already, Simics virtual platforms are used for three purposes in Excite:
- Running through the UEFI setup process to establish the contents of SMRAM, and then taking a Simics checkpoint to save the entire contents of the target machine memory, registers, and device state.
- Accessing the SMRAM memory once setup is complete, and providing a dump to Excite.
- Running the test cases.
Test cases are executed in Simics by loading the checkpoint saved after the boot to get back to the precise state that the UEFI was in after the boot. Then, the processor state and memory state are set up as specified in the test case. This includes copying the contents of the comm buffer as specified in the test case into memory, and setting up pointers and size values in registers (R8 and R9).
The instruction pointer (RIP) of the processor core is set to point at the code entry point to directly jump to the SMI handler. There is no need to issue an SMI to run the code: the SMI would just pass through a dispatcher and then end up doing the same thing as the test setup. The assumption is that the SMI dispatcher is reliable, and in this way, tests can be run in a way that is simpler to trace.
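As a rough picture of what this setup step involves, here is a small Python model. It does not use the Simics API; `MachineState` is a hypothetical stand-in for a restored checkpoint, and `prepare_test` and all addresses are invented for illustration. The exact register roles are also an assumption: the sketch puts the comm buffer pointer in R8 and a pointer to the buffer size in R9, which is what the usual UEFI handler signature and the Microsoft x64 calling convention would suggest, but the text above only says that pointers and size values go into R8 and R9.

```python
from dataclasses import dataclass, field


@dataclass
class MachineState:
    """Hypothetical stand-in for a restored checkpoint: sparse memory plus registers."""
    memory: dict = field(default_factory=dict)     # address -> byte value
    registers: dict = field(default_factory=dict)  # register name -> value

    def write(self, address: int, data: bytes) -> None:
        """Store data byte by byte starting at address."""
        for offset, value in enumerate(data):
            self.memory[address + offset] = value


def prepare_test(state, handler_entry, comm_buffer, buffer_addr, size_addr):
    """Place one test case into the machine and aim RIP at the handler."""
    # Copy the concrete comm buffer contents from the test case into memory.
    state.write(buffer_addr, comm_buffer)
    # Store the buffer size as a 64-bit little-endian value.
    state.write(size_addr, len(comm_buffer).to_bytes(8, "little"))
    state.registers["r8"] = buffer_addr  # assumed: pointer to the comm buffer
    state.registers["r9"] = size_addr    # assumed: pointer to the buffer size
    # Jump straight to the handler, bypassing the SMI dispatcher.
    state.registers["rip"] = handler_entry
    return state
```

Because every test starts from the same checkpoint, each run sees exactly the same initial state apart from the comm buffer, which is what makes failures reproducible.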
As the tests are run on Simics, a custom Simics module known as the Execution Tracer (exect) is active. Exect monitors the execution, looking for SMRAM call outs and accesses to illegal memory regions (such as UEFI boot services memory). Thus, exect will detect bad behavior as soon as it happens and provide a bug report to the UEFI developers.
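The checking logic can be pictured as a filter over the stream of calls and memory accesses the handler makes. The Python sketch below is a simplified model of that idea and not the exect module itself: the `Event` type, the address ranges, and `check_trace` are hypothetical, and a real tracer would hook into the simulator rather than consume a ready-made list.

```python
from typing import NamedTuple


class Event(NamedTuple):
    kind: str      # "call", "read", or "write"
    address: int   # call target or accessed address


# Hypothetical address ranges, chosen only for illustration.
SMRAM_RANGE = range(0x7F00_0000, 0x8000_0000)
FORBIDDEN = {"boot services memory": range(0x6000_0000, 0x6100_0000)}


def check_trace(events):
    """Return (event index, description, address) for every violation seen."""
    findings = []
    for index, event in enumerate(events):
        if event.kind == "call":
            # A call whose target lies outside SMRAM is an SMRAM call out.
            if event.address not in SMRAM_RANGE:
                findings.append((index, "call out of SMRAM", event.address))
        else:
            # Reads and writes are checked against the forbidden regions.
            for name, region in FORBIDDEN.items():
                if event.address in region:
                    findings.append((index, f"{event.kind} to {name}", event.address))
    return findings
```

Keeping the event index in each finding reflects the point made above: a violation is tied to the exact moment it happens, not discovered after the fact.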
In addition, exect is used to collect code coverage information. This process is non-intrusive, in that the target system can be observed without any change needed to the source code, as is required by some other approaches. By using code coverage, it is possible to measure how much of the SMI code is being tested and to see if more tests increase coverage.
To further increase coverage, fuzzing techniques similar to those used by the AFL fuzzer are applied to the test cases (AFL itself is not capable of handling UEFI code). The fuzzer permutes and mutates the inputs in the comm buffer of a test case, and then re-runs the test on Simics. Test cases that improve code coverage are kept, while others are discarded.
The reason that fuzzing techniques are able to find more tests to run is that the symbolic execution has some limitations and might not generate every possible test case. The symbolic execution also only operates on the comm buffer. There is other state involved that is not part of the comm buffer, and that state is not explored by the symbolic execution.
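The keep-if-coverage-improves loop is easy to sketch in Python. The code below is a toy in the spirit of coverage-guided fuzzers and is not taken from Excite or AFL: `run_handler` is a made-up target that reports which branches it executed, and `mutate` and `fuzz` are hypothetical helpers. The seed input plays the role of a test case produced by symbolic execution.

```python
import random


def run_handler(buf: bytes) -> set:
    """Toy target: return the set of branch IDs executed for this input."""
    covered = {"entry"}
    if buf[0] == 0x42:
        covered.add("magic ok")
        if buf[1] & 0x80:
            covered.add("flag set")
            if buf[2] == buf[3]:
                covered.add("fields match")
    return covered


def mutate(buf: bytes, rng) -> bytes:
    """Apply one random mutation: flip a single bit or replace a single byte."""
    data = bytearray(buf)
    pos = rng.randrange(len(data))
    if rng.random() < 0.5:
        data[pos] ^= 1 << rng.randrange(8)
    else:
        data[pos] = rng.randrange(256)
    return bytes(data)


def fuzz(seeds, rounds=2000, seed=0):
    """Mutate corpus entries, keeping only inputs that reach new branches."""
    rng = random.Random(seed)
    corpus = list(seeds)
    covered = set()
    for case in corpus:
        covered |= run_handler(case)
    for _ in range(rounds):
        candidate = mutate(rng.choice(corpus), rng)
        new_branches = run_handler(candidate) - covered
        if new_branches:
            corpus.append(candidate)   # keep: it improved coverage
            covered |= new_branches
        # otherwise the candidate is discarded
    return corpus, covered
```

Starting from a seed that already passes the first check, for example `fuzz([bytes([0x42, 0x00, 0x00, 0x01])])`, the loop only has to discover the remaining branches one small mutation at a time, which is where starting from symbolically generated inputs pays off.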
The picture below provides a simple example of how fuzzing and symbolic execution combine to create better test cases:
Code Coverage Results
If we look at how much of the SMI handler code is being tested, combining symbolic execution and fuzzing provides better coverage than either alone. It is noticeable just how much more code gets tested by guided testing compared to random black-box fuzzing.
How Issues Get Reported
When issues are found, they are concrete – since the actual UEFI code is run as part of the testing. Issues will materialize as errors at particular points in the code. Unlike purely static analysis approaches that can often generate fairly opaque errors, we get issue reports tied to a particular line of code in a particular concrete system state.
A full stack trace of the instructions that perform an illegal access or callout is provided to aid in debugging the issue. The assembly code from the SMRAM dump binary file is mapped back to the corresponding C source with the ‘dbh’ tool from the Windows Driver Kit (WDK), using the symbols from the build process (i.e. .pdb files).
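The core of such a mapping is a lookup from an instruction address to the nearest preceding symbol. The Python sketch below shows that lookup with a binary search. It is not how dbh works internally and it does not read .pdb files: the symbol table, function names, and file names are invented for illustration, and a real tool would resolve down to individual source lines rather than stopping at the function.

```python
from bisect import bisect_right

# Hypothetical symbol table: (start address, function name, source file).
SYMBOLS = sorted([
    (0x7F00_1000, "SmiHandlerEntry", "Handler.c"),
    (0x7F00_1400, "ValidateCommBuffer", "Handler.c"),
    (0x7F00_2000, "CopyRequest", "Copy.c"),
])
STARTS = [start for start, _, _ in SYMBOLS]


def symbolize(address):
    """Return 'function+offset (file)' for the closest symbol at or below address."""
    index = bisect_right(STARTS, address) - 1
    if index < 0:
        return f"0x{address:x} (no symbol)"
    start, name, source = SYMBOLS[index]
    return f"{name}+0x{address - start:x} ({source})"


def symbolize_stack(addresses):
    """Translate a list of raw return addresses into readable frames."""
    return [symbolize(address) for address in addresses]
```

This simplified table has no symbol sizes, so any address above the last symbol is attributed to it; real debug information also records where each function ends.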
With source-code support, the report can point directly at the code where an issue hits, along with the call stack:
Debugging a problem like this in a virtual platform is much easier than doing it on hardware. The virtual platform is not itself restricted by mechanisms like locking RAM: from the virtual platform, you can look into the system without the system being aware. It also means that any malware that happens to be on the system will not see a debugger being used (with some caveats), and thus will be easier to investigate. Techniques like replaying executions mean that any observed behavior can be repeated reliably, on any host machine. For more ideas on how virtual platforms can be useful in a cyber security setting, see this Wind River whitepaper.
Optimizing the Execution Time with Parallel Testing
Given the large volume of tests, parallel test execution is used to shorten the overall time needed to run through all tests. With 20,000 test cases for a particular handler, the total time to generate the tests, run them on Simics, and do fuzzing adds up to more than 10 hours per handler. With 10 SMI handlers to test, total serial test time would be 100 hours, or about four days. However, each SMI handler can be analyzed in parallel, and many tests for each handler can be run in parallel too. Using parallel execution, the total test time can be shrunk to 4 hours, an improvement by a factor of about 25!
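Since every test case starts from the same checkpoint and does not depend on any other test, both levels of parallelism collapse into one flat list of independent jobs. The Python sketch below illustrates that structure with a thread pool. It is only a model: `run_test_case` is a hypothetical stand-in for replaying a test on a simulator instance, with a toy pass/fail rule, and a real setup would more likely spread the work over separate simulator processes or machines than over threads.

```python
from concurrent.futures import ThreadPoolExecutor


def run_test_case(job):
    """Hypothetical stand-in for replaying one test case on a simulator instance."""
    handler, test_case = job
    passed = test_case[0] != 0xFF  # toy pass/fail rule, for illustration only
    return handler, passed


def run_all(test_cases_by_handler, workers=8):
    """Run every test case of every handler in parallel; count failures per handler."""
    # Flatten handlers and their test cases into one list of independent jobs.
    jobs = [(handler, case)
            for handler, cases in test_cases_by_handler.items()
            for case in cases]
    failures = {handler: 0 for handler in test_cases_by_handler}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for handler, passed in pool.map(run_test_case, jobs):
            if not passed:
                failures[handler] += 1
    return failures
```

For example, `run_all({"HandlerA": [b"\x00", b"\xff"], "HandlerB": [b"\x01"]})` reports one failure for the first handler and none for the second, regardless of the order in which the workers finish.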
The picture below shows the increasing amount of parallelism available as we move through the process:
The Excite project is, pardon the pun, exciting. By combining a set of different tools from different categories into an integrated workflow we get something that none of the tools could achieve on their own. It takes advantage of the unique abilities of virtual platforms to flow data into symbolic execution and generate tests, and then back to the virtual platform to run the tests and investigate their behavior. By adding on fuzzing, the tested space is expanded.
Excite finds tricky bugs in code that is crucial to platform security, which strengthens the defensive side of cyber security.
Thanks to Lee Rosenbaum and Zhenkun Yang for helping me write this blog post, and kudos to the whole Excite team for having built a very cool piece of technology!