System Software Debug with JTAG/XDP and Event Trace

The complexity of System-On-Chip (SoC) based designs used in Intelligent Systems is growing fast: platforms comprise multiple different cores, and the software stack interacts across these cores. This translates directly into a more complex software stack, which brings its own challenges for guaranteeing consistent reliability throughout the device’s life cycle. It is therefore ever more important to have system and application debug solutions that provide deep hardware and platform insight as well as visibility into the entire system software stack. From EFI-based firmware, boot loaders, and the OS kernel all the way to device drivers and applications, the debug solutions of the Intel® System Studio provide this level of coverage. This includes full source-level debug capabilities combined with insight into the target device register set and hardware status.

Interactions between the different software components are often timing-sensitive. When debugging a code base with many interactions between components, single-stepping through one specific component is usually not conclusive for identifying an issue. Traditional printf debugging is also not effective in this context, because the debugging changes can adversely affect timing behavior and cause even worse problems (also known as “Heisenbugs”). This article describes how the Intel® System Debugger, in conjunction with low-overhead, instrumentation-based event tracing, can be used to identify and resolve the most vexing runtime issues, whether they are deterministic or not.

Key features include:

  • Full Eclipse* Rich Client Platform (RCP) host environment for Linux* and Windows*
  • Joint Test Action Group (JTAG) IEEE 1149.1 debugging with visualization of device registers and memory allocation
  • OS-, firmware-, and driver-aware debugging
  • Debugger-integrated flashing of binary and hex images to NOR and NAND memory (Fig. 1)
  • Sophisticated instruction tracing and event tracking
  • Low-latency instrumentation that can identify non-deterministic issues, even within production code

 

Fig. 1: Debugger integrated flashing of NAND and NOR memory on Intel® Atom™ Processor CE5300

The Intel® System Studio as a whole supports a wide variety of Linux* OS hosts. For system software debug with the Intel® System Debugger, Microsoft* Windows* hosts are also supported. Note, however, that the Intel® System Debugger included in the Intel® System Studio currently supports only Intel® Atom™ processors. For the purposes of this article, we assume a host system based on a standard Linux* distribution and a target system running Wind River* Linux* or Yocto Project* on an Intel Atom processor. Every layer of such an embedded software stack requires a slightly different approach to debugging. Let us look at each software layer, its typical challenges, and suitable debug approaches, from the firmware all the way up to the application layer.

Firmware, Bootloaders, and the OS


Fig. 2: Intel® System Debugger

For early hardware configuration and board bring-up, Intel® System Studio supports JTAG debugging through an Eclipse* RCP based user interface. This interface offers unique features like:

  • A bitfield editor for the registers that manage device control, processor status, and software status. This editor explains the function of each bit.
  • Page table visualization, which shows how virtual addresses map to physical memory.
  • Access to the Global Descriptor Table (GDT), which describes the executability and writability of memory segments, and the Local Descriptor Table (LDT), which reserves memory segments for specific programs.
  • OS memory management and configuration awareness, to identify faulty data allocation.
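To make the page table view more concrete, the sketch below splits a virtual address into the indices a page walk would use. It assumes classic 32-bit x86 paging with 4 KiB pages (10-bit page directory index, 10-bit page table index, 12-bit offset); PAE and 64-bit modes use different splits. The function name and the sample address are illustrative and not part of any Intel tool.

```python
def split_virtual_address(vaddr: int) -> dict:
    """Split a 32-bit virtual address for non-PAE x86 paging with 4 KiB pages."""
    if not 0 <= vaddr <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit address")
    return {
        "pde_index": (vaddr >> 22) & 0x3FF,  # selects the page directory entry
        "pte_index": (vaddr >> 12) & 0x3FF,  # selects the page table entry
        "offset": vaddr & 0xFFF,             # byte offset inside the 4 KiB page
    }


# Sample address chosen only for illustration.
parts = split_virtual_address(0xC0101234)
print({name: hex(value) for name, value in parts.items()})
```

A debugger's page table view performs this walk against the real tables in target memory; the sketch only shows which bits of the address select which table entry.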


Fig. 3: Intel® System Debugger Memory Layout Awareness

In addition to enabling low-level debugging, the JTAG capabilities are invaluable for high-level troubleshooting. For example, the ability to inspect the GDT and LDT, combined with page table visualization, makes it easy to identify the nature of a failed or incorrect memory access, such as a stack overflow. Indeed, the tool’s combination of in-depth hardware and software awareness is key to resolving many issues.
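As a rough idea of what a descriptor table view decodes, the following sketch unpacks one 8-byte x86 code or data segment descriptor into its base, limit, and access rights. It is a simplified illustration, not the debugger's implementation: it ignores system descriptors and expand-down segments, and the function and key names are hypothetical.

```python
def decode_segment_descriptor(raw: int) -> dict:
    """Decode a 64-bit x86 code/data segment descriptor (simplified)."""
    limit = (raw & 0xFFFF) | (((raw >> 48) & 0xF) << 16)
    base = ((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24)
    access = (raw >> 40) & 0xFF
    granular = bool((raw >> 55) & 1)   # G bit: limit counted in 4 KiB units
    executable = bool(access & 0x08)   # code segment if set, data otherwise
    rw = bool(access & 0x02)           # readable (code) or writable (data)
    return {
        "base": base,
        # Highest valid offset within the segment.
        "limit": ((limit << 12) | 0xFFF) if granular else limit,
        "present": bool(access & 0x80),
        "dpl": (access >> 5) & 0x3,
        "executable": executable,
        "writable": rw and not executable,
        "readable": rw or not executable,
    }


# A commonly used flat 4 GiB ring-0 code segment descriptor.
print(decode_segment_descriptor(0x00CF9A000000FFFF))
```

Seeing, for instance, that a faulting write targeted a segment whose descriptor decodes as not writable narrows down the nature of the failed access quickly.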

Figure 4 shows how the same bitfield editor view and detailed insight used for memory configuration also apply to the Interrupt Descriptor Table (IDT), which is very useful for identifying the root cause of a particular OS signal.

Fig. 4: Intel® System Debugger Bitfield Editor and Interrupt Descriptor Table View

Intel System Studio supports JTAG debugging with probes including the Macraigor Systems* usb2Demon* device and the Intel® ITP-XDP3 JTAG probe. The usb2Demon* device is a good choice for developers seeking a low-cost yet comprehensive debug solution for systems based on Intel® Atom™ processors. It can serve them from board test and bring-up through initialization, all the way to application debug and production line test.

Unified Extensible Firmware Interface (UEFI)

Developers often face particular difficulty debugging the Unified Extensible Firmware Interface (UEFI). The UEFI is the interface between the OS and firmware; in essence, it is a modern version of the Basic Input/Output System (BIOS).

The UEFI environment uses relocatable code modules, and the addresses of these modules are usually not known to the end users. Intel System Studio solves this problem with symbol-aware debugging that identifies the location of code modules and allows UEFI debugging immediately from reset. It provides two general methods for locating code modules in memory:

  • List all modules known by the UEFI runtime and allow the user to load symbols for a specific module.
  • Identify the module located at a certain memory address (e.g., at the instruction pointer).
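The second method amounts to a range lookup over the list of loaded modules. The sketch below shows the idea under stated assumptions: the module names, base addresses, and sizes are made up, and the `Module` and `find_module` names are hypothetical rather than part of the debugger or the UEFI specification.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional


class Module(NamedTuple):
    name: str
    base: int   # load address of the relocated image
    size: int   # image size in bytes


def find_module(modules: List[Module], address: int) -> Optional[Module]:
    """Return the module whose [base, base + size) range contains address."""
    ordered = sorted(modules, key=lambda m: m.base)
    bases = [m.base for m in ordered]
    i = bisect_right(bases, address) - 1   # last module starting at or below address
    if i >= 0 and address < ordered[i].base + ordered[i].size:
        return ordered[i]
    return None


# Made-up module list for illustration only.
loaded = [
    Module("ModuleA", 0x7F000000, 0x20000),
    Module("ModuleB", 0x7F040000, 0x8000),
]
hit = find_module(loaded, 0x7F041234)
print(hit.name if hit else "no module at this address")
```

Once the containing module is known, its symbols can be loaded at the module's base address, which is what makes source-level debugging of relocated code possible.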

Device Drivers

Device drivers are another major debugging challenge because they are often timing-sensitive, so adding instrumentation to driver code can change its behavior. To address this challenge, Intel System Studio provides an OS- and driver-aware kernel module that is loaded on the target device for instrumentation-free debugging. At device driver load time, this kernel module exports the memory location of the driver’s initialization and destruction methods to the host via the JTAG interface (Fig. 5). It is thus possible to load the symbol information for the device driver, step into it, and debug its execution flow without modifying or instrumenting any of its code. This avoids the risk of changing the timing behavior of the driver code.
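Load addresses matter because a dynamically loaded driver is relocated at load time, so a symbol's runtime address is its section's load base plus its offset within that section. The sketch below illustrates only this arithmetic; the section bases, symbol names, and offsets are invented, and how the Intel kernel module actually communicates addresses to the host is not shown here.

```python
def relocate_symbols(section_bases: dict, symbols: dict) -> dict:
    """Map each symbol to its runtime address: section load base + offset."""
    return {
        name: section_bases[section] + offset
        for name, (section, offset) in symbols.items()
    }


# Hypothetical driver: all addresses and symbol names are made up.
section_bases = {".init.text": 0xF8A2C000, ".text": 0xF8A30000, ".exit.text": 0xF8A31000}
symbols = {
    "demo_init": (".init.text", 0x0),
    "demo_read": (".text", 0x40),
    "demo_exit": (".exit.text", 0x0),
}
runtime = relocate_symbols(section_bases, symbols)
for name, address in sorted(runtime.items(), key=lambda item: item[1]):
    print(f"{name}: {address:#010x}")
```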


Fig. 5: Dynamically Loaded Kernel Module Debug and OS Awareness

In addition, the bitfield editor can access public device registers, permitting monitoring of device configuration register entries during device driver debug.

Instruction Tracing

The ability to track errors back to their source is the essence of debugging. Intel System Studio provides advanced instruction tracing to unroll execution flow and identify the root causes of runtime issues. Specifically, the tool inspects the Last Branch Records (LBR) and disassembles the code to recreate program flow. It then pairs the assembly instructions with the associated source code (obtained from the DWARF debug information in the ELF executable in the case of embedded Linux*), and displays the resulting trace in the debugger GUI.
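The core idea of recreating program flow from branch records can be sketched as follows: each record holds a branch source and target, so the code executed between two consecutive branches runs in a straight line from one record's target to the next record's source. This is a minimal sketch with made-up addresses and a hypothetical function name; a real tool would additionally disassemble each run and map it to source lines.

```python
from typing import List, Tuple

BranchRecord = Tuple[int, int]   # (from_ip, to_ip)


def straight_line_runs(branches: List[BranchRecord]) -> List[Tuple[int, int]]:
    """Return (start, end) address runs executed between consecutive branches.

    `branches` must be ordered oldest first. Each run starts at the target of
    one branch and ends at the source of the next branch.
    """
    runs = []
    for (_, target), (next_source, _) in zip(branches, branches[1:]):
        runs.append((target, next_source))
    return runs


# Made-up branch records, oldest first.
records = [(0x1010, 0x2000), (0x2008, 0x3000), (0x3004, 0x1014)]
for start, end in straight_line_runs(records):
    print(f"executed {start:#x} .. {end:#x}")
```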


Fig. 6: Last Branch Record Instruction Trace

 

Tracing does not impede real-time performance, and is therefore a powerful tool for tracking deterministic and repeatable errors such as a stack overflow or a segmentation fault. It can be used as follows:

  1. Set breakpoint in OS signal event handler (e.g., break on segmentation fault)
  2. Unroll execution flow leading up to event
  3. Follow execution backwards to where it deviated from expectation
  4. Rerun to that point and analyze memory accesses

Non-Deterministic and Hard-to-Replicate Issues

Modern intelligent systems are increasingly timing-sensitive, especially when they are heavily threaded or rely on message- and data-passing events between software modules. These systems often encounter non-deterministic issues that are difficult to reproduce.

Debugging these systems can be tricky. Debugging code can impact the timing of the software stack, altering application behavior and making issues disappear during a debug session. (These are the so-called Heisenbugs.) The problem is particularly severe for issues that only appear when the device is deployed in the field, where the device may be inaccessible.

The solution to this is the Software Visible Event Nexus (SVEN) Software Development Kit (SDK) Technology Preview. SVEN relies on static code instrumentation and a small DRAM buffer with less than 5 µs of timing overhead, minimizing opportunities for Heisenbugs (Fig. 7). SVEN enables developers to identify timing-dependent runtime issues that defy traditional methods. The instrumentation can stay in production code and only impacts execution when logging is active. Thus, SVEN is well-suited for offline debug of applications deployed in hard-to-access locations.
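The general pattern described here, static instrumentation points that write small timestamped records into a fixed-size buffer and do nothing when logging is off, can be sketched as below. This is not the SVEN API: the class, method, and field names are hypothetical, and a real implementation would be native code writing to a DRAM buffer rather than Python.

```python
import time


class EventRing:
    """Fixed-size ring buffer of timestamped events (illustrative only)."""

    def __init__(self, capacity: int):
        self._slots = [None] * capacity
        self._next = 0          # total number of events written so far
        self.active = False     # instrumentation is a no-op until enabled

    def log(self, module_id: int, event_id: int, payload: int = 0) -> None:
        if not self.active:     # cheap early exit keeps overhead minimal
            return
        record = (time.monotonic_ns(), module_id, event_id, payload)
        self._slots[self._next % len(self._slots)] = record
        self._next += 1         # oldest records are overwritten when full

    def snapshot(self) -> list:
        """Return the retained events, oldest first."""
        n = len(self._slots)
        start = max(0, self._next - n)
        return [self._slots[i % n] for i in range(start, self._next)]


ring = EventRing(capacity=3)
ring.log(1, 1)                  # ignored: logging is not active yet
ring.active = True
for i in range(5):
    ring.log(module_id=1, event_id=2, payload=i)
print([event[3] for event in ring.snapshot()])
```

Because the buffer has a fixed size and each write is a constant-time store, the cost per instrumentation point stays small and predictable, which is what keeps the instrumentation from disturbing timing.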


Fig. 7: SVEN SDK architecture

 

Originally developed for Intel® Atom™ processor CExxxx based platforms, SVEN is a field-proven technology that is now available across Intel® architecture. The tool is designed for today’s complex systems, and can trace asynchronous message and data event propagation throughout a chipset. SVEN is highly configurable, and can be used to instrument system software as well as applications. All that is needed is a reliable clock signal to correlate event timing (Fig. 8).


Fig. 8: SVEN Trace Viewer

The Intel® System Studio provides the SVEN framework in the form of an open source SDK, together with a graphical trace viewer for easy navigation of events and timing and quick identification of irregularities. Furthermore, JTAG support introduces data breakpoints that allow system-level debugging: the debugger can trigger on any SVEN event, so that developers can step and debug starting from a suspicious event.
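One simple way to spot timing irregularities in a recorded trace is to compare each interval between occurrences of a periodic event against the typical interval. The sketch below flags intervals that exceed a multiple of the median; the threshold factor, function name, and timestamps are assumptions made for illustration and do not describe how the SVEN trace viewer works.

```python
from statistics import median
from typing import List, Tuple


def irregular_intervals(timestamps: List[int], factor: float = 3.0) -> List[Tuple[int, int, int]]:
    """Return (start, end, delta) for intervals longer than factor * median."""
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if not deltas:
        return []
    typical = median(deltas)
    return [
        (timestamps[i], timestamps[i + 1], delta)
        for i, delta in enumerate(deltas)
        if delta > factor * typical
    ]


# Made-up timestamps (in microseconds) of one periodic event.
trace = [0, 100, 200, 300, 1000, 1100]
for start, end, delta in irregular_intervals(trace):
    print(f"gap of {delta} us between t={start} and t={end}")
```

An event flagged this way is a natural candidate for a JTAG trigger, so that execution can be halted and inspected the next time the same event occurs.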


Fig. 9: SVEN Event JTAG Debug Triggers

Summary

The Intel® System Studio, with its Intel® System Debugger and the SVEN SDK, gives you the tools needed to identify and resolve even the most vexing and hard-to-track-down runtime issues, especially on complex software stacks spanning multiple System-On-Chip IP blocks.

The attached documents provide additional insight into the usage and feature sets of these powerful debug tools.
