Simics® 6: A Deeper Look at How Software Uses Hardware

The new Simics* 6 major release includes many changes to the simulation core and simulator APIs, along with new testing and analysis tools built on top of the new APIs. One such tool is for device register coverage, which makes it easy for developers to investigate and quantify how software uses the programming registers of hardware devices.

Looking at How Software Uses Hardware

Device register coverage information is useful for several different types of users and use cases. Hardware designers can determine which features and capabilities of the hardware are actually being used by software. That can guide future designs and identify room for software optimizations. Design validation engineers can measure and improve the effectiveness of hardware tests. Driver developers can learn how their software really accesses the hardware, just as code coverage reveals which parts of the code are being run. Developers of hardware self-test tools and safety software can measure how well their code actually checks the hardware.

A Quick Test Setup

As an illustration of what the device register coverage tool can discover, I booted four different software stacks on the Simics Quick-Start Platform (QSP). The QSP is a simple model of a standard PC, and is part of the Simics base product.

Screenshot showing the graphics console of the Simics target machine with the QSP UEFI splash screen with the Windows 10 spinner superimposed

Figure 1. Windows 10 booting on the Simics QSP

The following software stacks were used in the experiment:

  • Unified Extensible Firmware Interface (UEFI) built-in shell. Note that the activation of the shell is not part of the normal boot flow into an operating system (OS), and thus it counts as an OS stack in its own right for this experiment.
  • Clear Linux* build 28910, which is the current default example Linux shipping on the QSP with Simics (see my blog, Using Clear Linux* for Teaching Virtual Platforms). It uses Linux kernel version 5.0.
  • Ubuntu* 18.04, using Linux kernel version 4.18. The older kernel could lead to some differences in hardware usage compared to Clear Linux.
  • Microsoft* Windows* 10 (seen booting in Figure 1).

Each of the software stacks was run for two virtual minutes, enough to get through UEFI and the OS boot. Device register coverage was collected for all devices and all register banks in the system. Collecting this type of information does not slow down the simulation much, since register bank instrumentation is efficient, and hardware accesses are (usually) rare compared to basic instruction execution.

High-Level Results

The coverage results are shown in Figure 2, as a percentage of registers accessed in each device. This combines both read and write accesses, as well as all register banks in a single device. For PCI-based devices, this means that the PCI configuration space is combined with the other banks mapped using PCI Base Address Registers (BARs). Note that there are only 44 devices in this relatively simple, small system. Most modern platform models built in Simics have orders of magnitude more devices, making the QSP a good system to use for illustration purposes.
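The metric described above can be sketched in a few lines of Python. This is a minimal illustration, not the Simics implementation: the function name `device_coverage`, the tuple layout, and the device, bank, and register names are all assumptions made up for the example.

```python
def device_coverage(registers, accesses):
    """Return {device: percent of its registers accessed at least once}.

    registers: iterable of (device, bank, register) for every modeled register.
    accesses:  iterable of (device, bank, register, kind), kind is "read" or "write".
    """
    all_regs = {}
    for device, bank, reg in registers:
        all_regs.setdefault(device, set()).add((bank, reg))

    touched = {device: set() for device in all_regs}
    for device, bank, reg, _kind in accesses:
        # Reads and writes both count, and all banks of a device are merged.
        if (bank, reg) in all_regs.get(device, ()):
            touched[device].add((bank, reg))

    return {
        device: 100.0 * len(touched[device]) / len(regs)
        for device, regs in all_regs.items()
    }


# Hypothetical register map and access log, for illustration only.
registers = [
    ("nic", "pci_config", "vendor_id"),
    ("nic", "pci_config", "command"),
    ("nic", "bar0", "ctrl"),
    ("nic", "bar0", "status"),
    ("sata1", "bar0", "cap"),
    ("sata1", "bar0", "ghc"),
]
accesses = [
    ("nic", "pci_config", "vendor_id", "read"),
    ("nic", "bar0", "ctrl", "write"),
    ("nic", "bar0", "ctrl", "read"),
]
coverage = device_coverage(registers, accesses)
print(coverage)
```

In this made-up example the PCI device has two of its four registers touched across its configuration space and its BAR-mapped bank, while the unused second controller stays at zero, mirroring the "never accessed" devices in the real data.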

Bar graph showing the device register coverage for all 44 devices in the system

Figure 2. Device Coverage across all devices, per software stack.

Figure 2 shows some interesting differences and commonalities among the software stacks:

  • Some devices are never accessed. These correspond to hardware that is not used in this particular system configuration, such as the second Serial Advanced Technology Attachment (SATA) controller not being active when there is nothing attached to it. Another example is the remapping unit used when running virtual machines (VMs) using Intel® Virtualization Technology for Directed I/O (VT-d) – only VM stacks will make use of this, and none of the examples used here are VMs.
  • Many devices look the same across software stacks. This often indicates hardware that is handled by the common UEFI boot process, and not so much by the OSs themselves. It can also mean that the device is handled in a similar way by all OS drivers.
  • A few devices show large variation. One example is the device called “28” which is the built-in Ethernet* controller in the simulated chipset. Here, Windows is very different from the Linux OSs and UEFI. It is also noteworthy that the two Linux variants have different coverage for many devices, which shows that kernel version and software configuration can change how the hardware is used.

Looking Closer at the Serial Port

Coverage only tells part of the story. It is valuable to know whether software uses particular registers in a specific scenario, but a lot more can be learned by looking at precisely which registers are accessed, how, and how often.
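As a rough sketch of what "which registers, how, and how often" means in practice, the following Python tallies a log of register accesses by register and by access kind. The log format and the helper names are assumptions for illustration; the register names `lsr`, `mcr`, and `msr` are taken from the serial port discussion in this article, and the counts are invented.

```python
from collections import Counter


def tally_accesses(log):
    """Count accesses per (register, kind) from an iterable of (register, kind) pairs."""
    return Counter(log)


def per_register_totals(counts):
    """Collapse read and write counts into one total per register."""
    totals = Counter()
    for (register, _kind), n in counts.items():
        totals[register] += n
    return totals


# Hypothetical access log, for illustration only.
log = [
    ("lsr", "read"),
    ("lsr", "read"),
    ("mcr", "write"),
    ("lsr", "read"),
    ("mcr", "read"),
    ("msr", "read"),
]
counts = tally_accesses(log)
totals = per_register_totals(counts)
print(counts[("lsr", "read")], totals["mcr"])
```

A plain coverage figure would report all three registers as "covered"; the tally additionally shows that one of them is polled far more often than the others.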

The data set presented above includes the four NS16550-style Universal Asynchronous Receiver-Transmitter (UART) devices. Serial output is not as important as it once was, but it is still interesting to see how differently the various software stacks make use of this legacy hardware. In the classic PC architecture, the UARTs are called COM1 to COM4 (COM meaning Communications Port) – starting at one and not at zero.

Bar graph showing the device register coverage for the COM ports on a PC, for the four software stacks in this investigation

Figure 3. COM port device activity per software stack

Figure 3 zooms in on a part of Figure 2 with the device names added. It shows more variety than one might expect. Unlike the other three software stacks, Windows 10 only touches COM1 and COM2. Since the UEFI shell boot accesses all the COM ports, it looks like the EFI Shell accesses more hardware than what is used when UEFI boots into another operating system (Windows and Linux were all booted from the same UEFI BIOS). The two Linux distributions also differ, and it is worth zooming in more to see what is going on there.

Looking at the total number of accesses to each register for each software stack reveals more information. Figure 4 shows the most interesting data points. 

Tables showing the number of accesses to each register in COM1 and COM3, for Windows 10, Ubuntu 18, and Clear Linux

Figure 4. COM Port register access counts

Clear Linux is the only OS setup that uses the COM port for input and output, as can be seen from the much higher access counts to all the COM1 registers for Clear Linux. Note that even when a Linux configuration does not use a COM port, there are quite a few accesses happening. That can be seen for COM3 for Clear Linux, COM3 for Ubuntu, and COM1 for Ubuntu. In contrast, Windows does not issue a single access to COM3, and when not using COM1, it issues a very different set of accesses compared to the Linux distributions. This shows that the Windows serial port driver is very different from the Linux driver. This neatly illustrates two obvious but important observations:

  • Different drivers use the same hardware in different ways (Windows vs. Linux).
  • The same driver will do different things and touch different registers depending on the use (Clear Linux vs. Ubuntu on COM1).
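The comparison behind these two observations can be sketched as a set difference over per-register access counts. The function name and the counts below are hypothetical and are not the data from Figure 4; the register names are conventional 16550 names used only as labels.

```python
def compare_register_use(counts_a, counts_b):
    """Split registers into those used only by A, only by B, or by both.

    counts_a, counts_b: dicts mapping register name -> number of accesses.
    """
    used_a = {reg for reg, n in counts_a.items() if n > 0}
    used_b = {reg for reg, n in counts_b.items() if n > 0}
    return {
        "only_a": sorted(used_a - used_b),
        "only_b": sorted(used_b - used_a),
        "both": sorted(used_a & used_b),
    }


# Made-up counts for one idle COM port under two different drivers.
driver_a = {"ier": 3, "lcr": 4, "mcr": 6, "lsr": 2, "msr": 1, "scr": 0}
driver_b = {"lcr": 2, "mcr": 1, "scr": 2}

diff = compare_register_use(driver_a, driver_b)
print(diff)
```

The same function works for comparing one driver across two configurations, which is the Clear Linux versus Ubuntu case.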

Looking at the precise register access counts, the UART driver in Linux also appears to be inefficient: for 859 characters sent and received (the number of accesses to the rtb_drl register at offset 0x0), it makes more than 10,000 accesses to the control and status registers (mcr, lsr, and msr).
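To put that overhead in perspective, here is a quick back-of-the-envelope calculation. It uses the two figures quoted above; since the article only says "more than 10,000", the result is a lower bound, and the function name is just an illustration.

```python
def accesses_per_character(status_accesses, characters):
    """Average number of control/status register accesses per character transferred."""
    return status_accesses / characters


characters = 859             # accesses to the data register at offset 0x0
status_accesses_min = 10_000  # lower bound: the text says "more than 10,000"

overhead = accesses_per_character(status_accesses_min, characters)
print(f"at least {overhead:.1f} control/status accesses per character")
```

In other words, each character moved through the port costs more than eleven additional register accesses on average.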

Final Notes

More research would be needed to fully understand the differences, for example by combining and comparing the device register coverage with code coverage for drivers (where source code is available). Do unused registers correlate to unused corners of a driver, or are they simply registers that were not considered useful by the driver developer? The possibilities are almost endless, and advances in virtual platform technology, such as device register coverage, give us the opportunity to further our understanding of these complex system interactions.

This article provides a simple example of the kinds of software insights that can be gained with a virtual platform. Simics 6 brings new capabilities in this field, making it easier to collect more information with a lower performance impact.

Related Content

Using Clear Linux* for Teaching Virtual Platforms: Moving to Clear Linux was really all about how to configure and use a modern Linux.

Question: Does Software Actually Use New Instruction Sets? Another investigation into software behavior, using Simics target system inspection and statistics features.

Containerizing Wind River Simics® Virtual Platforms (Part 1): Developers can gain major benefits from using containers with Wind River* Simics* virtual platforms.

Using Wind River* Simics* with Containers (Part 2): Wind River Simics has advantages over using hardware for debugging, fault injection, pre-silicon software readiness, and more.

Intentional and Accidental Fault Injection in Virtual Platforms: Using a virtual platform makes it much easier to provide cheap, reliable, and repeatable fault injection for software testing.

Author

Dr. Jakob Engblom is a product management engineer for the Simics virtual platform tool, and an Intel® Software Evangelist. He got his first computer in 1983 and has been programming ever since. Professionally, his main focus has been simulation and programming tools for the past two decades. He looks at how simulation in all forms can be used to improve software and system development, from the smallest IoT nodes to the biggest servers, across the hardware-software stack from firmware up to application programs, and across the product life cycle from architecture and pre-silicon to the maintenance of shipping legacy systems. His professional interests include simulation technology, debugging, multicore and parallel systems, cybersecurity, domain-specific modeling, programming tools, computer architecture, and software testing. Jakob has more than 100 published articles and papers and is a regular speaker at industry and academic conferences. He holds a PhD in Computer Systems from Uppsala University, Sweden.
