By Prasad Kakulavarapu
Linux* kernel compilation presents a workload that represents a common software development task, and is included in standard benchmark suites by trade publications to test CPU and system performance.
The purpose of this document is twofold: to demonstrate a parallel build of the Linux kernel, and to evaluate the performance benefit of Intel® Extended Memory 64 Technology (Intel EM64T) on Intel processors. This study is based on a 3.6 GHz Intel Xeon® processor with Intel EM64T.
Intel EM64T is an enhancement to the Intel IA-32 architecture. An IA-32 processor equipped with this technology remains compatible with existing IA-32 software. Intel EM64T enables software to access a larger memory address space, and allows software written for the 32-bit linear address space to coexist with software capable of accessing the 64-bit linear address space.
Two small changes on the Intel EM64T platforms, enabling Hyper-Threading Technology (HT Technology) and building the Linux kernel in multistream mode (by adding a single parameter to the build process), deliver a significant performance benefit over the default configuration and build process. Several key results indicate a performance benefit both from turning HT Technology on and from Intel EM64T.
Linux Kernel 2.6.4*, which is freely available, is evaluated in this study. The Red Hat EL 3.0 distribution is used on all hosts. All Intel platforms considered in this study support HT Technology and include DP 3.6GHz Nocona and DP 3.2GHz Intel Xeon platforms.
Following are the key objectives of this paper:
- To evaluate the HT Technology benefit with Intel processors for multistream Linux kernel build.
- To review Linux kernel build performance on Intel processors with Intel EM64T.
Building the Linux Kernel
Building the Linux kernel is typical of a common computationally intensive task and is often required for Linux system maintenance. System administrators and Linux developers are frequently required to update the kernel, add new device drivers, or rebuild the kernel with application specific hooks. This is also mission-critical in the enterprise server space, where system availability is of highest priority.
One way to speed up the Linux kernel build is to use multistream builds that exploit the multiple CPUs on the host system. Multistream builds shorten build time and improve system availability by reducing the time during which the system is unavailable due to a rebuild in progress. Intel processors with HT Technology enabled present two logical processors for every physical processor; these can execute different tasks simultaneously using shared hardware resources. HT Technology therefore supports parallel builds of the Linux kernel, resulting in faster build times through more effective use of CPU resources. An important application of this performance benefit is in enterprise mission-critical systems, where only minimal CPU time can be afforded for system administration tasks. For instance, a Linux enterprise server participating in an online system management utility such as Red Hat* Network or Aduva Director* downloads the new kernel sources, and then builds and installs the new kernel with the help of a local client.
Linux Kernel Benchmarks
Because the Linux kernel build exercises both CPU and overall system performance, several publications have included it in their standard CPU benchmark suites, including those listed below:
- Linux 2.6 and Hyper-Threading - Kernel Compiles and MP3 Encoding*
- Linux: Benchmarking HT Performance On A Single Processor*
This section reviews the build environment, workloads, and the performance metrics used to compare performance.
All systems ran the standard Red Hat EL3.0 distribution (2.4.21-13ELsmp). The kernel and other critical components of the Linux system, including the gcc compiler, are set up for parallel build, so no additional preparation is required to build the kernel in a parallel fashion. The -j option of the make utility is used to make multistream builds of the kernel. This option lets make run multiple jobs simultaneously.
In this paper, the number of threads is set to the maximum number of logical processors available on a platform (that is, #CPU=2 for a DP platform with HT Technology off, and #CPU=4 for a DP platform with HT Technology turned on).
- x86 and x86-64 environments
- DP 3.6GHz Nocona platform with HT Technology on (#logical processors = 4)
- DP 3.6GHz Nocona platform with HT Technology off (#logical processors = 2)
- x86 environment only
- DP 3.2GHz Intel Xeon® platform with HT Technology on (#logical processors = 4)
- DP 3.2GHz Intel Xeon platform with HT Technology off (#logical processors = 2)
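The rule of matching the job count to the number of logical processors can be sketched in a few lines of shell. This is a minimal illustration, not part of the original study: the helper name logical_cpus is hypothetical, and it assumes either getconf _NPROCESSORS_ONLN or /proc/cpuinfo is available on the host.

```shell
#!/bin/sh
# Sketch: derive the make job count from the number of online logical CPUs.
# logical_cpus is a hypothetical helper name used only for illustration.
logical_cpus() {
    # Prefer getconf; fall back to counting "processor" entries in
    # /proc/cpuinfo; fall back to 1 if neither source is usable.
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) ||
        n=$(grep -c '^processor' /proc/cpuinfo 2>/dev/null) ||
        n=1
    echo "${n:-1}"
}

JOBS=$(logical_cpus)
# Print the build command that would be used; nothing is built here.
echo "make -j $JOBS"
```

On a DP platform with HT Technology on, this would be expected to report four logical processors; with HT Technology off, two.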
The performance metric is the elapsed time, measured in seconds, for compiling the kernel. Timing the make -j n command measures the elapsed time spent in the core part of the kernel build. This does not include the time spent on menu configuration, creating dependencies, and other modules. The UNIX* time command is used to measure the elapsed time. Following is a typical instantiation of the make command: time sh -c 'make -j n' (where n is the number of parallel jobs).
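For logging results across several runs, the same elapsed-time measurement can be approximated with a small wrapper. The sketch below is an assumption-laden alternative to the time command, not the method used in the study: time_cmd is a hypothetical helper, it relies on date +%s (supported by GNU and BSD date), and it has only one-second resolution.

```shell
#!/bin/sh
# Sketch: measure wall-clock seconds for an arbitrary command.
# time_cmd is a hypothetical helper, not the script used in the study.
time_cmd() {
    start=$(date +%s)          # seconds since the epoch before the run
    "$@"                       # run the command exactly as given
    status=$?
    end=$(date +%s)            # seconds since the epoch after the run
    echo "elapsed_seconds=$((end - start))"
    return "$status"           # preserve the command's exit status
}

# Example use inside a configured kernel tree (not executed here):
#   time_cmd make -j 4
```

The wrapper returns the exit status of the wrapped command, so a failed build is not mistaken for a fast one.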
The amount of computational work done in compiling the Linux kernel for a particular platform may depend on the configuration of the platform – both hardware and the build environment (gcc, glibc). An effort is made to maintain a uniform software build environment on all Intel platforms.
On the hardware side, identical kernel configuration files are maintained on all platforms. The key difference from the default configuration is that the Pentium-4/Celeron (P4-based)/Pentium-4 M/Xeon processor family is selected for the x86 build, and the kernel is built for the Intel x86-64 processor family in x86-64 mode.
Additionally, the number of files compiled is used as a metric to approximate the amount of computational work done on all platforms. This number is obtained by measuring the number of gcc compiler invocation commands in the standard output. Following is the command to approximate the amount of computational work done:
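A minimal sketch of such a count is shown here. It assumes the build output was saved to a log file and that each compiler invocation appears on its own line beginning with gcc (for 2.6 kernels this generally requires a verbose build such as make V=1, since the default output is abbreviated). The helper name count_gcc_invocations and the file name build.log are hypothetical.

```shell
#!/bin/sh
# Sketch: approximate the number of files compiled by counting gcc
# invocation lines in a saved build log.
# count_gcc_invocations is a hypothetical helper name.
count_gcc_invocations() {
    # Count lines whose first word (after optional indentation) is gcc.
    grep -c -E '^[[:space:]]*gcc[[:space:]]' "$1"
}

# Example use (not executed unless a log file is passed as an argument):
#   make V=1 -j 4 > build.log 2>&1
#   count_gcc_invocations build.log
if [ $# -gt 0 ]; then
    count_gcc_invocations "$1"
fi
```

The count is only an approximation: host tools built with gcc during the kernel build are included, and link steps are not.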
How to Run and Reproduce the Test
Building 2.6.4 kernel:
Step 1. Download the 2.6.4 kernel (Linux-2.6.4.tar.gz) from www.kernel.org*.
Step 2. Uncompress the kernel file: gunzip Linux-2.6.4.tar.gz.
Step 3. Untar the tarball: tar -xvf Linux-2.6.4.tar.
Step 4. Change to the kernel source directory: cd $HOME/Linux-2.6.4.
Step 5. Run the script: ./lkb.sh n (where n is the number of parallel threads to build the kernel).
The lkb.sh script does the following:
- Prepare the configuration file: make menuconfig (select the appropriate Processor Family - Intel x86-64 in the x86-64 mode, and Pentium-4/Celeron (P4-based)/Pentium-4 M/Xeon in the x86 mode on both Intel and non-Intel platforms).
- Prepare dependencies: make dep.
- Remove any object files: make clean.
- Start the build: time sh -c 'make -j n' (where n is the number of parallel jobs building the kernel).
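The contents of lkb.sh are not reproduced in this paper. The steps listed above could be arranged roughly as follows; this is a hypothetical reconstruction, and the lkb function, the run helper, and the DRY_RUN variable are assumptions introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of the steps attributed to lkb.sh.
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

lkb() {
    n=$1
    # Require a positive-looking integer job count.
    case "$n" in
        ''|*[!0-9]*) echo "usage: lkb n" >&2; return 2 ;;
    esac
    run make menuconfig &&      # prepare the configuration file
    run make dep &&             # prepare dependencies
    run make clean &&           # remove any object files
    if [ -n "$DRY_RUN" ]; then  # timed multistream build
        echo "time sh -c 'make -j $n'"
    else
        time sh -c "make -j $n"
    fi
}

# Run only when a job count is supplied on the command line.
if [ $# -gt 0 ]; then
    lkb "$@"
fi
```

Running it with DRY_RUN=1 from any directory shows the command sequence without touching a kernel tree; only the final make -j n step is timed, consistent with the metric described earlier.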
Two aspects of the results are discussed in this section: the HT Technology performance benefit, and the Intel EM64T performance benefit.
Hyper-Threading Technology Performance Benefit
This section reviews the kernel build performance of the HT Technology enabled Intel platforms for the 2.6.4 Linux kernel.
Figure 1 displays the HT Technology performance benefit on the 3.6GHz Nocona (x86 and x86-64 environments), and the 3.2 GHz Intel Xeon® platform.
Figure 1. Enabling HT Technology improves the performance on Intel platforms and the 2.6.4 Linux kernel tested
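One common way to express such a benefit is the percentage reduction in elapsed build time relative to the baseline run. The sketch below assumes that definition; the paper does not state which formula underlies Figure 1. The helper name pct_improvement is hypothetical, and the numbers in the usage comment are placeholders, not measurements from this study.

```shell
#!/bin/sh
# Sketch: percentage reduction in elapsed time relative to a baseline.
# pct_improvement is a hypothetical helper; arguments are seconds.
pct_improvement() {
    # $1 = baseline elapsed time (for example, HT Technology off)
    # $2 = new elapsed time (for example, HT Technology on)
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (base - new) / base * 100 }'
}

# Placeholder values only, not measured results:
#   pct_improvement 200 150    # prints 25.0
```

The same helper applies to comparing x86 and x86-64 build times on one platform, with the x86 time as the baseline.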
Intel® EM64T Performance Benefit
The Intel EM64T technology enables running 64-bit applications on 32-bit Intel® Xeon platforms. Switching to the x86-64 environment from the default x86 mode results in a 64-bit software environment, yielding better results for the Linux* kernel 2.6.4 compile benchmark.
The following table displays an improvement in performance to compile the Linux kernel 2.6.4 in the x86-64 mode when compared to the x86 mode. While the amount of work done is not the same on these two platforms (as indicated by the number of files compiled), the x86-64 environment enables a quicker compile time for the same kernel on the same platform.
Figure 2 displays the number of kernel source files compiled on the x86 and x86-64 platforms. The number of files compiled is uniform in the x86 OS environment, and is higher than in the x86-64 OS environment.
This paper reviewed the Linux kernel build performance for the 2.6.4 kernel version on the Intel platforms. Key results of this paper are as follows:
- Build times of the Linux Kernel 2.6.4 are measured on the Intel platforms and documented.
- Enabling HT Technology on the Intel platforms (a minor configuration change) and building the Linux kernel in multistream mode (by passing the -j n option to the make command) delivers a performance benefit over the default configuration and build process.
- Compiling the 2.6.4 Linux kernel in the x86-64 OS environment is faster than in the x86 OS environment.
In conclusion, building the Linux kernel on Intel processors with Intel EM64T and HT Technology yields better compile times for the 2.6.4 Linux kernel. Systems with HT Technology enabled not only deliver faster build times, but also minimize the system downtime scheduled for kernel compiles.
- More information on Linux 2.6 can be found at The Linux 2.6 Kernel Trilogy Ends: Go Configure*.
- Find more resources for Linux developers* at the Linux zone.
- Building the Linux Kernel 2.6* is discussed in the various Linux newsgroups.
- Threading for Multi-Core Development Community
About the Author
Prasad Kakulavarapu is an Application Engineer in the Parallel and Distributed Solutions Division, DuPont. He supports enabling of ISV applications on Intel desktops and server platforms. Prior to Intel, Prasad received his M.Sc (Computer Science) at McGill University, Montreal.