Welcome to the continuation of our Vulkan* API-related series of articles. In the Introduction to Vulkan* tutorial, we learned the most important basics of this low-level graphics programming library. You already know how to write simple programs that use Vulkan. You know how to display things on screen. Now you can use that same knowledge to display more complicated scenes.
The Vulkan API is coherent and as concise as possible, even though it is also very verbose and requires us to write huge amounts of code. Vulkan's learning curve is steep, and writing even a basic program is hard. But once you master it, you can develop more complicated scenes without learning a lot more, at least from the API perspective. Displaying vast and beautiful scenery isn't much harder than displaying a simple triangle. On the other hand, doing so in an optimal way is truly a challenge. That's why we thought it was a good time to move on with the tutorial and try another approach and another perspective.
More complicated scenes require a higher level approach to tasks such as resource management, multithreading, or synchronization. These tasks can be performed in many different ways, but you may not know which approach is the best one in your case. Sometimes you want the application to be portable across different platforms. Sometimes you want your code to be easily maintainable. Or maybe you want to prepare a set of tools that make future application development efforts easier. Each of us may have different priorities, but usually, when we talk about 3D graphics and 3D tools, the most common goal is the best performance. So, which path should you take to achieve the goal of increased performance, and how do you avoid losing performance?
Unfortunately, in Vulkan there is no simple answer. As with all low-level APIs, one solution can be well-suited for a given platform, but other platforms may require a totally different path. You can't prepare one solution for all platforms if you want to squeeze every bit of performance out of your targeted graphics hardware. Each platform has its individual characteristics—architecture, memory, abilities, features, limits—and Vulkan exposes them all. What's more, we are responsible for using these characteristics appropriately. Fortunately, some topics and issues are common across platforms. The focus of these articles is on such commonalities, and on the practical side of using Vulkan in our applications. We therefore test various areas of this graphics library and explain them with more depth.
At the same time, we want to give you tools that provide explicit knowledge about how to approach each specific area, and how to fine-tune your own applications. To do this, we show you code samples that focus on very specific aspects of the Vulkan API, including command buffers, descriptor sets, memory types, buffer and image resources, pipelines, shaders, and more. Complicated scenes require you to smartly use and manage all of these small parts simultaneously. Because there are multiple dependencies between them, it is crucial that you have a good understanding of each area.
Each code sample is prepared as a separate project that you can compile and execute on Windows* and Linux* operating systems. In addition, each sample exposes a set of parameters. You can adjust the parameters at runtime to see how they influence the application's behavior.
The code for this series of articles is freely available in a GitHub* repository. Unfortunately, you still have to write huge amounts of code when you use Vulkan. What's more, you can't create a universal code set and adjust it for specific purposes because of dependencies between various resources. Consider the following situation: Let's say you want to draw a simple scene. For that, you need a render pass and a pipeline, among other resources. Pipeline creation requires you to specify in which subpass of which render pass the pipeline will be used. If you modify the render pass setup, you may also need to recreate the pipeline in a different way. That's why you might find the code to be somewhat repetitive. To shorten it, we develop samples using the vulkan.hpp header file, which is a C++ wrapper for all Vulkan objects and functions. The wrapper allows you to use exceptions, default function parameters, and automatic resource destruction. This wrapper file is distributed with the Vulkan SDK.
We also prepared a set of simple helper functions to keep the focus on what we want to do rather than on how to do it. The goal is for you to understand the code and learn Vulkan from these examples without the need to jump between multiple files or to search somewhere else.
The code structure for each sample looks like this:
This project would be much harder and more time consuming to prepare without external libraries that contain files for loading image data (stb_image) and for displaying the user interface (Dear ImGui*).
The most apparent parameter of an application's behavior is its framerate. Performance drop or increase is one of the most important factors for application developers. That's why sample programs allow you to check the number of frames generated per second. In these articles, we don't want to present exact, absolute results of measurements, as the measurements depend on a given hardware platform, available memory, the operating system, software installed and executed in the background, power management, accessories attached to the computer, number of displays, and multiple other factors. Instead, we want to show which parameters may influence the performance of your application, and in what way. In each topic discussed, the goal is to indicate which approach to take, how to implement the approach, what parameters are related to the topic, and how parameters may influence the application.
Figure 1: Example of a window with adjustable parameters.
Each sample program in these articles exposes a set of parameters that can be adjusted at runtime to see how they impact both the behavior and performance of the application. There is a broad discussion about which metric is better for performance measurements—frames per second (fps) or frame generation time—and each has its pros and cons. Time is more general and valuable for developers, while fps is more important for end users. Performance drops are more easily perceived in fps, but the conversion isn't linear.
For example, when we have 60 fps and performance drops to 50 fps, we lose 16.67 percent of our fps measurement. However, the same loss converted to time (60 fps means 16.67 ms and 50 fps means 20 ms) is equal to 20 percent longer frame generation time.
60 fps ≈ 16.67 ms
50 fps = 20 ms
100% * (50 - 60) / 60 ≈ -16.67% [fps]
100% * (20 - 16.67) / 16.67 ≈ +20% [ms]
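The arithmetic above can be reproduced with two small helpers. This is only a minimal sketch; the function names FrameTimeMs and RelativeChangePercent are hypothetical and are not taken from the sample code.

```cpp
#include <cassert>

// Hypothetical helper: converts a frame rate in frames per second
// to the frame generation time in milliseconds.
inline double FrameTimeMs(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Hypothetical helper: relative change from `before` to `after`, in percent.
// A negative result means the value decreased.
inline double RelativeChangePercent(double before, double after) {
    assert(before != 0.0);
    return 100.0 * (after - before) / before;
}
```

Calling RelativeChangePercent(60.0, 50.0) gives the change expressed in fps, while RelativeChangePercent(FrameTimeMs(60.0), FrameTimeMs(50.0)) gives the change expressed in frame generation time. The same drop yields two different percentages because frame time is the reciprocal of the frame rate.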
To please those who prefer either metric, performance is measured and presented both in fps and as frame generation time in milliseconds. The measurements are averaged over the last 10 seconds of an application's execution to show stable results. You can also see the trend from a 10-second history. Just remember that, in this series of articles, the absolute performance isn't the most important concept. What counts are the relative changes in performance and, even more important, the general behavior of our application.
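One possible way to average measurements over the last 10 seconds is a sliding window of per-frame samples. The sketch below is an assumption made for illustration, not the implementation used in the samples; the class FrameTimeHistory and its members are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical helper (not taken from the samples): keeps the frame times
// recorded during the most recent `windowSeconds` and reports averages.
class FrameTimeHistory {
public:
    explicit FrameTimeHistory(double windowSeconds = 10.0)
        : windowSeconds_(windowSeconds) {
        assert(windowSeconds > 0.0);
    }

    // Records one frame. `nowSeconds` is the time at which the frame
    // finished, `frameTimeMs` is how long the frame took to generate.
    void Add(double nowSeconds, double frameTimeMs) {
        samples_.push_back({nowSeconds, frameTimeMs});
        sumMs_ += frameTimeMs;
        // Drop samples that are older than the averaging window.
        while (!samples_.empty() &&
               samples_.front().timestamp < nowSeconds - windowSeconds_) {
            sumMs_ -= samples_.front().frameTimeMs;
            samples_.pop_front();
        }
    }

    std::size_t Count() const { return samples_.size(); }

    // Mean frame generation time in the window, or 0 if there are no samples.
    double AverageFrameTimeMs() const {
        return samples_.empty() ? 0.0 : sumMs_ / samples_.size();
    }

    // Average fps derived from the mean frame time, so that both
    // displayed metrics always describe the same set of frames.
    double AverageFps() const {
        const double avgMs = AverageFrameTimeMs();
        return avgMs > 0.0 ? 1000.0 / avgMs : 0.0;
    }

private:
    struct Sample {
        double timestamp;    // seconds
        double frameTimeMs;  // milliseconds
    };
    std::deque<Sample> samples_;
    double windowSeconds_;
    double sumMs_ = 0.0;
};
```

In this sketch the average fps is computed from the mean frame time rather than by averaging per-frame fps values. Because the conversion between the two metrics isn't linear, averaging fps values directly would give a different and less meaningful number.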
When you observe performance variations and the behavior of your application, we recommend that you also monitor power consumption. Power management is especially important for developers targeting mobile devices and for users running software on small form factor devices. Low power consumption is crucial on such platforms, so it is also important to know how the power management function works—and essential to keep in mind that power management may influence the performance of the application.
When you lower the CPU and graphics processing unit (GPU) workload, you would usually expect to see increased performance. Sometimes, however, you may actually see a performance drop. How is that possible? Because the CPU or GPU has less work to do, power management may, depending on its configuration, lower the CPU or GPU frequency to reduce overall power consumption. In other words, the power management function decides not to waste power on simple tasks. This would be desirable behavior for extending the battery life of a mobile device, for example.
But even if your goal isn't the lowest possible power consumption, we advise monitoring the CPU and GPU workload and being aware of power management and its current setup. For development purposes, you can disable power management or switch it to maximum performance mode to be sure that changes in your application are correctly reflected in its performance. Although Vulkan-related design decisions can have an impact on your application, remember that these design changes represent only one of the many factors influencing your application's behavior and performance.
Do you have any questions or comments about Vulkan or about the articles? Or maybe you have ideas for new tests? Is there a Vulkan-related topic that is especially interesting? Do not hesitate to write a comment. We will do our best to prepare additional code samples and articles. We hope that publication of this series starts an open discussion relating to the Vulkan API, and low-level graphics libraries in general.