HTTP Compression: Unlock the potential of Web servers to improve application response time and reduce bandwidth costs.
Today’s Web sites are rich in content and provide capabilities similar to a full-fledged client application. The undesired side effect of these rich capabilities is larger page sizes. In most cases, responsiveness determines the success of a Web application, and the best way to keep a site responsive is to minimize the size of the downloaded pages. Because most of the content in a page is textual, HTTP compression is a promising way to improve response time: compression algorithms are effective with text data and can achieve 70-80% compression. In this article, I intend to provide some insight into how Internet Information Services* 6.0 compression improves overall performance while answering questions on its impact on:
- CPU, network utilization
- Application throughput (requests per second)
- Response time
- Static and dynamic applications
- Sites that are SSL-enabled
Even though we are interested in assessing the impact of enabling compression on a specific application, it is important to understand that we are really measuring the impact of compression on server resources. With that in mind, we are going to use a sample application to characterize performance before and after compression. We will also use the same application to compare performance over SSL.
Before measuring the impact of compression, a few observations about the test setup:
- Compression effectiveness is dependent more on the type of data (text, binary, and so forth) and its size than on specific application functionality. Because our focus is on measuring the impact of compression, it doesn’t matter whether we use HTTP GET or POST; for our testing we will use HTTP GET.
- Load was generated using the Microsoft Application Center Test* (ACT) stress testing tool. It is important to note that ACT does not understand compressed content, so HTTP response status codes were utilized to determine if the request was handled successfully. ACT was also used to gather CPU utilization performance counters and to measure the size of the response.
- Not all clients are capable of understanding compressed content. IIS looks into the HTTP request header to determine client capabilities. Client indicates the ability to understand compressed data using Accept-Encoding HTTP header.
Example: Accept-Encoding: gzip, deflate
- For testing static content, three HTML files - small.htm (18KB), medium.htm (25 KB), and large.htm (50 KB) - were used. For dynamic content testing, an ASPX page was used; the number of rows to be returned was specified using a query string parameter. For dynamic testing, rows of size 10 (7 KB), 50 (25 KB), and 100 (50KB) were used.
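To illustrate the negotiation described above, here is a minimal sketch in Python of what a server does with the Accept-Encoding request header; `negotiate_body` is a hypothetical helper written for this article, not part of IIS.

```python
import gzip

def negotiate_body(body: bytes, accept_encoding: str) -> tuple:
    """Compress the body only if the client advertised gzip support,
    mirroring how IIS inspects the Accept-Encoding request header."""
    offered = [e.strip().split(";")[0] for e in accept_encoding.split(",")]
    if "gzip" in offered:
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}  # client cannot decompress; send the identity encoding

page = b"<html>" + b"<p>hello world</p>" * 500 + b"</html>"
small, headers = negotiate_body(page, "gzip, deflate")
plain, no_headers = negotiate_body(page, "identity")
print(len(page), "->", len(small), headers)
```

A client that omits gzip from Accept-Encoding simply gets the uncompressed body back, which is why compression is safe to enable even with a mixed client population.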
Figure 1. Compression Flow
Web server: Dual Intel® Xeon® processor-based server (700 MHz)
Load Generator: Dual Intel Xeon processor-based server (550 MHz)
The effectiveness of a compression algorithm largely depends on the repetitive patterns that occur within the file. One interesting point to note is that well-encrypted content is effectively incompressible. We will touch on this topic later in this article when reviewing compression performance with SSL. The table below gives a rough idea of the effectiveness of a compression algorithm such as gzip with various file types.
Table 1. Compression Effectiveness
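The effect of data type on compression ratio can be reproduced with a short experiment. The samples below are synthetic stand-ins, with `os.urandom` approximating already-compressed or encrypted binary content:

```python
import gzip
import os

# Synthetic stand-ins for the file types in Table 1; random bytes
# approximate already-compressed or encrypted data (jpg, zip, ...).
samples = {
    "html":   b"<tr><td>row</td></tr>\n" * 1000,
    "text":   b"the quick brown fox jumps over the lazy dog. " * 400,
    "random": os.urandom(20000),
}

for name, data in samples.items():
    saved = 1 - len(gzip.compress(data)) / len(data)
    print(f"{name:7s} {len(data):6d} bytes, space saved {saved:.0%}")
```

The repetitive markup and prose compress dramatically, while the random sample actually grows slightly, which is why image and archive extensions are poor candidates for IIS compression.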
Network Communication Overview
Let's quickly review how data traverses the network. At a very high level, when there is data to be transferred, it is initially queued on the machine (queuing delay). The network controller then moves the queued data onto the wire, which takes a small amount of time (transfer delay). The physical distance between source and destination, the number of hops involved, and the communication medium (optical fiber, coax, microwave, satellite, and so forth) dictate how long it takes to deliver the data to the destination (propagation delay). To provide reliable network communication, the TCP/IP protocol uses packet acknowledgement, retransmission, and windowing mechanisms. Depending on the capabilities of the network card, the host processor ends up sharing some of the network communication responsibilities. The more data we have, the more transmission overhead and the greater the potential for lost packets.
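Back-of-the-envelope arithmetic shows how shrinking the payload shrinks the transfer delay. The link speed and page sizes below are illustrative assumptions, not measurements from the test setup:

```python
def transfer_delay_ms(size_bytes: int, link_mbps: float) -> float:
    """Time to push size_bytes onto a link running at link_mbps megabits/s."""
    return size_bytes * 8 / (link_mbps * 1_000_000) * 1000

# Illustrative numbers: a 50 KB page on a 1.5 Mbps (T1-class) link,
# before and after ~75% compression.
before = transfer_delay_ms(50_000, 1.5)
after = transfer_delay_ms(12_500, 1.5)
print(f"{before:.0f} ms -> {after:.0f} ms")  # → 267 ms -> 67 ms
```

Queuing and propagation delays are unaffected by compression, but the transfer delay scales linearly with payload size, so a 75% smaller page spends roughly a quarter of the time on the wire.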
Performance with Static Files
The behavior of IIS compression with static files is well documented in Utilizing HTTP Compression*. When static file compression is enabled, static files are compressed on demand and stored in a temporary folder. The compressed content is reused for subsequent requests to that file.
Figure 2. Static File Compression Effectiveness
The above figure compares the page size with and without compression. The compression achieved ranged anywhere from 72% to 88%. In addition to reduction in the page size, the Web server throughput increased anywhere from 66% to 900% after enabling compression (see figure 3).
Figure 3. Static File Throughput
Table 2. Static File
It is important to note that while the server spends additional CPU cycles to compress the data, it is doing it once and reusing the compressed data for all the other requests. The CPU utilization dropped by around 60% after compression was enabled. The page download time at the client improved by 40-50%.
Performance with Dynamic Files
With dynamic content, the response is unique for every request, so the response is compressed on every call. Clearly, the server has to do a lot of additional processing. As with static files, the compression achieved varied from 70-85%.
Figure 4. Dynamic File Compression Effectiveness
Figure 5 compares the application throughput with and without compression. The results clearly indicate that the server was serving more requests when compression was enabled and the overall throughput increased by 30 to 190%.
Figure 5. Dynamic File Throughput
Table 3. Dynamic File
The average CPU utilization increased by 25-35% after compression was enabled. As in the static files, the page download time at the client improved by 40-50%.
SSL and Compression
The scenario with SSL needs a little explanation. One side effect of encrypting data is that the encrypted output appears to be a random sequence of bytes. A compression algorithm works best when there are repeated patterns in the data, and unfortunately encryption destroys those patterns. If compression happens after SSL encryption, we might achieve only about 5% compression.
Fortunately, the designers of the HTTP compression scheme thought about this and made sure that compression happens before encryption. SSL encryption is a processor-intensive operation. Even though some resources are spent compressing the file, the cost is more than offset by the savings in encrypting a file that is a fraction of the original size. So, logically, there should be minimal impact from doing both compression and encryption.
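The ordering argument can be demonstrated directly. In this sketch, `os.urandom` stands in for ciphertext, since a good cipher's output is statistically indistinguishable from random bytes:

```python
import gzip
import os

page = b"<td>quote</td>" * 2000  # repetitive HTML, 28 KB

# Compress first, as HTTP compression over SSL actually does:
compress_then_encrypt = gzip.compress(page)

# Encrypt first: ciphertext from a good cipher looks like random bytes,
# so random data of the same length stands in for the encrypted page.
ciphertext = os.urandom(len(page))
encrypt_then_compress = gzip.compress(ciphertext)

print(len(page), len(compress_then_encrypt), len(encrypt_then_compress))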
Figure 6. SSL and Compression Flow
Figure 7. Static File Throughput with SSL
With static files, even though the server reuses compressed data, it still has to perform encryption for each request. For smaller files, the encryption overhead is minimal and there was a 30-40% improvement in throughput. For large files, there was a 15-20% improvement. The CPU utilization dropped by 20% after compression was enabled.
Table 4. Static File with SSL
Figure 8. Dynamic File Throughput with SSL
Table 5. Dynamic File with SSL
With dynamic files, the application served more requests for smaller page sizes after compression was enabled. For larger page sizes, throughput was flat. There was a 10-20% increase in CPU utilization because of enabling compression on an SSL connection. The overall page size dropped 70-80%.
HTTP compression can be safely enabled on Web sites that use SSL because it has minimal impact on the server resources while significantly reducing the page size.
Issues in Enabling Compression
So far, we have seen the benefits of compression. It’s time to look at some of the issues in enabling compression in IIS. The first issue is that compression capability is enabled or disabled at the machine level. You can't selectively enable/disable compression on specific Web sites within a machine.
The second issue is around caching compressed data. The caching that is discussed here is HTTP header-based caching at proxies and browser caches and does not refer to ASP.NET caching. Generally, content can be cached at several points: reverse proxies, client proxies, and the browser. Not all content is suitable for caching at proxies. For example, SSL response is never cached in proxies; whereas, a browser may choose to cache the SSL response. While caching can significantly improve the download speed by serving the files locally, there are a few issues with enabling caching with compressed content.
The current implementation of IIS compression does not provide granular control on caching responses. You have the option to turn on/off cache headers using HcSendCacheHeaders node in metabase.xml. Refer to the MSDN documentation for additional help on each of these settings.
<IIsCompressionSchemes
    Location="/LM/W3SVC/Filters/Compression/Parameters"
    HcCacheControlHeader="max-age=0"
    HcExpiresHeader="Wed, 01 Jan 1997 12:00:00 GMT"
    ...
    HcSendCacheHeaders="FALSE"
>
The IIS compressed-content caching scheme forces you to define policies at the Web server level rather than the Web application level. A blanket caching scheme will never work for every application, nor for a Web server that hosts several Web applications. If you configure it incorrectly, you may see dynamic responses being cached longer than expected; for example, a day-old stock quote, or a large image downloaded again and again. One option is to disable the cache instructions set by the compression routines (HcSendCacheHeaders="FALSE") and control all caching instructions programmatically for dynamic files. Another option is to enable the cache instructions and make sure that dynamic compressed content is not cached (HcSendCacheHeaders="TRUE" and HcCacheControlHeader="max-age=0"). A third option is to compress just the dynamic application files and not static files; this gives you the benefit of compressing all dynamic responses while taking advantage of caching static files at proxies. Experiment with what works best for your application and choose an appropriate scheme.
Metabase.XML Compression Settings:
| Setting | Description |
| --- | --- |
| HcSendCacheHeaders | Specifies whether the headers specified by HcCacheControlHeader and HcExpiresHeader are sent with each compressed response. If set to FALSE, neither header is sent. |
| HcCacheControlHeader | Specifies the directive that IIS adds to the Cache-Control header; it overrides the HTTP Expires header. To prevent caching: HcCacheControlHeader="max-age=0". To cache the data for one day (86400 seconds): HcCacheControlHeader="max-age=86400". |
| HcExpiresHeader | Specifies the content of the HTTP Expires header that is sent with all requested compressed files, along with the Cache-Control header discussed in HcCacheControlHeader. The combination of HcExpiresHeader and HcCacheControlHeader ensures that older clients and proxy servers do not attempt to cache compressed files. An absolute expiration date in the past prevents caching: HcExpiresHeader="Wed, 01 Jan 1997 12:00:00 GMT". |
- Compression works well with text content (see Table 1). For images, instead of using the BMP format, select a more efficient format such as JPEG or PNG. The JPEG and PNG formats are already optimized and compressed, so you can exclude these extensions from IIS compression.
- Analyze the type of data your Web site generally serves. Determine where compression can bring maximum benefits. Use IIS Log file to verify the size of a generated response. Configure IIS 6.0 by specifying the file extensions that you want to use for static and dynamic compression.
- Choose the compression level in Metabase.xml to optimize for speed or size. Recommended value is 9 or lower.
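As a sketch of the log analysis suggested above, the following parses a W3C extended log excerpt and averages the response size (the sc-bytes field) per file extension. The log lines are fabricated examples, and `average_response_size` is a hypothetical helper for this article:

```python
# Fabricated W3C extended log excerpt; IIS records response size in sc-bytes.
LOG = """#Fields: date time cs-method cs-uri-stem sc-status sc-bytes
2004-05-01 10:00:01 GET /products.aspx 200 51234
2004-05-01 10:00:02 GET /logo.png 200 18056
2004-05-01 10:00:03 GET /products.aspx 200 50877
"""

def average_response_size(log: str, extension: str) -> float:
    """Average sc-bytes for requests whose URI ends with `extension`."""
    fields, sizes = [], []
    for line in log.strip().splitlines():
        if line.startswith("#Fields:"):
            fields = line.split()[1:]  # column names follow the directive
        else:
            row = dict(zip(fields, line.split()))
            if row["cs-uri-stem"].endswith(extension):
                sizes.append(int(row["sc-bytes"]))
    return sum(sizes) / len(sizes)

print(average_response_size(LOG, ".aspx"))  # → 51055.5
```

Large average response sizes for text-producing extensions such as .aspx mark them as the best candidates for compression; small or already-compressed types (such as .png) can be excluded.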
Today’s Intel® Xeon® processors and Itanium® processors are incredibly powerful. For a typical Web site that interacts heavily with a database, it is rare to see high CPU utilization on these processors, because the bottleneck is usually the network or other I/O devices. This is exactly where you can use compression to tap into the “unused” CPU capacity to scale the infrastructure. In addition to reducing the page size, the application was able to serve more pages after enabling compression. Smaller files allow a Web server to complete network transfer operations faster while using less bandwidth, providing an opportunity to save on network infrastructure costs.
From an IIS implementation perspective, the management code for compression is part of the core IIS 6.0 Web server for speed and stability. In IIS 4 and IIS 5, compression was an ISAPI filter with limitations (see References).
References
- HOW TO: Enable ASPX Compression in IIS*
- IIS 6.0 Compression with Windows Server 2003: Do More with Less*
- MSDN Cache Header Help*
- Fundamentals of Web Site Acceleration*
- Utilizing HTTP Compression*
- RFC 1951 - DEFLATE Compressed Data Format Specification version 1.3*
- RFC 1952 - GZIP file format specification version 4.3*
About the Author
ChandraMohan Lingam (Chandra) is a Senior Application Developer at Intel Corporation. At Intel, Chandra specializes in performance tuning applications to make them fast and responsive while supporting a large number of concurrent users. Chandra has over seven years of development experience in various Microsoft technologies and Linux*.
Special thanks to the following reviewers: Thiru Thangarathinam, Jay Turpin, Patrick Logan, Tom Fieldhouse and Brandon Bohling.