<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated on Fri, 10 Feb 2012 12:47:29 -0800 -->
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <atom:link href="http://software.intel.com/en-us/articles/home/type/code/feed/" rel="self" type="application/rss+xml" />
    <title>Intel Software Network articles Feed</title>
    <link>http://software.intel.com/en-us/articles/home/type/code/</link>
    <description></description>
    <language>en-us</language>
    <item>
      <title>Ultrabook™ and the Intel® Energy Checker SDK</title>
      <description><![CDATA[ <h2 class="sectionHeading">Abstract</h2>
With the advent of the Ultrabook™<sup>1</sup>, the demand for applications that are power misers continues to rise. The Intel® Energy Checker SDK can be used to instrument an application and collect data to help a developer pinpoint power hungry features that can be optimized for power. This article gives an overview of the Intel Energy Checker SDK and discusses how it can be used to advantage when improving energy usage on an Ultrabook.<br /><br />
<h2 class="sectionHeading">More Work, Less Power</h2>
An Ultrabook™ needs to budget its power consumption very carefully to extend usefulness while running on battery. Therefore, applications that use less energy are preferred. Often, application developers create their program on a desktop system where power/energy consumption is less important than raw performance. Applications should be developed not only to conserve power when active but also to minimize energy usage during idle periods; idle power is often overlooked, and reducing it can greatly extend battery life. If power issues are ignored, running a program on an Ultrabook will result in unpleasant surprises for the user. If developers test their application on an Ultrabook system during development, they will gain insight into how well the program runs in a power limited environment. An analysis tool such as the <a href="http://software.intel.com/en-us/articles/intel-energy-checker-sdk/">Intel® Energy Checker SDK</a> can be a powerful companion during the optimization phase for software designed for an Ultrabook.<br /><br />
<h2 class="sectionHeading">Energy Efficiency</h2>
Before explaining what Intel Energy Checker SDK contains, a discussion on Energy Efficiency (EE) is in order. This is a term that is used extensively in the Intel Energy Checker SDK. There is no universally accepted definition of EE, so for the purposes of this tool it is defined as:<br />
<p ><em>EE=Work/Energy</em></p>
<em>Work</em> is defined as the amount of “<em>useful work</em>” done by a software application. There is no concise, easy definition of the term <em>useful work</em> either, as what is considered <em>useful work</em> in one program may be quite different in another application. The developer is required to make that determination. For example, one might consider the areas of a movie player program where it provides the customer value (such as decoding the movie) as useful work whereas areas of the program that are accessing resources, waiting on input, or performing synchronization would not.<br /><br />
<h2 class="sectionHeading">Code Instrumentation</h2>
The first step in using Intel Energy Checker SDK to help determine an application’s EE is to create and use “counters” in the software to determine quantities of “useful work”. A counter is defined as a 64-bit (8 byte) variable that keeps a running total of how many times a particular event occurs. In the “C” language, this becomes an unsigned long long data type. A developer can create one or more counters during the initialization portion of the software. Next, a container for the counters can be created, called a “Productivity Link” (PL)<sup>2</sup>. Each PL holds up to 512 counters, and up to 10 different PLs can be open at one time, but most software will require far smaller numbers of counters and PLs.<br /><br />During the application runtime, values can be written to any counter in the PL, based on the developer’s requirements. Intel Energy Checker SDK can collect the information from the PLs in order to determine how much work was done.<br /><br />
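As a minimal sketch of the definition <em>EE=Work/Energy</em>, the following C++ fragment counts units of useful work in a 64-bit counter and divides the total by the measured energy. It does not use the Intel Energy Checker API: the names work_counters_t, count_frame and energy_efficiency are hypothetical, and the energy value is assumed to come from a separate energy measurement.<br /><br />
<pre name="code" class="cpp">#include &lt;cassert&gt;
#include &lt;cstdint&gt;

// Hypothetical stand-in for one group of work counters; this is NOT the
// Intel Energy Checker API, only an illustration of the EE arithmetic.
struct work_counters_t {
    std::uint64_t frames_decoded;   // 64-bit running total of "useful work" events
};

// Count one unit of useful work (for example, one decoded movie frame).
inline void count_frame(work_counters_t&amp; c) { ++c.frames_decoded; }

// EE = Work / Energy, here expressed in frames per joule.
inline double energy_efficiency(const work_counters_t&amp; c, double joules) {
    assert(joules &gt; 0.0);           // energy must come from a real measurement
    return static_cast&lt;double&gt;(c.frames_decoded) / joules;
}
</pre>
<br />Under these assumptions, 1200 decoded frames for 60 joules give an EE of 20 frames per joule.<br /><br />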
<h2 class="sectionHeading">Energy Consumed</h2>
The second part of finding the EE of a software application is to measure how much energy was consumed while the program was running. To do this, Intel Energy Checker SDK uses two tools which are included in the SDK download: Energy Server (ESRV) and Temperature Server (TSRV). ESRV is used to monitor energy and power consumption as reported by external power tools while TSRV monitors temperature related information as reported by environmental probes. ESRV and TSRV counters can be accessed by any program using the Intel Energy Checker API. In addition to the counters created by the developer to determine quantities of work, the developer will want to add counters to collect information from ESRV and possibly TSRV. There are three different ways to set up ESRV:<br /><br /><ol>
<li>Use a power meter to collect actual “platform energy and power” information.<br /><br />There are several different power meters that work with the Intel Energy Checker SDK. Please consult the <em>Intel® Energy Checker SDK User Guide</em> included in the download or found on the <a href="http://software.intel.com/en-us/articles/intel-energy-checker-sdk/">Intel® Energy Checker SDK page</a> to determine which power meters will work and how they should be attached to the test system.<br /></li>
<li>Use <a href="http://software.intel.com/en-us/articles/intel-power-gadget/">Intel® Power Gadget</a> to collect “processor energy and power” usage information on the 2nd Generation Intel Core™ processor family. External power meters that report platform power can also be used together with Intel Power Gadget, which provides processor power. The blog post “Accessing Intel® Power Gadget From Intel® Energy Checker SDK” by Intel engineer Jun De Vega discusses how to enable Intel® Power Gadget with Intel® Energy Checker.<br /></li>
<li>Use the simulation method, which relies on the CPU utilization percentage returned by the OS. This method does not require a hardware probe. The Intel Energy Checker SDK offers this method as an option for all processors (rather than just the 2nd Generation Intel Core processor family, as with the Intel Power Gadget) to support users who do not have a power meter. Included in the SDK is a support library for accessing this metric.</li>
</ol>
<p ><img src="http://software.intel.com/file/41168" /><br /><br /><strong>Figure 1:</strong> Conceptualized drawing of Intel Energy Checker setup with Instrumented Application, Power Meter and Environmental probes attached</p>
<h2 class="sectionHeading">Intel Energy Checker Extras</h2>
There are two companion tools that are bundled with the Intel Energy Checker SDK in addition to those already mentioned. The PL GUI Monitor is a user interface that displays Productivity Link (PL) counters in a running program that has already been instrumented with the Intel Energy Checker API. The PL CSV Logger<sup>3</sup> is an application that can collect and write PL counters to a CSV file for later analysis in a variety of spreadsheet applications.<br /><br />Included with the Intel Energy Checker SDK is the <em>Intel® Energy Checker SDK Companion Application User Guide</em> that discusses the features and capabilities of both of these tools.<br /><br />
<p ><img src="http://software.intel.com/file/41169" /><br /><br /><strong>Figure 2:</strong> PL GUI Monitor running while a picture is being rendered</p>
The entire Intel Energy Checker SDK includes other build, scripting, interoperability, and monitoring tools to help developers instrument code and collect energy metrics.<br /><br />A white paper entitled “<em>How Green Is Your Software?</em>” is available for download from the SDK site. This paper discusses approaches for making software power efficient. Look for it in the “Code, Resources and Documentation” section of the <a href="http://software.intel.com/en-us/articles/intel-energy-checker-sdk/">Intel Energy Checker SDK page</a>. Several blogs about Intel Energy Checker that were written by Intel Engineer Jamel Tayeb will also be helpful:<br /><br /><a href="http://software.intel.com/en-us/blogs/2010/04/15/using-the-intel-energy-checker-sdk-at-home/?wapkw=(Energy+Checker)">Using the Intel® Energy Checker SDK at Home</a><br /><br /><a href="http://software.intel.com/en-us/blogs/2010/02/19/creating-a-simple-device-library-for-intel-energy-checker-sdk/?wapkw=(Energy+Checker)">Creating a Simple Device Library for Intel® Energy Checker SDK</a><br /><br /><a href="http://software.intel.com/en-us/blogs/2010/03/30/measuring-the-energy-consumed-by-a-command-using-the-intel-energy-checker-sdk/?wapkw=(Energy+Checker)">Measuring the energy consumed by a command using the Intel® Energy Checker SDK</a><br /><br />All of these resources allow a developer to get started in gathering helpful information.<br /><br />
<h2 class="sectionHeading">Optimizing Applications for Ultrabooks</h2>
Once a program has been instrumented to collect counter information and an energy collection plan is in place (either simulation or power meter), the setup is complete. The developer will then be able to gather information about the application’s energy usage profile and to incorporate optimizations to improve results.<br /><br />There are several areas of optimization the Ultrabook developer can select for improvements:<br /><br />
<div >Consider modifying the application to be aware of the power status and changing usage to reduce energy consumption when the system is on battery.<br /><br />Check the hardware and software system power management possibilities to choose a balanced power setting. This could be a recommended setting suggested in application documentation.<br /><br />Reduce power usage while the application is actively running or doing work. Compute intensive parts of the program will likely benefit from multi-threading and vectorization techniques.<br /><br />Reduce power usage while the application is idle. Being able to minimize the timer tick rate or setting up periodic actions to happen within the same wakeup period are examples of how to reduce idle application power usage.</div>
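<br /><br />The last suggestion, grouping periodic actions into the same wakeup period, can be sketched as follows. This is only an illustration under assumptions: align_to_wakeup is a hypothetical helper, not part of any Intel tool, and a real application would pass the aligned deadline to whatever timer facility the operating system provides.<br /><br />
<pre name="code" class="cpp">#include &lt;cassert&gt;
#include &lt;cstdint&gt;

// Round a requested deadline (in milliseconds) up to the next multiple of a
// shared wakeup period, so that several periodic tasks fire on the same wakeup.
inline std::uint64_t align_to_wakeup(std::uint64_t deadline_ms, std::uint64_t period_ms) {
    assert(period_ms &gt; 0);
    std::uint64_t remainder = deadline_ms % period_ms;
    return remainder == 0 ? deadline_ms : deadline_ms + (period_ms - remainder);
}
</pre>
<br />For example, with a 100 ms period, deadlines of 1030 ms and 1070 ms are both moved to 1100 ms, so the two tasks wake the processor once instead of twice.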
<br /><br />
<h2 class="sectionHeading">Summary</h2>
With the growth of Ultrabook devices, it will benefit program designers and developers to take a look at ways to save energy while providing a great user experience on an Ultrabook. Intel Energy Checker SDK can provide the means to identify the key areas of focus and confirm the positive results achieved after optimization. Long live Ultrabook!<br /><br />
<h2 class="sectionHeading">About the Author</h2>
<img src="http://software.intel.com/file/41170"  /> Judy Hartley is a Software Applications Engineer who has been working in the Software and Services Group since 2005. She has contributed to many software products and written about her experiences through blogs and whitepapers. Recently Judy has been working on Graphics and Power tools and training for future Intel processors.<br /><br  />
<hr />
<br /><sup>1</sup> Ultrabook is a trademark of Intel Corporation in the U.S. and/or other countries.<br /><br /><sup>2</sup> A Productivity Link is a term used by Intel Energy Checker to represent an arbitrary or logical collection of counters.<br /><br /><sup>3</sup> CSV is the acronym for Comma Separated Values.<br /><br /> ]]></description>
      <link>http://software.intel.com/en-us/articles/ultrabook-and-the-intel-energy-checker-sdk/</link>
      <pubDate>Tue, 24 Jan 2012 00:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/ultrabook-and-the-intel-energy-checker-sdk/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/ultrabook-and-the-intel-energy-checker-sdk/</guid>
      <category>Mobility</category>
      <category>What If Experimental Software</category>
      <category>Tools</category>
      <category>Intel Software Network communities</category>
      <category>Intel SW Partner program</category>
      <category>Code &amp; Downloads</category>
      <category>Power Efficiency</category>
      <category>Resources For Software Developers</category>
      <category>Ultrabook</category>
    </item>
    <item>
      <title>How to Automate Static Security Analysis with Intel(R) C++ Compiler for Linux*</title>
      <description><![CDATA[ <p>Automate the static security analysis check done by the Intel(R) C++ Compiler for Linux. Static security analysis is the process of finding errors and security weaknesses in software through detailed analysis of source code.<br /><br />An automated quality gate like this one can notably reduce code review effort and decreases the likelihood of bugs and security threats being found once the product is in production. <br /><br />To automate the static security analysis as a quality gate in any project, run the check without the graphical user interface, which requires human interaction.</p>
<p> </p>
<p>In the case of legacy projects, ask the developers to submit new code only if it reduces the number of findings.<br />In the case of coding from scratch, allow no findings before new code is uploaded to your repository.<br /><br />When the check is enabled (<strong>-diag-enable sc3</strong>) and the code is compiled, a new folder is created in which the findings are stored in a custom XML format.</p>
<blockquote>
<p>$ file rXsc/data.X/rXsc.pdr<br />rXsc/data.X/rXsc.pdr: XML document text</p>
</blockquote>
<br />The xmlstar* package can be used to list the findings and the associated location information (file, line and function). The package provides a command line tool to process XML documents.<br /><br /><a href="http://xmlstar.sourceforge.net/">http://xmlstar.sourceforge.net</a><br /><br />The following command can be used to verify that there are no findings before proceeding with the usual development cycle. <br /><br />
<blockquote>
<p>$ xml sel -t -m /diags/diag -v "concat(message/thread/stacktrace/loc/file, ':', message/thread/stacktrace/loc/line, ':', sc_verbose)" -n rXsc/data.0/rXsc.pdr <br />/home/$USER/work/$PROD/src/pool.c:157:pool.c(157): warning #12178: this value of "ret" isn't used in the program<br />/home/$USER/work/$PROD/src/pool.c:186:pool.c(186): error #12192: unreachable statement<br />/home/$USER/work/$PROD/src/pool.c:216:pool.c(216): warning #12135: procedure "pool_done" is never called</p>
</blockquote>
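<p>The gate decision itself can be sketched in a few lines of C++. This is a hypothetical illustration, not part of the compiler or of xmlstar: count_findings assumes that the command above prints one finding per non-empty line, and gate_passes assumes that the project keeps a baseline count from the previously accepted build.</p>
<pre name="code" class="cpp">#include &lt;cassert&gt;
#include &lt;sstream&gt;
#include &lt;string&gt;

// Count findings in the text printed by the "xml sel" command,
// assuming one finding per non-empty line.
inline int count_findings(const std::string&amp; report) {
    std::istringstream lines(report);
    std::string line;
    int findings = 0;
    while (std::getline(lines, line)) {
        if (!line.empty()) ++findings;
    }
    return findings;
}

// Gate rule from the text: new projects must have no findings; legacy
// projects must report fewer findings than the stored baseline.
inline bool gate_passes(int findings, bool legacy_project, int baseline) {
    assert(findings &gt;= 0 &amp;&amp; baseline &gt;= 0);
    return legacy_project ? findings &lt; baseline : findings == 0;
}
</pre>
<p>A build script could feed the command output to count_findings and fail the build when gate_passes returns false.</p>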
<p> </p> ]]></description>
      <link>http://software.intel.com/en-us/articles/how-to-automate-static-security-analysis-with-intelr-c-compiler-for-linux/</link>
      <pubDate>Fri, 13 Jan 2012 00:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/how-to-automate-static-security-analysis-with-intelr-c-compiler-for-linux/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/how-to-automate-static-security-analysis-with-intelr-c-compiler-for-linux/</guid>
      <category>Tools</category>
      <category>Intel Software Network communities</category>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Resources For Software Developers</category>
    </item>
    <item>
      <title>Using Intel® TBB in network applications: Network Router emulator</title>
      <description><![CDATA[ <p><b>Introduction</b></p>
<p>Intel® Threading Building Blocks (TBB) is used in a wide range of applications. When performance matters and a multi-core platform is available, TBB is a good addition to a C++ program. Network applications are usually heavily loaded: they process huge amounts of traffic under tight processing time constraints. This article shows how TBB can be used in network packet processing software to improve its throughput and processing time.</p>
<p>For a sample project I've created a simplified Network Router emulator. A network router is a device that routes and transmits IP (Internet Protocol) packets in a local area network (LAN). It connects several PCs and provides them with access to the Internet and to the internal network. The device has several internal network interfaces and one external interface.</p>
<p>The sample project emulates Network Router logic. It provides the following functionality:</p>
<ul>
<li>Input packets from file - the application is just a model, so there is no need for a real connection to a network interface. Reading from a file emulates reading from a network interface.</li>
<li>NAT - Network Address Translation. The router has only one external IP address, but packets should be delivered to several internal devices behind the router. NAT maps ports and IP addresses from external to internal and vice versa.</li>
<li>IP routing - delivering packets to the appropriate router NIC (Network Interface Controller) according to the destination IP.</li>
<li>Bandwidth management - some traffic is real-time, and it's critical to deliver these packets as quickly as possible (e.g. voice over IP). VoIP protocols carry telephone conversations, and delays would degrade call quality. The router can prioritize these critical packets so that they are processed sooner.</li>
</ul>
<p>I've created two versions of Network Router: serial and parallel. The latter uses Intel® Threading Building Blocks. I'll describe how TBB was used in the project and will provide performance results of the program parallelization.</p>
<p><b>Network Router implementation</b></p>
<p>The network router emulator gets packets from a file and processes them. Packet processing includes bandwidth management, NAT translation and IP routing. Packets are processed by several program modules, ordered sequentially as in an assembly line. This is a common structure for packet processing applications. The input file is a text file in which each line represents one IP packet. A separate thread reads packets in big chunks.</p>
<p>Intel® TBB has the tbb::pipeline class, which provides a high-level framework for this kind of program structure. It has filters that process packets at each stage. Each packet goes through the pipeline and is processed step by step by its filters. One packet is processed sequentially: by the first filter, then the second, then the third, etc. However, the processing of one packet is independent of the others, so filters can operate in parallel.</p>
<p ><br />Network Router scheme<br /><img height="256" width="531" src="http://software.intel.com/file/36534"  /></p>
<p><br /><br />Main function:</p>
<pre name="code" class="cpp">#include &lt;iostream&gt; 
#include &lt;sstream&gt;
#include &lt;fstream&gt;
#include &lt;vector&gt;
#include &lt;algorithm&gt;
#include &lt;ittnotify.h&gt;
#include &lt;tbb/pipeline.h&gt;
#include &lt;tbb/concurrent_hash_map.h&gt;
#include &lt;tbb/atomic.h&gt;
#include &lt;tbb/concurrent_queue.h&gt;
#include &lt;tbb/compat/thread&gt;
// Redirects calls to "new" and "delete" to TBB thread safe allocators
#include &lt;tbb/tbbmalloc_proxy.h&gt;

using namespace tbb;
using namespace std;

class bandwidth_manager_t;
class network_adress_translator_t;
class ip_router_t;
class compute_t;
typedef vector&lt;packet_trace_t&gt; packet_chunk_t;

int chunk_size = 1600;
concurrent_queue&lt;packet_chunk_t&gt; chunk_queue;
atomic&lt;bool&gt; stop_flag;

int main(int argc, char* argv[])
{
	ip_addr_t external_ip;
	nic_t external_nic;	
	nat_table_t nat_table;	// NAT table   
	ip_config_t ip_config;	// Router network configuration 					
	int ntokens = 24;	
	
	get_args (argc, argv);	
    ifstream config_file (config_file_name);

    if (!config_file) {
        cerr &lt;&lt; "Cannot open config file " &lt;&lt; config_file_name &lt;&lt; "\n";
        exit (1);
    }		
	if (! initialize_router (external_ip, external_nic, 
                            ip_config, config_file)) exit (1);	
	
	thread input_thread(input_function);

	// packet processing objects
	bandwidth_manager_t bwm;	
	network_adress_translator_t nat(external_ip, external_nic, nat_table);
	ip_router_t ip_router(external_ip, external_nic, ip_config);		

__itt_resume();
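	// Shared by concurrent invocations of the parallel input filter, so it must be atomic
	atomic&lt;bool&gt; stop_pipeline;
	stop_pipeline = false;	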
	
	parallel_pipeline(ntokens,		
		make_filter&lt;void, packet_chunk_t*&gt;(		// Input filter
			filter::parallel,
			[&amp;](flow_control&amp; fc)-&gt; packet_chunk_t*{				
				
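				if (stop_pipeline){					
					fc.stop();
					// The value returned after stop() is ignored, so return early
					// instead of allocating a chunk that would never be freed
					return NULL;
				}				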
				packet_chunk_t* packet_chunk = new packet_chunk_t(chunk_size);
					
				if(!chunk_queue.try_pop(*packet_chunk)){				
					if (stop_flag) {
						stop_pipeline = true;
					}
				}				
				return packet_chunk;
			}
		)&amp;	// Bandwidth manager filter
		make_filter&lt;packet_chunk_t*, packet_chunk_t*&gt;(		
			filter::parallel,
			[&amp;](packet_chunk_t* packet_chunk)-&gt; packet_chunk_t*{								
				
				for(int i=0; i&lt;packet_chunk-&gt;size(); i++){
					// Bind a reference so that the priority assigned below is stored in the chunk and seen by the sort
					packet_trace_t&amp; packet = (*packet_chunk)[i];				
					
					if (packet.nic == empty){
						break;
					}
					else{
						bwm.prioritize(packet);									
						compute_t compute;
						compute.work();						
					}										
				}
				std::sort(packet_chunk-&gt;begin(), packet_chunk-&gt;end(),
							packet_comparator);
				return packet_chunk;	
			}
		)&amp;	// NAT filter
		make_filter&lt;packet_chunk_t*, packet_chunk_t*&gt;(	
			filter::parallel,
			[&amp;](packet_chunk_t* packet_chunk)-&gt; packet_chunk_t*{

				for(int i=0; i&lt;packet_chunk-&gt;size(); i++){	
					packet_trace_t packet;

					packet = (*packet_chunk)[i];					
					if (packet.nic == empty)
						break;
					else{				
						nat.map(packet);
						compute_t compute;
						compute.work();	
					}
				}				
				return packet_chunk;
			}
		)&amp;	// IP routing filter
		make_filter&lt;packet_chunk_t*, packet_chunk_t*&gt;(		
			filter::parallel,
			[&amp;](packet_chunk_t* packet_chunk)-&gt; packet_chunk_t*{			

				for(int i=0; i&lt;packet_chunk-&gt;size(); i++){						
					packet_trace_t packet;
					packet = (*packet_chunk)[i];
					
					if (packet.nic == empty)
						break;
					else{				
						ip_router.route(packet);
						compute_t compute;
						compute.work();	
					}
				}				
				return packet_chunk;
			}
		)&amp;	// Output filter
		make_filter&lt;packet_chunk_t*, void&gt;(	
			filter::parallel,
			[&amp;](packet_chunk_t* packet_chunk){														
				
				for(int i=0; i&lt;packet_chunk-&gt;size(); i++){						
					packet_trace_t packet;
					packet = (*packet_chunk)[i];	
					compute_t compute;
					compute.work();	

					if (packet.nic == empty)
						break;
				}	
				// No output is required , just drop packets
				delete packet_chunk; 
			}
		)
	);	
__itt_pause();

	cout &lt;&lt; "\nAll packets are processed\n\n";		
	return 0;
}</pre>
<br />
<p>The first part is preparation: creating objects, reading the command line, opening files and initializing. The configuration file contains information about the router interfaces. The objects bwm, nat and ip_router are the packet processing objects. They use the containers nat_table and ip_config to store the NAT and IP tables.</p>
<p>The core component of the Network Router is the pipeline. It is implemented using the tbb::parallel_pipeline() function, which takes the number of tokens and a list of filters as arguments. The element of work passed through the pipeline is a pointer to packet_chunk_t. The ntokens parameter controls the maximum number of concurrently processed elements. It is set to 24 because the project was tested on a 24-core machine, and a larger value would have no further effect.</p>
<p>Pipeline filters do the actual work, which in this application is packet processing. Filters can be serial or parallel. The mode is controlled by a filter parameter, which is filter::parallel for all filters here. This means that every filter can process several elements at the same time.</p>
<p>The first filter extracts a packet chunk from chunk_queue and passes it to the second filter. The second filter performs bandwidth management on each packet in the chunk: the bwm module assigns priorities to packets according to protocol, and then the packets in the chunk are sorted by priority. This allows critical traffic to be processed as early as possible. Subsequent filters perform NAT mapping and IP routing. The last filter is the output filter, but for simplicity no real output is done; packets are just dropped.</p>
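<p>The packet_comparator passed to std::sort is not shown in this article. A minimal sketch of such a comparator, assuming that a larger priority value means a more urgent packet, could look like this. The names sketch_packet_t, by_priority_desc and sort_chunk are hypothetical, and std::stable_sort is used here (instead of the std::sort in the listing) only to keep arrival order among packets of equal priority.</p>
<pre name="code" class="cpp">#include &lt;algorithm&gt;
#include &lt;vector&gt;

// Simplified stand-in for packet_trace_t: only the fields needed for sorting.
struct sketch_packet_t {
    int id;         // hypothetical identifier, for illustration only
    int priority;   // assumed: larger value means more urgent
};

// Orders packets so that higher-priority packets come first.
inline bool by_priority_desc(const sketch_packet_t&amp; a, const sketch_packet_t&amp; b) {
    return a.priority &gt; b.priority;
}

// Sort one chunk; stable_sort keeps arrival order among equal priorities.
inline void sort_chunk(std::vector&lt;sketch_packet_t&gt;&amp; chunk) {
    std::stable_sort(chunk.begin(), chunk.end(), by_priority_desc);
}
</pre>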
<p>A packet chunk is used as the pipeline token because it is big enough to amortize scheduling overhead. If single packets were passed through the pipeline, there would be too many transitions between threads, and the overhead would outweigh the benefit.</p>
<p>The __itt_resume() and __itt_pause() functions belong to the API of Intel® VTune<sup>TM</sup> Amplifier XE, which was used for performance measurements. These API functions mark the beginning and the end of the region of interest.</p>
<p>The compute object of type compute_t generates CPU workload. It performs additional computations to simulate the computing done in real systems. The application doesn't perform the entire job needed for processing and routing packets in real-life network equipment; it is just a model of a real application, so on its own it would not use enough CPU. The compute_t::work() method runs an "N Queens" algorithm.</p>
<p>Opening and reading the input file is the job of a separate thread. It is created using the std::thread class, which is part of the upcoming C++11 standard and is provided here by the tbb/compat/thread header.</p>
<p><b>Serial implementation</b></p>
<p>To measure the effect of parallelization, a serial version was created. It has a similar structure. The only difference is that parallel_pipeline is replaced with a simple while loop.</p>
<p >Network router serial scheme<br /><br /><img height="248" width="459" src="http://software.intel.com/file/36533" /></p>
<p>While loop (replacing parallel_pipeline):</p>
<pre name="code" class="cpp">__itt_resume();
	bool stop = false;

	while (!stop){
		packet_chunk_t packet_chunk(chunk_size);
		
		if(!chunk_queue.try_pop(packet_chunk)){				
			if (stop_flag) {
				stop = true;
			}
		}		
		
		for(int i=0; i &lt; packet_chunk.size(); i++){
			packet_trace_t&amp; packet = packet_chunk[i];	// reference: keep the assigned priority in the chunk for the sort
			bwm.prioritize(packet);	
			compute_t compute;
			compute.work();									
		}
		std::sort(packet_chunk.begin(), packet_chunk.end(), packet_comparator);
		for(int i=0; i &lt; packet_chunk.size(); i++){
			packet_trace_t packet = packet_chunk[i];				
			nat.map(packet);
			compute_t compute;
			compute.work();		
			ip_router.route(packet);				
			compute.work();							
			compute.work();								
		}
	}
__itt_pause();</pre>
<p><br />There are four calls to compute.work() per packet, the same number as in the TBB version. This is the most CPU-time-consuming function, so it's fair to have the same number of calls to it.</p>
<p><b>Data structures</b></p>
<p>Input file has the following format:</p>
<p class="code">eth3 104.44.44.10 10.230.30.03 4003 5003 ftp<br />eth3 104.44.44.10 10.230.30.03 4003 5003 rtp<br />eth0 134.77.77.30 104.44.44.10 2004 4003 sip<br />eth3 104.44.44.10 10.230.30.03 4003 5003 http</p>
<p>Each line represents one packet. It contains the network interface, the source and destination IPs and ports, and the protocol. A packet is stored in the packet_trace_t structure:</p>
<pre name="code" class="cpp">typedef struct {
	nic_t nic;			// network interface where packet arrived
	ip_addr_t destIp;		// destination IP
	ip_addr_t srcIp;		// source IP
	port_t destPort;		// destination port
	port_t srcPort;		// source port 
	protocol_t protocol;	// protocol type (rtp, ftp, http, sip, etc)
	int priority;			// packet priority
} packet_trace_t;
</pre>
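<p>The input thread reads packets with the stream extraction operator, whose implementation is not listed in this article. A minimal sketch of such an operator is shown below for a simplified structure. The type parsed_packet_t is hypothetical, and the field order (interface, source IP, destination IP, source port, destination port, protocol) is an assumption based on the sample lines above.</p>
<pre name="code" class="cpp">#include &lt;istream&gt;
#include &lt;string&gt;

// Simplified stand-in for packet_trace_t that stores fields as text or int.
struct parsed_packet_t {
    std::string nic;
    std::string srcIp;
    std::string destIp;
    int srcPort;
    int destPort;
    std::string protocol;
};

// Reads one whitespace-separated packet line.
// Assumed field order: interface, source IP, destination IP,
// source port, destination port, protocol.
inline std::istream&amp; operator&gt;&gt;(std::istream&amp; in, parsed_packet_t&amp; p) {
    return in &gt;&gt; p.nic &gt;&gt; p.srcIp &gt;&gt; p.destIp
              &gt;&gt; p.srcPort &gt;&gt; p.destPort &gt;&gt; p.protocol;
}
</pre>
<p>If a line is malformed, the stream enters the fail state, which the caller can check before using the packet.</p>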
<br />The NAT table and IP configuration table are stored in tbb::concurrent_hash_map. A packet chunk is stored in std::vector, and the chunk queue is of type tbb::concurrent_queue:<br /><br />
<pre name="code" class="cpp">typedef concurrent_hash_map&lt;port_t, address*, string_comparator&gt; nat_table_t; 
typedef concurrent_hash_map&lt;ip_addr_t, nic_t, string_comparator&gt; ip_config_t; 
typedef vector&lt;packet_trace_t&gt; packet_chunk_t;
concurrent_queue&lt;packet_chunk_t&gt; chunk_queue;
</pre>
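<p>To illustrate how a NAT table of this shape is used, here is a minimal lookup sketch. It is an assumption-laden simplification rather than the project's code: std::map replaces tbb::concurrent_hash_map for brevity (so it is not thread-safe), and internal_address_t, sketch_nat_table_t and translate_inbound are hypothetical names.</p>
<pre name="code" class="cpp">#include &lt;map&gt;
#include &lt;string&gt;

// Internal destination of a translated packet (stand-in for the address type).
struct internal_address_t {
    std::string ip;
    int port;
};

// External port -&gt; internal address. std::map is used here for brevity;
// the article stores this table in tbb::concurrent_hash_map.
typedef std::map&lt;int, internal_address_t&gt; sketch_nat_table_t;

// Rewrites the destination of an inbound packet.
// Returns false and leaves the arguments untouched when no mapping exists.
inline bool translate_inbound(const sketch_nat_table_t&amp; table, int external_port,
                              std::string&amp; dest_ip, int&amp; dest_port) {
    sketch_nat_table_t::const_iterator it = table.find(external_port);
    if (it == table.end()) return false;
    dest_ip = it-&gt;second.ip;
    dest_port = it-&gt;second.port;
    return true;
}
</pre>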
<br />The input file is read by a separate thread that executes input_function. The input_function opens the file and reads it in chunks, which are pushed to the chunk queue. TBB containers are thread-safe, so the main thread can read from the chunk queue at the same time without any additional manual synchronization. Input thread function:<br /><br />
<pre name="code" class="cpp">void input_function(){	
    ifstream in_file (in_file_name);
    if (!in_file) {
        cerr &lt;&lt; "Cannot open input file " &lt;&lt; in_file_name &lt;&lt; "\n";
        exit (1);
    }
	stop_flag = false;	
	
	while(in_file.good()){			
		packet_chunk_t packet_chunk(chunk_size);
								
		for(int i=0; i&lt;chunk_size; i++){
			packet_trace_t packet;
			in_file &gt;&gt; packet;					
			packet_chunk[i] = packet;			
		}
		chunk_queue.push(packet_chunk);			
	}
	stop_flag = true;
}</pre>
<br />
<p><b>Performance measurements</b></p>
<p>The goals of this project were to achieve good performance and scalability by using TBB. The following setup was used for the measurements:</p>
<p>CPU: 4 Intel® Xeon X7460 processors, 2.66 GHz, 24 physical cores total <br />RAM: 16 GB <br />OS: Microsoft Windows Server® Enterprise 2008 SP2 <br />Workload: input file: 113405 packets (5.1 MB) <br />Measurement tool: Intel® VTune<sup>TM</sup> Amplifier XE 2011 <br />Analysis type: Concurrency with default settings</p>
Two tests were performed: one for the serial version and one for the parallel version. Below are the summaries from the two analyses, with the serial version on the left and the TBB version on the right:<br /><br />
<p ><img height="326" width="599" src="http://software.intel.com/file/36538" /></p>
<br />
<p>The CPU time, which is the sum of the CPU times of all cores in the system, is similar in both versions. The elapsed time, which is the wall-clock time the application takes for processing, is very different. In the serial version it is close to the overall CPU time. In the TBB version it is 19 times smaller, so the application ran 19 times faster.</p>
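<p>A speedup figure like this can be put in relation to the core count. The sketch below shows the arithmetic; the function names are hypothetical and the elapsed times in the usage note are made-up values chosen only to reproduce the 19x ratio, not the measured ones.</p>
<pre name="code" class="cpp">#include &lt;cassert&gt;

// Speedup = serial elapsed time / parallel elapsed time.
inline double speedup(double serial_seconds, double parallel_seconds) {
    assert(parallel_seconds &gt; 0.0);
    return serial_seconds / parallel_seconds;
}

// Parallel efficiency = speedup / number of cores (1.0 would be ideal scaling).
inline double parallel_efficiency(double speedup_value, int cores) {
    assert(cores &gt; 0);
    return speedup_value / cores;
}
</pre>
<p>For instance, speedup(190.0, 10.0) is 19, and parallel_efficiency(19.0, 24) is about 0.79, meaning that roughly 79% of the ideal 24-fold speedup was reached.</p>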
CPU usage for serial version:<br /><br />
<p ><img height="265" width="766" src="http://software.intel.com/file/36535" /></p>
<br />CPU usage for TBB version:<br /><br />
<p ><img height="258" width="770" src="http://software.intel.com/file/36536" /></p>
<br /><br />
<p>The average number of utilized cores for the TBB version is 20.5, and for most of the processing time all 24 cores were in use. This demonstrates that the application scales well and can use almost all cores of a multi-core system.</p>
The bottom-up view of the serial application shows that almost all the time is spent in the compute module that simulates real workload:<br /><br />
<p ><img height="298" width="856" src="http://software.intel.com/file/36537" /></p>
<br /><br />In the TBB version the picture is very similar: the main hotspot is the same compute_t::do_work method. However, it is mostly marked green, which indicates good CPU utilization. There are also more functions in the list because of the TBB constructs used:<br /><br />
<p ><img height="424" width="770" src="http://software.intel.com/file/36540" /></p>
<br /><br />
<p>These results show good performance for the TBB-based application. However, keep in mind the following conditions:</p>
<p>1) The Amplifier XE API functions __itt_resume() and __itt_pause() were used to bound the measured region. The results show the performance of tbb::parallel_pipeline for the TBB version and of the while loop for the serial version. Measurements of the whole application would give slightly different results.</p>
<p>2) A simulated job was used to load the CPU. The compute_t class solves the "N queens" problem; real processing is different. If there were not enough work for the CPU, file input would consume relatively more time. So in a real application, scalability and performance gain may be worse.</p>
<p><strong>Conclusion</strong></p>
This sample project shows that TBB can be used to build network packet processing applications and that tbb::pipeline is applicable to them. These approaches can be applied in IP routing switches, telecommunication servers (VoIP telephony, video conferencing), various gateways and proxies, etc. Like any heavily loaded application, network software can benefit from multi-threading, and Intel® Threading Building Blocks is a simple and effective way to manage parallelism in your application.
<div><br /></div>
<div>The full project source code:</div>
<div><a target="_blank" href="http://software.intel.com/file/36623">NetworkRouter.cpp</a></div> ]]></description>
      <link>http://software.intel.com/en-us/articles/network-router-emulator/</link>
      <pubDate>Mon, 23 May 2011 13:00:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/network-router-emulator/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/network-router-emulator/</guid>
      <category>Parallel Programming</category>
      <category>Tools</category>
      <category>Intel Software Network communities</category>
    </item>
    <item>
      <title>Threading Challenge 2009 - Problem 6:  Line Segment Intersection Entries</title>
      <description><![CDATA[ <img src="http://software.intel.com/file/15359" /><br /><br />Below you will find many of the entries received for our <strong>6th problem - Line Segment Intersection.</strong>  Please feel free to review them and join the discussion in the <a href="http://software.intel.com/en-us/forums/line-segments/"><strong><span >forum</span></strong></a> dedicated to this problem.<br /><br /><br /><br /><span class="sectionHeading">Winning Submission:</span><strong><span class="sectionHeading">  </span><br /><br /><br />*BradleyKuszmaul:  </strong><a href="http://software.intel.com/file/21625">Code</a> / <a href="http://software.intel.com/file/21626">Write-up</a><br /><br /><br /><br /><br /><span class="sectionHeading">Other Submissions (more to be added soon): <br /></span><br /><br /><strong>*akki:</strong>  <a href="http://software.intel.com/file/21627">Code</a> / <a href="http://software.intel.com/file/21628">Write-up</a><br /><br /><strong>*denghui0815:</strong>  <a>Code</a> / <a href="http://software.intel.com/file/22472">Write-up</a> (Mandarin)<br /><br /><strong>*Dmitriy Vyukov:</strong>  <a href="http://software.intel.com/file/22473">Code</a> / <a href="http://software.intel.com/file/22470">Write-up</a><br /><br /><strong>*mikhailsemenov:</strong>  <a href="http://software.intel.com/file/22474">Code</a> / <a href="http://software.intel.com/file/22471">Write-up</a> ]]></description>
      <link>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-6-line-segment-intersection-entries/</link>
      <pubDate>Thu, 24 Sep 2009 00:00:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-6-line-segment-intersection-entries/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/threading-challenge-2009-problem-6-line-segment-intersection-entries/</guid>
      <category>Parallel Programming</category>
      <category>Intel Software Network communities</category>
    </item>
    <item>
      <title>Threading Challenge 2009 - Problem 5:  Knapsack Entries</title>
      <description><![CDATA[ <img src="http://software.intel.com/file/15359" /><br /><br />Below you will find many of the entries received for our <strong>5th problem - Knapsack.</strong>  Please feel free to review them and join the discussion in the <a href="http://software.intel.com/en-us/forums/knapsack-problem/"><strong><span >forum</span></strong></a> dedicated to this problem.<br /><br /><br /><br /><span class="sectionHeading">Winning Submission:</span><strong><span class="sectionHeading"> </span> <br /><br /><br /><span class="sectionHeadingText">*matteocilk:  </span></strong><a href="http://software.intel.com/file/21779" class="sectionHeadingText">Code / </a><a href="http://software.intel.com/file/21780" class="sectionHeadingText">Write-up</a><br /><br /><br /><br /><span class="sectionHeading">Other Submissions: </span><br /><br /><br /><strong>*denghui0815:</strong>  <a href="http://software.intel.com/file/22468">Code</a> / <a href="http://software.intel.com/file/22465">Write-up</a> (Mandarin)<br /><br /><strong>*haojn:</strong>  <a href="http://software.intel.com/file/22467">Code</a> / <a href="http://software.intel.com/file/22464">Write-up</a><br /><br /><strong>*Dmitriy Vyukov:</strong>  <a href="http://software.intel.com/file/22469">Code</a> / <a href="http://software.intel.com/file/22466">Write-up</a> ]]></description>
      <link>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-5-knapsack-entries/</link>
      <pubDate>Thu, 24 Sep 2009 00:00:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-5-knapsack-entries/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/threading-challenge-2009-problem-5-knapsack-entries/</guid>
      <category>Parallel Programming</category>
      <category>Intel Software Network communities</category>
    </item>
    <item>
      <title>Threading Challenge 2009 - Problem 4:  String Matching Entries</title>
      <description><![CDATA[ <img src="http://software.intel.com/file/15359" /><br /><br />Below you will find many of the entries received for our <strong>4th problem - String Matching.</strong>  Please feel free to review them and join the discussion in the <a href="http://software.intel.com/en-us/forums/string-matching/"><strong><span >forum</span></strong></a> dedicated to this problem.<br /><br /><br /><br /><span class="sectionHeading">Winning Submission:</span><strong>  <br /><br /><br /><span class="sectionHeadingText">*BradleyKuszmaul:  </span></strong><span class="sectionHeadingText"> </span><a href="http://software.intel.com/file/21632" class="sectionHeadingText">Code</a><span class="sectionHeadingText"> / </span><a href="http://software.intel.com/file/21631" class="sectionHeadingText">Write-up</a><br /><br /><br /><br /><span class="sectionBodyText"><span class="sectionHeading">Other Submissions:<br /></span><br /><br /><strong>*akki:</strong>  <a href="http://software.intel.com/file/21630">Code</a> / <a href="http://software.intel.com/file/21629">Write-up</a><br /><br /><strong>*jne100:</strong>  <a href="http://software.intel.com/file/22461">Code</a> / <a href="http://software.intel.com/file/22460">Write-up</a><br /><br /><strong>*haojn:</strong>  <a href="http://software.intel.com/file/22462">Code</a> / <a href="http://software.intel.com/file/22463">Write-up</a><br /><br /><strong>*denghui0815:</strong>  Code / <a href="http://software.intel.com/file/22459">Write-up</a><br /><br /><strong>*Sergii Biloshytskyi:</strong>  Code / Write-up<br /><br /><strong>*javadude:</strong>  Code / Write-up</span> ]]></description>
      <link>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-4-string-matching-entries/</link>
      <pubDate>Thu, 24 Sep 2009 00:00:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-4-string-matching-entries/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/threading-challenge-2009-problem-4-string-matching-entries/</guid>
      <category>Parallel Programming</category>
      <category>Intel Software Network communities</category>
    </item>
    <item>
      <title>Threading Challenge 2009 - Problem 3:  Searching Entries</title>
      <description><![CDATA[ <img src="http://software.intel.com/file/15359" /><br /><br /><span class="sectionBodyText">Below you will find many of the entries received for our <strong>3rd problem - Searching.</strong>  Please feel free to review them and join the discussion in the </span><a href="http://software.intel.com/en-us/forums/searching/"><strong class="sectionBodyText"><span >forum</span></strong></a><span class="sectionBodyText"> dedicated to this problem.</span><br /><br /><br /><span class="sectionHeading">Winning Submission:</span><strong> <br /><br /><br /><span class="sectionHeadingText">*denghui0185:  </span></strong><a href="http://software.intel.com/file/22037" class="sectionHeadingText">Code </a><span class="sectionHeadingText">/ </span><a href="http://software.intel.com/file/21783" class="sectionHeadingText">Write-up Mandarin</a><br /><br /><br /><span class="sectionBodyText"><span class="sectionHeading">Other Submissions:<br /></span><br /><br /><strong>*akki: </strong><a href="http://software.intel.com/file/21781">Code</a> / <a href="http://software.intel.com/file/21782">Write-up</a><br /><br />*guzheng2000:  Code / Write-up<br /><br />*andreyryabov:  Code / Write-up<br /><br />*pfrey:  Code / Write-up<br /><br />*calebe:  Code / Write-up<br /><br />*hpc_2002:  Code / Write-up<br /><br />*Dmitriy Vyukov:  Code / Write-up<br /><br />*jne100:  Code / Write-up<br /><br />*dweeberlyloom:  Code / Write-up<br /><br />*iarchitect:  Code / Write-up</span> ]]></description>
      <link>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-3-searching-entries/</link>
      <pubDate>Tue, 22 Sep 2009 00:00:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-3-searching-entries/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/threading-challenge-2009-problem-3-searching-entries/</guid>
      <category>Parallel Programming</category>
      <category>Intel Software Network communities</category>
    </item>
    <item>
      <title>Threading Challenge 2009 - Problem 1: Radix Sort Entries</title>
      <description><![CDATA[ <p><img src="http://software.intel.com/file/15359" /><br /><br />Below you will find many of the entries received for our <strong>1st problem - Radix Sort</strong>.  Please feel free to review them and join the discussion in the <a href="http://software.intel.com/en-us/forums/radix-sort/"><strong><span >forum</span></strong></a> dedicated to this problem.<br /><br /><br /><span class="sectionHeading">Winning Submission:</span>  <br /><br /><strong class="sectionHeadingText">*denghui0185:  </strong><a href="http://software.intel.com/file/22036" class="sectionHeadingText">Code</a><span class="sectionHeadingText"> / </span><a href="http://software.intel.com/file/21777" class="sectionHeadingText">English Write-up</a><span class="sectionHeadingText"> /</span><a href="http://software.intel.com/file/21776" class="sectionHeadingText"> Mandarin Write-up</a><br /><br /><br /><span class="sectionHeading">Other Submissions: </span><br /><br /><strong>*akki:  </strong><a href="http://software.intel.com/file/21775">Code</a> / <a href="http://software.intel.com/file/21774">Write-up</a><br /><br /><strong>*ikipou:</strong>  <a href="http://software.intel.com/file/22388">Code</a> / <a href="http://software.intel.com/file/22389">Write-up</a><br /><br /><strong>*jne100:</strong>  <a href="http://software.intel.com/file/22390">Code</a> / <a href="http://software.intel.com/file/22391">Write-up</a><br /><br />*andreyryabov:  <a href='http://software.intel.com/file/22395'>Code</a> / <a href='http://software.intel.com/file/22396'>Write-up</a><br /><br />*Dmitriy Vyukov:  <a href='http://software.intel.com/file/22397'>Code</a> / <a href='http://software.intel.com/file/22398'>Write-up</a><br /><br />*pfrey:  Code / Write-up<br /><br />*licstar:  Code / Write-up<br /><br />*emacswu:  Code / Write-up<br /><br />*dweeberlyloom:  Code / Write-up<br /><br />*hoajn:  Code / Write-up<br /><br />*nickbes:  Code / Write-up<br /><br />*adrcto:  Code / Write-up<br /><br />*m_kirov:  Code / Write-up</p> ]]></description>
      <link>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-1-radix-sort-entries/</link>
      <pubDate>Mon, 21 Sep 2009 11:30:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-1-radix-sort-entries/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/threading-challenge-2009-problem-1-radix-sort-entries/</guid>
      <category>Parallel Programming</category>
      <category>Intel Software Network communities</category>
    </item>
  </channel></rss>
