<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated on Fri, 25 May 2012 08:59:47 -0700 -->
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <atom:link href="http://software.intel.com/en-us/articles/partner-program/type/technical-article/feed/" rel="self" type="application/rss+xml" />
    <title>Intel Software Network articles Feed</title>
    <link>http://software.intel.com/en-us/articles/partner-program/type/technical-article/</link>
    <description></description>
    <language>en-us</language>
    <item>
      <title>Designing Application Software for Energy-efficient Performance</title>
      <description><![CDATA[ <b>By Nancy Nicolaisen</b><br /><br />Personal computers are designed to be in processor idle 75% of the time but in fact might more realistically be estimated to be idle in excess of 90% of the time because of the effects of imposed waits for user input, server response, and resource availability. An idle processor is available to sleep and, while in a sleep state, can save most of the energy it would otherwise consume from actively executing. At least on the client side, if all of the theoretical energy-saving potential of processor sleep states were realized, end user energy use could shrink by fantastical amounts with no apparent sacrifice of functionality or productivity.<br /><br />This, however, is not today’s status quo. For various reasons, users sometimes intentionally configure client systems not to sleep, and it is not uncommon for application software to inadvertently (or intentionally) prevent CPUs from entering sleep states. Application developers can’t do anything about the former. However, there is a lot they can do to make sure the sophisticated laptop and tablet solutions they design, code, and deploy are energy efficient. In addition, if targeting a thin client, developers have to be aware of the back-end servers and how they could affect the operation and power envelope for that thin client.<br /><br />
<h2 class="sectionHeading">Follow Best Practices for Creating Energy-efficient Client Device Applications</h2>
From an application developer’s point of view, a key tactic for achieving energy-efficient software performance is effective handling of sleep state transitions. A few general rules can go a long way toward accomplishing this goal—for example:<br /><br />
<ul>
<li>Design applications that allow screens to darken and disks to idle by avoiding behaviors that unnecessarily prevent systems from remaining in a sleep state. Moving from a sleep state to full activity itself costs energy, so design algorithms that do not wake idle processors unnecessarily.</li>
<li>Where possible, eliminate code that keeps processors from transitioning to sleep states.</li>
<li>Employ development frameworks that allow an app to be respectful of sleep status and resilient in handling nonessential workloads.</li>
<li>So that users have no reason to disable sleep, make apps more context aware and ensure that systems don’t enter sleep states while users are passively interacting with them (e.g., watching or listening).</li>
<li>Develop power-aware strategies for handling timers and looping. Investigate the use of compiler switches that unroll deterministic loops, and make other adjustments that reduce the overall number of instructions executed (e.g., remove polling).</li>
<li>Use energy-aware tools to identify patterns of processor use in your apps.</li>
</ul>
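The advice on timers and polling can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not code from any Intel tool: the function names (<code>wait_by_polling</code>, <code>wait_by_event</code>, <code>run_demo</code>) and the 5 ms polling interval are hypothetical. It contrasts a loop that wakes on a timer to check a flag with a thread that blocks until it is signalled.<br /><br />

```python
import threading
import time


def wait_by_polling(state, interval_s=0.005):
    """Check a flag on a timer; every check is a wakeup that interrupts idle."""
    wakeups = 0
    while not state["ready"]:
        wakeups += 1
        time.sleep(interval_s)
    return wakeups


def wait_by_event(event):
    """Block on an event; the thread stays asleep until it is signalled."""
    event.wait()
    return 1  # a single wakeup, caused by the signal itself


def run_demo(delay_s=0.1):
    """Signal both waiters after delay_s and report how often each one woke."""
    state = {"ready": False}
    event = threading.Event()

    def producer():
        # Hypothetical stand-in for user input, a server reply, or I/O completion.
        time.sleep(delay_s)
        state["ready"] = True
        event.set()

    worker = threading.Thread(target=producer)
    worker.start()
    polled = wait_by_polling(state)
    signalled = wait_by_event(event)
    worker.join()
    return polled, signalled


if __name__ == "__main__":
    polled, signalled = run_demo()
    print("polling wakeups:", polled, "event wakeups:", signalled)
```

The polling count grows with the length of the wait, while the event-driven waiter wakes once regardless of how long it waits. That difference is what lets an idle processor stay in a sleep state.<br /><br />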
A well-designed app should have little impact on overall energy consumption when it is open but idle, as Figure 1 shows.<br /><br />
<p ><img src="http://software.intel.com/file/43251" /></p>
<br /><b>Figure 1.</b> A key energy management principle: Idle apps should have negligible impact on power use.<br /><br />
<h2 class="sectionHeading">Tools and Techniques for Evaluating and Optimizing Application Energy Consumption Performance</h2>
Unlike the targets of many other kinds of optimization, poor application energy performance produces no symptoms that developers can see or infer. To make real progress toward improved client-side application energy efficiency, you need to employ power performance optimization tools and techniques. Figure 2 shows the results for 15 applications in a study. The chart shows two things: the average power over baseline (in Watts) and the percentage impact of that power draw over baseline. For example, Instant Messenger-4 running at idle caused the platform power draw to increase by 1.7 Watts, or 21 percent higher than system idle without the application running. This idle power draw affected battery life by approximately 4 hours. The conclusion from this study is that applications within the same category can exhibit different idle power behaviors.<br /><br />
<p ><img src="http://software.intel.com/file/43252" /></p>
<br /><b>Figure 2.</b> Analyzing app power performance behaviors “in the wild” can be complex.<br /><br />Imbuing client-side applications with power awareness isn’t difficult, but it is something that must be done with deliberate intention. For app developers, this is a matter of finding and using frameworks and instrumentation that help validate good designs and discover the flaws in program logic that need remediation.<br /><br /><b><i>Intel® Energy Checker</i></b><br /><br />The Intel® Energy Checker software development kit (SDK) provides developers with a way to analyze how applications consume power. This information is key to optimization, because gross power usage is far from being the whole story. Real efficiency demands an understanding of exactly how an app’s power consumption relates to its work output. For example, power sinks can be the result of poorly integrated legacy code, duplication of effort in libraries and components, frivolous output activities, and the like.<br /><br />Finding app behaviors that waste energy can be as challenging as finding memory leaks and other sublethal application flaws. Symptoms can be so subtle that it’s impossible to diagnose problems without instrumented code and a controlled, self-documenting test environment. Fortunately, this is precisely what Intel® Energy Checker provides. This SDK allows developers to:<br /><br />
<ul>
<li>Evaluate app productivity versus power consumption</li>
<li>Instrument code to report specific metrics about operations performed, timings, and collateral conditions</li>
<li>Generate large performance data sets using a variety of execution regimes</li>
<li>Evaluate the power consumption impacts of alternative libraries, drivers, and frameworks</li>
<li>Validate optimizations and remediation</li>
<li>Instrument apps in ways that allow customers and third-party testers to certify apps as energy efficient</li>
</ul>
<i><b>What Intel® Energy Checker Offers Client App Developers</b></i><br /><br />The Intel® Energy Checker SDK is a full-featured testing and validation facility. Its fundamental layer comprises a counter application programming interface (API) that allows direct measurement of app productivity. The ability to export and import counters provides a mechanism for analyzing how efficiently apps work with one another and the system overall.<br /><br />Intel® Energy Checker’s companion build and scripting tools allow a means of analyzing code for which source is not available or can’t practically be built with inline instrumentation. Command-line utilities allow Intel® Energy Checker tools and data streams to interoperate with native Windows* and Linux* counters and utilities, making Oracle* Solaris 10–, Mac OS X*–, and Linux* MeeGo-based apps susceptible to evaluation by Intel® Energy Checker testing and validation.<br /><br />One of the biggest advantages the Intel® Energy Checker toolset offers is its support for a broad variety of application development regimes. To help developers get up to speed with their projects, the SDK shipped with sample applets demonstrating how to employ it in the following situations:<br /><br />
<ul>
<li>With threading</li>
<li>Called from Java*</li>
<li>Called from C#</li>
<li>Called from Objective-C</li>
<li>With Linux system information utilities</li>
<li>CPU use histogram generator tools</li>
<li>Cluster energy efficiency</li>
<li>PL sampling measurements</li>
</ul>
The suite supports a majority of the common application programming languages in use today, including C, C++, C#, Objective-C, Java*, PHP, and Perl.<br /><br /><i><b>Using Microsoft Joulemeter to Analyze Energy Efficiency Performance</b></i><br /><br />Joulemeter from Microsoft* Research is focused on creating modeling and optimization tools to assist system architects, administrators, and developers in improving the energy efficiency of computing infrastructures. The Joulemeter Research Program concentrates on modeling and optimizing power use by computational infrastructure of all types and scales. This information is critical, because to achieve real energy savings, systems have to be optimized from end to end. Even lightweight mobile clients have to be aware of the impacts of their behavior on back-end servers, such as whether they will affect the operation’s overall power performance.<br /><br />The Joulemeter Research Project has published the lightweight stand-alone Joulemeter application* for Windows* 7 laptops and desktops. The app estimates the power consumption of a single computer by tracking resource usage (CPU saturation, screen backlighting, antenna power use, and the like); from these measurements, Joulemeter forecasts system power consumption.<br /><br /><i><b>Intel® Battery Life Analyzer</b></i><br /><br />The Intel® Battery Life Analyzer (BLA) is a lightweight tool that monitors battery life on computers running the Windows* operating system. Empirically evaluating energy-related application performance on battery-powered systems can sometimes yield impressive gains with relatively minor changes in application code. BLA helps developers identify opportunities to make the “application idle” state converge on platform idle states. In particular, BLA gets around a problem from which most power management and accounting application programming interfaces (APIs) suffer. 
Inherently, accounting APIs have to work with sampled data, recorded at timer tick intervals (on the order of 15.6 msec). Therefore, if a software operation starts on a timer tick but ends before the next tick, it can’t be detected by metrics that use full tick granularity.<br /><br />Although this sounds like a negligible shortcoming, it isn’t. Many isochronous operations (think media handling) fall into this category, and such operations can easily become huge fractions of a platform’s overall workload. In contrast, BLA uses fine-grained process information based on microsecond scale time stamps. BLA records both a given activity’s starts and stops. This precision provides not only a more accurate picture of power utilization; it is also a far more complete one. (For a rigorous treatment of this topic, you can find a link to the Intel white paper, “Energy Efficient Platforms—Considerations for Application Software and Services,” in the Helpful Links section.)<br /><br />
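The granularity problem can be made concrete with a small Python sketch. It is a simplified model written for this discussion, not BLA's implementation: the workload generator <code>short_bursts</code>, the 5 ms burst length, and the rule of charging one full tick per busy sample are all assumptions. Only the tick length of about 15.6 msec comes from the text above.<br /><br />

```python
TICK_US = 15_600  # timer tick of roughly 15.6 ms, as cited in the text


def sampled_busy_us(activities, window_us, tick_us=TICK_US):
    """Charge one full tick whenever an activity is running at a tick instant."""
    busy = 0
    for tick_time in range(0, window_us, tick_us):
        if any(start <= tick_time < end for start, end in activities):
            busy += tick_us
    return busy


def timestamped_busy_us(activities):
    """Sum exact durations from recorded start and stop time stamps."""
    return sum(end - start for start, end in activities)


def short_bursts(count, burst_us=5_000, offset_us=1_000, tick_us=TICK_US):
    """Hypothetical workload: one short burst after each tick, done before the next."""
    return [(k * tick_us + offset_us, k * tick_us + offset_us + burst_us)
            for k in range(count)]


if __name__ == "__main__":
    bursts = short_bursts(64)
    window_us = 64 * TICK_US
    print("tick-sampled busy time (us):", sampled_busy_us(bursts, window_us))
    print("time-stamped busy time (us):", timestamped_busy_us(bursts))
```

In this model every burst falls between two samples, so tick-based accounting attributes no busy time to the workload, while the time stamps show it occupying roughly a third of the window.<br /><br />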
<h2 class="sectionHeading">Mobile Device Battery Life Conservation</h2>
More and more, batteries are a key source of power for computing platforms. In early 2011, smart phones outsold PCs 4 to 1 worldwide. Given this, expect to see the energy efficiency of mobile apps become a key concern for all types of software consumers. Fortunately, mobile developers are generally pretty savvy about energy efficiency, as battery-operated devices have always demanded that discipline of them.<br /><br />All mobile development frameworks include methods for detecting power states (connected to AC wall current or running on DC battery power), testing battery levels, and scaling system and application behaviors in response to energy regimes. Apple*, Symbian*, Microsoft*, RIM*, and other mobile device vendors have worked over the years to establish general guidelines that help app developers be good power-management citizens on small devices. Many of these rules translate easily to laptop and desktop apps that are being reworked to improve power performance:<br /><br />
<ul>
<li>Replace timer-based designs with event-driven or interrupt-driven logic.</li>
<li>Avoid using timers as a high-resolution time source. If there is no workable alternative, ensure that timer resolution is reset to the system default when it is not actively engaged in its specific task.</li>
<li>Apps designed to provide passive display of content should explicitly increase display dimming timeout to accommodate playback using power request or availability APIs. The requests should be explicitly rescinded when the app is minimized or inactive.</li>
<li>Screen savers and the like should not alter dimming timeouts. Unless there is an aesthetic reason for them, screen savers do nothing to maintain the health of LCD monitors and are simply wasting energy. Let screens dim, if practical.</li>
</ul>
Ineffective management of sleep states can dramatically multiply an app’s power consumption. Using parallelization effectively, coalescing tasks that are difficult to parallelize into a single thread, and avoiding excessive synchronization among threads are all strategies that can help reduce the number of sleep state transitions an app triggers (see Figure 3).<br /><br />
<p ><img src="http://software.intel.com/file/43253" /></p>
<br /><b>Figure 3.</b> Effective management of sleep states is key to good app energy performance.<br /><br />
<h2 class="sectionHeading">Conclusion</h2>
Managing the energy performance of application software may reasonably be expected to become a core competency for developers in the fairly near term, as economic and environmental considerations shape thinking on software engineering best practices. Many good tools exist for this purpose, and the Intel® Energy Checker SDK can help to validate and refine energy-optimization efforts of client software developers targeting both the desktop and mobile platforms.<br /><br />
<h2 class="sectionHeading">Helpful Links and Additional Information on Power Management Tools and Resources</h2>
<ul>
<li><a target="_blank" href="http://www.climatesaverscomputing.org/resources/information/software-development">Software development information from Climate Savers Computing</a>*</li>
<li><a href="http://software.intel.com/en-us/articles/intel-energy-checker-sdk/#FAQ">Intel® Energy Checker SDK and user guide</a></li>
<li><a target="_blank" href="http://www.thegreengrid.org/about-the-green-grid.aspx">Learn more about The Green Grid</a>*</li>
<li><a target="_blank" href="http://msdn.microsoft.com/en-us/library/windows/desktop/aa373163(v=vs.85).aspx">Microsoft Power Management Functions* reference</a></li>
<li><a target="_blank" href="http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html-single/Power_Management_Guide/index.html">Red Hat Linux 6 Power Management Guide</a>*</li>
<li><a target="_blank" href="http://www.elinux.org/Power_Management">Power Management for Linux</a>*</li>
<li>Fine-Grained Energy Profiling for Power-Aware Application Design: <a target="_blank" href="http://research.microsoft.com/apps/pubs/default.aspx?id=73662">http://research.microsoft.com/apps/pubs/default.aspx?id=73662</a>*</li>
<li>Intel white paper: “Energy Efficient Platforms—Considerations for Application Software and Services” (<a href="http://www.intel.com/content/www/us/en/green-it/energy-efficiency/energy-efficient-platforms-2011-white-paper.html?wapkw=considerations+for+application+software+and+services">http://www.intel.com/content/www/us/en/green-it/energy-efficiency/energy-efficient-platforms-2011-white-paper.html?wapkw=considerations+for+application+software+and+services</a>)</li>
<li>BLA requests, questions, and feedback: <a href="mailto:BatteryLifeAnalyzer@intel.com">BatteryLifeAnalyzer@intel.com</a></li>
</ul>
<h2 class="sectionHeading">About the Author</h2>
Nancy Nicolaisen is an author, researcher, and veteran software developer specializing in mobile and embedded device technologies. Her feature articles, columns, and analyses have been internationally circulated in publications such as <i>BYTE, PC Magazine, Windows Sources, Computer Shopper, Dr. Dobbs Journal of Software Engineering, and Microsoft Systems Journal</i>. She is the author of three books—<i>Making Windows Portable: Porting Win32 to Win CE</i> (2002, John Wiley &amp; Sons); <i>The Practical Guide to Debugging 32 Bit Windows Applications</i> (1996, McGraw Hill); and <i>The Visual Guide to Visual C++</i> (1994, Ventana Press)—available in five foreign-language editions. In 2007, she served as technical advisor for the development of the Microsoft Professional Education course “Designing, Building and Managing Wireless Networks.” Ms. Nicolaisen is currently active in exploring open source technologies and trends for mobile, embedded, and wireless devices.<br /><br /> ]]></description>
      <link>http://software.intel.com/en-us/articles/designing-application-software-for-energy-efficient-performance/</link>
      <pubDate>Mon, 09 Apr 2012 00:00:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/designing-application-software-for-energy-efficient-performance/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/designing-application-software-for-energy-efficient-performance/</guid>
      <category>Parallel Programming</category>
      <category>Tools</category>
      <category>Intel® AppUp(SM) Developer Community</category>
      <category>Intel SW Partner program</category>
      <category>Intel Software Network communities</category>
      <category>Power Efficiency</category>
      <category>Ultrabook</category>
      <category>Server Developer Community</category>
    </item>
    <item>
      <title>Using Intel® Power Checker to measure the energy performance of a compute-intensive application</title>
      <description><![CDATA[ <p>Intel® Power Checker provides developers with a quick and easy way to evaluate the idle power efficiency of their applications on mobile platforms with Intel® Core™ processor or Intel® Atom™ technology running the Microsoft Windows* operating system. Intel Power Checker can analyze any compiled-language application, especially one designed to run on technology based on Intel® products, as well as Java Framework applications. The checker can be used with or without a supported external power meter.</p>
<p>The Intel Power Checker 2.0 now supports measurement both on battery and with the system plugged into an external AC power source. Measurement on external power is supported only on Intel® Second Generation Core processors, and only if the Intel® Power Gadget software has been installed.</p>
<p>For this article, I took a very compute-intensive parallel application that I wrote to solve instances of the logic puzzle Akari. The code uses a backtracking algorithm to explore how to place light bulbs onto a grid under constraints dictated by the rules of the puzzle and the layout of the puzzle instance. Potentially millions of independent tasks can be generated by the code as the solution space is searched by threads executing those tasks. This solution method is eminently scalable to a large number of threads and is able to keep many cores running at peak speed for a sustained amount of time.</p>
<h2>How to Use Intel Power Checker</h2>
<p>The Intel Power Checker provides a GUI wizard that leads you through the four steps of power analysis. These four steps in the checker are described below. Before starting the assessment, be sure to know which section of your application (a workload) you want to be measured, as the Power Checker will only measure a 30 second execution interval. (If you want to measure the entire execution workload, you should try some other tool, like Intel Power Gadget.) Your workload could be a compute-intensive portion or an I/O-intense section or just some point in execution that typifies the majority of expected usage.</p>
<h3 >Step 1: Specifying the Power Meter device</h3>
<p>If you have an external power meter attached to your test system, you can select the model being used on the first screen of the wizard. The default is that no external device is being used. For this default case, Intel Power Checker will determine if the system is capable of providing power consumption data and if the correct power driver, EzPwr.sys, is installed. (The driver is part of the default installation of <a href="http://software.intel.com/en-us/articles/intel-power-gadget/">Intel Power Gadget</a>.)</p>
<h3 >Step 2: Measure System Baseline</h3>
<p>The first measurement that the Intel Power Checker will perform is on the next screen within the wizard. This is to measure the baseline power consumption of the hardware without your application running. Prior to this measurement phase any unnecessary processes such as operating system updates, Windows Indexing Service, virus scans, media players, and internet browsers should have been shut down. In other words, to get the most accurate results you should make your test system as idle as possible and ensure that nothing will become a foreground process during your measurement runs.</p>
<p>Once you have a quiescent system, click the “Start” button to begin this phase of the testing. The Intel Power Checker waits 15 seconds to allow the system to come to an idle state before starting the measurements. You need to be sure to position your mouse and the keyboard out of reach, or keep your hands away from them, to avoid any stray contact that might trigger some response from the platform. After the pause, the checker will observe the system for 30 seconds in this idle state. A progress bar will show the time remaining in each part of this phase. Once the baseline data collection is complete, click the “Next” button to proceed to the next phase.</p>
<h3 >Step 3: Measure Active Application</h3>
<p>Before you are taken to the next screen in the wizard, you are instructed to start the application you are interested in measuring. Start up your application and click the “OK” button to advance the GUI to the next screen. Once you have reached the Step 3 screen, use the scroll bar to locate your application in the process list and click on that line to select it. If your application is not listed, click the “Refresh List” button so that your application’s process will be available to select. In addition, you can use the “Apply Filter” button to narrow down the list in order to find your application’s process quickly. After selecting your application from the list, click “Next” to move on to the data collection for this phase. Before starting the assessment, be sure your application has reached the desired point of measurement. If there are some initial setup computations that are not of interest, you will need to get past this point before letting Intel Power Checker begin measurement. For my Akari application, there is very little setup time. It was typically in the thick of computation by the time I had gotten to the point of selecting the process from the list.</p>
<p>As soon as I could, I clicked the “Start” button to begin capturing measurement data. Since this is one of the crucial power measurements for your application, always begin capturing data <b>after</b> the workload or critical section has begun and make sure this active execution will run longer than the 30 seconds needed to complete the measurement time.</p>
<h3 >Step 4: Measure Idle Application</h3>
<p>The final phase is to measure your application’s idle power consumption. This is another important phase of energy efficiency measurement of an application since your application must not only do efficient computation, but also not waste energy when sitting idle.</p>
<p>This step doesn’t make much sense within my compute-intensive application since there is no idle state of the application. Once you start the application on a given puzzle instance, it simply computes all legal solutions in parallel and then ends. As solutions are found, each is printed by the thread that found it. If there are no solutions, a message is printed just before the application terminates. This latter case describes the workload I used for my tests. Because you must have your application running in “idle” mode for this step, I left the application running at full speed and simply allowed Power Checker to take its measurements.</p>
<p>If your application does have an idle state, perhaps waiting for interaction from the user, the checker will give the system 15 seconds to calm down fully before taking a final 30 second measurement.</p>
<p>Upon completion of this last data collection phase, you will be able to proceed to the results screen within the Intel Power Checker wizard. After all three measurement phases have been completed, a Tool Report File will be generated containing all of the results for later analysis.</p>
<h3 >What data is presented</h3>
<p>The View Results screen of the Intel Power Checker wizard provides basic information about the software assessment. The type of processor in your system and the type and model of the power source that was used are given. Four numerical values for each of the three measurement phases are presented. These values are:</p>
<ul >
<li><b>Elapsed Time:</b> The exact number of seconds that each of the phases lasted.</li>
<li><b>Energy Consumption:</b> The rate that the battery was discharged during each of the three phases.</li>
<li><b>Average C3 State Residency:</b> The percentage of time that the system was in the C3 state during the data collection period.</li>
<li><b>Platform Timer Period:</b> The number of milliseconds that the platform timer collected</li>
</ul>
<p><img src="http://software.intel.com/file/42410" /></p>
<p>Typical results would hopefully show a larger percentage of time spent in the C3 State Residency for the application idle time measurement (the middle of the three columns on the View Results screen). As my puzzle solving application was still computing as much as it did in the active execution measurement step, this was not the case for my results. This is atypical for the intended type of applications Intel Power Checker assumes will be measured. Thus, the C3 State Residency values provided by the tool for the idle application were not valid for my particular application.</p>
<p>The name of the report file and the directory in which it can be found are listed on the View Results screen.</p>
<h2>Some Caveats</h2>
<p>Below are some things you should consider before and during a measurement run using Intel Power Checker.</p>
<ul >
<li>Before you start using Intel Power Checker, be sure your chosen workload will run for at least 30 seconds from the point at which you wish to measure power consumption. In my case, I required a data set that would force the application to run for at least 75 seconds (30 for active measurement, 15 for idle setup, and 30 for idle measurement) plus the time I needed to click boxes and find my application in the process list. Since I ran the application with several different numbers of threads, I needed to be sure that the fastest execution time was still long enough to get all the timing steps completed during an Intel Power Checker run.</li>
<li>Upon starting Intel Power Checker, the checker may first report that the platform timer period is invalid. In this case, some currently running (background) process has changed the default and it will be up to the user to determine which currently running application has changed the value. Once you have identified the culprit you must stop this process or service before restarting Intel Power Checker. If you are unsure about which active process is preventing Intel Power Checker from starting, you will need to turn off processes one at a time and try Intel Power Checker until the error message doesn’t come up. </li>
<li>Instructions on the Step 3 screen ask you not to touch the keyboard or mouse. If you are measuring an interactive application or you must interact with the application to generate activity for the full 30 seconds, you will need to touch the keyboard and/or mouse. If possible, a workload that can forego interactivity and still compute for the 30 seconds of measurement time would be best. However, if interaction by the user is part of how the application is utilized, interfacing through peripherals will give you a more accurate measure of the overall energy consumption for typical application usage.</li>
<li>A data file is created during each phase of the Intel Power Checker assessment to hold the current information. If you cancel the assessment in any of the three phases then a data file will not be created for that phase. After all three phases have been completed, a Tool Report File, in XML format, will be generated containing all of the results. You can find the name of the report file and where it is located on the View Results screen.</li>
<li>The “Submit Results” button on the View Results screen is optional and only intended for members of the <a href="http://software.intel.com/partner/overview">Intel® Software Partner Program</a> to submit their measurement results to the program. If you are not a member, do not submit your results. Simply click on the “Close” button after you have examined the results compiled by Intel Power Checker.</li>
</ul>
<h2>Some Results</h2>
<p>The purpose of this article is not to determine the best scenario for running my Akari solver application in the most energy efficient way. You will want to do this for your application, though, and this article has given you the background on Intel Power Checker to determine if this checker can help you quantify the current power consumption of your application. Also, as you make modifications to the application you will be able to determine if those changes improve the energy efficiency or cause your application to suck more power than before.</p>
<p>In addition to the average C3 State Residency percentage, the checker delivers the total number of Joules expended during the 30 seconds of execution time measured. From this I can compute the average Watts for execution parts of the application. I have found that a better metric for comparing different applications or different runs of the same application is milliwatt hours (mWh). You need the total execution time of the execution portion of the application to compute this value. Since Intel Power Checker only measures activity in 30 second segments, you will need to have some timing data available, which I happened to have for the different runs I made of my Akari application.</p>
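<p>The conversion described above is simple arithmetic, sketched here in Python. The function names and the sample figures (450 Joules over the 30 second window, a 120 second total run) are hypothetical and are not measurements from my Akari runs. The sketch also assumes the average power seen in the 30 second window is representative of the whole execution.</p>

```python
def average_watts(joules, seconds):
    """Average power is energy divided by the time over which it was spent."""
    return joules / seconds


def milliwatt_hours(avg_watts, total_seconds):
    """Energy for a whole run: watts times hours, scaled to milliwatt hours."""
    return avg_watts * (total_seconds / 3600.0) * 1000.0


if __name__ == "__main__":
    measured_joules = 450.0   # hypothetical reading for the 30 second window
    window_seconds = 30.0     # fixed measurement interval of the checker
    run_seconds = 120.0       # hypothetical total execution time from my own timing

    watts = average_watts(measured_joules, window_seconds)
    print("average power (W):", watts)
    print("energy for the run (mWh):", milliwatt_hours(watts, run_seconds))
```

<p>Because mWh folds execution time into the figure, a run that draws more power but finishes sooner can still come out ahead, which is why I find it the more useful metric for comparing runs.</p>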
<p>I found significant differences when running with and without Hyper-Threading Technology (HT) turned on. Execution time and power usage also differed depending on whether the platform was running on battery (DC) power or wall socket (AC) power. For example, when running with HT on and a full complement of four threads on the four logical cores in my system, I saw the run on AC power finish 1.19X faster than the same workload on DC power. However, the AC run took 1.15X more power.</p>
<p>Comparing runs on DC power against runs on AC power is not a fair comparison, especially in this case. The power source is detected by the system and the processor is allowed to run with Intel® Turbo Boost Technology at a higher frequency if the platform is using external power. Even so, you may need to be concerned about power consumption of your application in both power source circumstances and you will need to run measurement experiments within each setup to gauge how well your application modifications affect overall power consumption.</p>
<h3 >System Requirements</h3>
<p>You can use Intel Power Checker on a laptop or netbook based on Intel® Core™ processor or Intel® Atom™ processor technology. A desktop with an external power meter or a desktop that is capable of providing the power consumption information can also be analyzed. A Java* Runtime Environment (JRE) (version 6 update 11 or higher) is also required to run the checker. Supported operating systems are Microsoft Windows* XP (Service Pack 3), Microsoft Windows Vista* (Service Pack 2), Microsoft Windows* 7 (Service Pack 1 [32-bit and 64-bit]), and Microsoft Windows* Server 2008 R2.</p>
<h3 >Download link</h3>
<p>To download the Intel Power Checker installation package, go to the following link:</p>
<p><a href="http://software.intel.com/partner/app/software-assessment">http://software.intel.com/partner/app/software-assessment/</a>. Click on the Intel Power Checker tab to move down to the download link.</p>
<h3 >Other supporting links</h3>
<p>There is a video demonstration of using Intel Power Checker, “A Look at Intel Power Checker,” at the link: <a href="http://software.intel.com/en-us/videos/channel/intel-software-partner-program/a-look-at-the-intel-power-checker/1127786023001">http://software.intel.com/en-us/videos/channel/intel-software-partner-program/a-look-at-the-intel-power-checker/1127786023001</a>. Dave Valdovinos and Taylor Kidd, both from Intel, show off the GUI wizard as it measures the power performance of a game-like application.</p> ]]></description>
      <link>http://software.intel.com/en-us/articles/using-intel-power-checker-to-measure-the-energy-performance-of-a-compute-intensive-application/</link>
      <pubDate>Mon, 12 Mar 2012 00:00:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/using-intel-power-checker-to-measure-the-energy-performance-of-a-compute-intensive-application/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/using-intel-power-checker-to-measure-the-energy-performance-of-a-compute-intensive-application/</guid>
      <category>Mobility</category>
      <category>Parallel Programming</category>
      <category>Intel® AppUp(SM) Developer Community</category>
      <category>Intel Software Network communities</category>
      <category>Intel SW Partner program</category>
      <category>Intel Software Network communities</category>
      <category>Game Development</category>
      <category>Power Efficiency</category>
      <category>Intel® vPro™ Developer Community</category>
      <category>Resources For Software Developers</category>
      <category>Ultrabook</category>
      <category>Server Developer Community</category>
    </item>
    <item>
      <title>Ultrabook™ and the Intel® Energy Checker SDK</title>
      <description><![CDATA[ <h2 class="sectionHeading">Abstract</h2>
With the advent of the Ultrabook™<sup>1</sup>, the demand for applications that are power misers continues to rise. The Intel® Energy Checker SDK can be used to instrument an application and collect data to help a developer pinpoint power hungry features that can be optimized for power. This article gives an overview of the Intel Energy Checker SDK and discusses how it can be used to advantage when improving energy usage on an Ultrabook.<br /><br />
<h2 class="sectionHeading">More Work, Less Power</h2>
An Ultrabook™ needs to budget its power consumption very carefully to extend its usefulness while running on battery, so applications that use less energy are preferred. Application developers often create their programs on a desktop system, where power and energy consumption matter less than raw performance. Applications should be developed not only to conserve power when active but also to minimize energy usage during idle periods; idle behavior is often overlooked, and improving it can greatly extend battery life. If power issues are ignored, running a program on an Ultrabook will result in unpleasant surprises for the user. Developers who test their application on an Ultrabook system during development will gain insight into how well the program runs in a power-limited environment. An analysis tool such as the <a href="http://software.intel.com/en-us/articles/intel-energy-checker-sdk/">Intel® Energy Checker SDK</a> can be a powerful companion during the optimization phase for software designed for an Ultrabook.<br /><br />
<h2 class="sectionHeading">Energy Efficiency</h2>
Before explaining what Intel Energy Checker SDK contains, a discussion on Energy Efficiency (EE) is in order. This is a term that is used extensively in the Intel Energy Checker SDK. There is no universally accepted definition of EE, so for the purposes of this tool it is defined as:<br />
<p ><em>EE=Work/Energy</em></p>
<em>Work</em> is defined as the amount of “<em>useful work</em>” done by a software application. There is no concise, easy definition of the term <em>useful work</em> either, as what is considered <em>useful work</em> in one program may be quite different in another application. The developer is required to make that determination. For example, one might consider the areas of a movie player program where it provides the customer value (such as decoding the movie) as useful work whereas areas of the program that are accessing resources, waiting on input, or performing synchronization would not.<br /><br />
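As a small worked example of the definition above, the following Python sketch treats decoded frames as the useful work of a movie player. The function name, the unit of work, and the numbers are all hypothetical; each application has to choose its own measure of useful work.<br /><br />

```python
def energy_efficiency(useful_work_units, energy_joules):
    """EE = Work / Energy, in work units per Joule."""
    if energy_joules == 0:
        raise ValueError("energy must be non-zero to compute EE")
    return useful_work_units / energy_joules


# Hypothetical movie player run: 7200 frames decoded while consuming 900 J.
frames_decoded = 7200
energy_used = 900.0
ee = energy_efficiency(frames_decoded, energy_used)
print(f"EE = {ee:.1f} frames per Joule")
```

Comparing this figure before and after a code change shows whether the change delivers more useful work for the same energy.<br /><br />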
<h2 class="sectionHeading">Code Instrumentation</h2>
The first step in using the Intel Energy Checker SDK to help determine an application’s EE is to create and use “counters” in the software to measure quantities of “useful work”. A counter is a 64-bit (8-byte) variable that keeps a running total of how many times a particular event occurs; in the C language, this is an unsigned long long. A developer can create one or more counters during the initialization portion of the software. Next, a container for the counters, called a “Productivity Link” (PL)<sup>2</sup>, can be created. Each PL holds up to 512 counters, and up to 10 different PLs can be open at one time, although most software will need far fewer counters and PLs.<br /><br />At runtime, the application can write values to any counter in the PL, based on the developer’s requirements. The Intel Energy Checker SDK can then collect the information from the PLs to determine how much work was done.<br /><br />
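To make the counter and PL concepts concrete, here is a minimal Python model of them. It is not the Intel Energy Checker API: the class, method, and counter names are hypothetical, and only the limits (512 counters per PL, 10 open PLs, 64-bit counter values) come from the description above.<br /><br />

```python
MAX_COUNTERS_PER_PL = 512  # limit stated in the text
MAX_OPEN_PLS = 10          # limit stated in the text
UINT64_MODULUS = 2 ** 64   # counters are 64-bit unsigned values


class ProductivityLink:
    """Toy model of a PL: a named collection of 64-bit counters."""

    open_links = 0  # number of PLs currently open in this process

    def __init__(self, counter_names):
        if len(counter_names) > MAX_COUNTERS_PER_PL:
            raise ValueError("a PL holds at most 512 counters")
        if ProductivityLink.open_links >= MAX_OPEN_PLS:
            raise RuntimeError("at most 10 PLs can be open at one time")
        ProductivityLink.open_links += 1
        self.counters = {name: 0 for name in counter_names}
        self.closed = False

    def write(self, name, value):
        """Store a new running total, wrapped to 64 bits."""
        if name not in self.counters:
            raise KeyError(name)
        self.counters[name] = value % UINT64_MODULUS

    def close(self):
        if not self.closed:
            self.closed = True
            ProductivityLink.open_links -= 1


# Initialization: create the PL with one counter for useful work.
pl = ProductivityLink(["frames_decoded"])

# Runtime: update the counter each time a unit of useful work completes.
frames = 0
for _ in range(3):
    frames += 1
    pl.write("frames_decoded", frames)

final_count = pl.counters["frames_decoded"]
pl.close()
```

In the real SDK a monitoring tool reads the counters from outside the process; this model only shows the create, write, and close life cycle.<br /><br />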
<h2 class="sectionHeading">Energy Consumed</h2>
The second part of finding the EE of a software application is to measure how much energy was consumed while the program was running. To do this, Intel Energy Checker SDK uses two tools which are included in the SDK download: Energy Server (ESRV) and Temperature Server (TSRV). ESRV is used to monitor energy and power consumption as reported by external power tools while TSRV monitors temperature related information as reported by environmental probes. ESRV and TSRV counters can be accessed by any program using the Intel Energy Checker API. In addition to the counters created by the developer to determine quantities of work, the developer will want to add counters to collect information from ESRV and possibly TSRV. There are three different ways to set up ESRV:<br /><br /><ol>
<li>Use a power meter to collect actual “platform energy and power” information.<br /><br />There are several different power meters that work with the Intel Energy Checker SDK. Please consult the <em>Intel® Energy Checker SDK User Guide</em> included in the download or found on the <a href="http://software.intel.com/en-us/articles/intel-energy-checker-sdk/">Intel® Energy Checker SDK page</a> to determine which power meters will work and how they should be attached to the test system.<br /></li>
<li>Use <a href="http://software.intel.com/en-us/articles/intel-power-gadget/">Intel® Power Gadget</a> to collect “processor energy and power” usage information on the 2nd Generation Intel® Core™ processor family. An external power meter that reports platform power can also be used together with Intel Power Gadget, which provides processor power. The blog <em>Accessing Intel® Power Gadget From Intel® Energy Checker SDK</em> by Intel engineer Jun De Vega discusses how to enable Intel® Power Gadget with Intel® Energy Checker.<br /></li>
<li>Use the simulation method, which relies on the CPU utilization percentage returned from the OS. This method does not require a hardware probe. The Intel Energy Checker SDK offers it as an option for all processors (rather than just the 2nd Generation Intel Core processor family, as with Intel Power Gadget) to accommodate users who do not have a power meter. Included in the SDK is a support library for accessing this metric.</li>
</ol>
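To illustrate the idea behind the simulation method, the sketch below estimates energy from CPU utilization samples using a simple linear model between an idle and a maximum power level. This is an assumption made for illustration only; the model actually used by ESRV is not described here, and the function names and wattages are hypothetical.<br /><br />

```python
def estimate_power_watts(cpu_utilization_percent, idle_watts, max_watts):
    """Linear interpolation between idle and full-load power (assumed model)."""
    clamped = min(max(cpu_utilization_percent, 0.0), 100.0)
    return idle_watts + (clamped / 100.0) * (max_watts - idle_watts)


def estimate_energy_joules(utilization_samples, sample_period_seconds,
                           idle_watts, max_watts):
    """Sum power * time over evenly spaced utilization samples."""
    return sum(
        estimate_power_watts(sample, idle_watts, max_watts) * sample_period_seconds
        for sample in utilization_samples
    )


# Hypothetical platform: 10 W idle, 30 W at full load, one sample per second.
samples = [0.0, 50.0, 100.0]
joules = estimate_energy_joules(samples, 1.0, idle_watts=10.0, max_watts=30.0)
print(f"estimated energy: {joules:.1f} J")
```

An estimate like this is useful for comparing runs on the same machine, but it is no substitute for a power meter when absolute numbers matter.<br /><br />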
<p ><img src="http://software.intel.com/file/41168" /><br /><br /><strong>Figure 1:</strong> Conceptualized drawing of Intel Energy Checker setup with Instrumented Application, Power Meter and Environmental probes attached</p>
<h2 class="sectionHeading">Intel Energy Checker Extras</h2>
There are two companion tools that are bundled with the Intel Energy Checker SDK in addition to those already mentioned. The PL GUI Monitor is a user interface that displays Productivity Link (PL) counters in a running program that has already been instrumented with the Intel Energy Checker API. The PL CSV Logger<sup>3</sup> is an application that can collect and write PL counters to a CSV file for later analysis in a variety of spreadsheet applications.<br /><br />Included with the Intel Energy Checker SDK is the <em>Intel® Energy Checker SDK Companion Application User Guide</em> that discusses the features and capabilities of both of these tools.<br /><br />
<p ><img src="http://software.intel.com/file/41169" /><br /><br /><strong>Figure 2:</strong> PL GUI Monitor running while a picture is being rendered</p>
The entire Intel Energy Checker SDK includes other build, scripting, interoperability, and monitoring tools to help developers instrument code and collect energy metrics.<br /><br />A white paper entitled “<em>How Green Is Your Software?</em>” is available for download from the SDK site. This paper discusses approaches for making software power efficient. Look for it in the “Code, Resources and Documentation” section of the <a href="http://software.intel.com/en-us/articles/intel-energy-checker-sdk/">Intel Energy Checker SDK page</a>. Several blogs about Intel Energy Checker that were written by Intel Engineer Jamel Tayeb will also be helpful:<br /><br /><a href="http://software.intel.com/en-us/blogs/2010/04/15/using-the-intel-energy-checker-sdk-at-home/?wapkw=(Energy+Checker)">Using the Intel® Energy Checker SDK at Home</a><br /><br /><a href="http://software.intel.com/en-us/blogs/2010/02/19/creating-a-simple-device-library-for-intel-energy-checker-sdk/?wapkw=(Energy+Checker)">Creating a Simple Device Library for Intel® Energy Checker SDK</a><br /><br /><a href="http://software.intel.com/en-us/blogs/2010/03/30/measuring-the-energy-consumed-by-a-command-using-the-intel-energy-checker-sdk/?wapkw=(Energy+Checker)">Measuring the energy consumed by a command using the Intel® Energy Checker SDK</a><br /><br />All of these resources allow a developer to get started in gathering helpful information.<br /><br />
<h2 class="sectionHeading">Optimizing Applications for Ultrabooks</h2>
Once a program has been instrumented to collect counter information and an energy collection plan is in place (either simulation or power meter), the setup is complete. The developer will then be able to gather information about the application’s energy usage profile and to incorporate optimizations to improve results.<br /><br />There are several areas of optimization the Ultrabook developer can select for improvements:<br /><br />
<div >Consider modifying the application to be aware of the power status and changing usage to reduce energy consumption when the system is on battery.<br /><br />Check the hardware and software system power management possibilities to choose a balanced power setting. This could be a recommended setting suggested in application documentation.<br /><br />Reduce power usage while the application is actively running or doing work. Compute intensive parts of the program will likely benefit from multi-threading and vectorization techniques.<br /><br />Reduce power usage while the application is idle. Being able to minimize the timer tick rate or setting up periodic actions to happen within the same wakeup period are examples of how to reduce idle application power usage.</div>
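<br /><br />The last suggestion, grouping periodic actions into the same wakeup period, can be sketched as follows. The Python example rounds each task's due time up to the next multiple of a shared tick and counts the distinct wakeups that result. The function name, task periods, and tick length are hypothetical, and the sketch assumes the tasks can tolerate being delayed by up to one tick.<br /><br />

```python
def wakeup_times(task_periods_ms, horizon_ms, tick_ms):
    """Distinct wakeup times when every due time is rounded up to a multiple
    of tick_ms. A tick of 1 ms means no coalescing."""
    wakeups = set()
    for period in task_periods_ms:
        for due in range(period, horizon_ms + 1, period):
            aligned = ((due + tick_ms - 1) // tick_ms) * tick_ms  # round up
            wakeups.add(aligned)
    return sorted(wakeups)


# Three hypothetical periodic tasks observed over one second.
periods = [100, 150, 250]
uncoalesced = wakeup_times(periods, horizon_ms=1000, tick_ms=1)
coalesced = wakeup_times(periods, horizon_ms=1000, tick_ms=500)
print(len(uncoalesced), "wakeups without coalescing")
print(len(coalesced), "wakeups with a 500 ms shared tick")
```

Fewer distinct wakeups give the processor longer uninterrupted idle periods, which is what allows it to stay in deeper sleep states.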
<br /><br />
<h2 class="sectionHeading">Summary</h2>
With the growth of Ultrabook devices, it will benefit program designers and developers to take a look at ways to save energy while providing a great user experience on an Ultrabook. Intel Energy Checker SDK can provide the means to identify the key areas of focus and confirm the positive results achieved after optimization. Long live Ultrabook!<br /><br />
<h2 class="sectionHeading">About the Author</h2>
<img src="http://software.intel.com/file/41170"  /> Judy Hartley is a Software Applications Engineer who has been working in the Software and Services Group since 2005. She has contributed to many software products and written about her experiences through blogs and whitepapers. Recently Judy has been working on Graphics and Power tools and training for future Intel processors.<br /><br  />
<hr />
<br /><sup>1</sup> Ultrabook is a trademark of Intel Corporation in the U.S. and/or other countries.<br /><br /><sup>2</sup> A Productivity Link is a term used by Intel Energy Checker to represent an arbitrary or logical collection of counters.<br /><br /><sup>3</sup> CSV is the acronym for Comma Separated Values.<br /><br /> ]]></description>
      <link>http://software.intel.com/en-us/articles/ultrabook-and-the-intel-energy-checker-sdk/</link>
      <pubDate>Tue, 24 Jan 2012 00:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/ultrabook-and-the-intel-energy-checker-sdk/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/ultrabook-and-the-intel-energy-checker-sdk/</guid>
      <category>Mobility</category>
      <category>What If Experimental Software</category>
      <category>Tools</category>
      <category>Intel Software Network communities</category>
      <category>Intel SW Partner program</category>
      <category>Code &amp; Downloads</category>
      <category>Power Efficiency</category>
      <category>Resources For Software Developers</category>
      <category>Ultrabook</category>
    </item>
    <item>
      <title>Compiler settings for memory error analysis in Intel® Inspector XE*</title>
      <description><![CDATA[ <p><b>Introduction:</b><br />Memory error analysis in Intel® Inspector XE and/or Intel® Parallel Inspector can analyze most native binaries. However, some settings make analysis easier.</p>
<p>For the purposes of this article, "Intel Inspector XE" refers to the memory error analysis within Intel Inspector XE and/or Intel Parallel Inspector.</p>
<p><b>Useful Settings for memory error analysis in Intel® Inspector XE:</b></p>
<table cellpadding="0" cellspacing="0" border="1">
<tbody>
<tr>
<td width="90" valign="top">
<p><b>Linux* Switch</b></p>
</td>
<td width="90" valign="top"><strong>Windows* Switch</strong></td>
<td valign="top">
<p><b>Purpose</b></p>
</td>
</tr>
<tr>
<td width="90" valign="top">
<p>-g</p>
</td>
<td width="90" valign="top">
<p>/Zi, /ZI</p>
</td>
<td valign="top">
<p>Highly Recommended.</p>
<p>Intel Inspector XE uses the symbols to associate addresses to source lines.</p>
<p>Additionally, using this setting is one of the ways in which memory error analysis filters out false positives.</p>
</td>
</tr>
</tbody>
</table>
<p><b></b></p>
<p><b>Settings which impact memory error analysis in Intel Inspector XE:</b></p>
<table cellpadding="0" cellspacing="0" border="1">
<tbody>
<tr>
<td width="90" valign="top"><strong>Linux Switch</strong></td>
<td width="90" valign="top"><strong>Windows Switch</strong></td>
<td valign="top"><strong>Purpose</strong></td>
</tr>
<tr>
<td width="90" valign="top">
<p>-O0</p>
</td>
<td width="90" valign="top">/Od</td>
<td valign="top">
<p>Recommended for Initial analysis.</p>
<p>Allows Intel Inspector XE to more easily associate errors to the correct source line.</p>
<p>Intel Inspector XE can also analyze optimized binaries, but it is difficult to pinpoint the source code location causing a problem in optimized assembly that does not have specific source lines.</p>
<p>Note: While it is easier to do analysis of binaries built with -O0, it is also important to check for memory errors in your <b>"released"</b> (not -O0) version of your binaries.</p>
</td>
</tr>
</tbody>
</table>
<p><b></b></p>
<p><b>Settings not recommended for use with memory error analysis in Intel® Inspector XE:</b></p>
<table cellpadding="0" cellspacing="0" border="1">
<tbody>
<tr>
<td width="90" valign="top">
<p><b>Linux Switch</b></p>
</td>
<td width="90" valign="top">
<p><b>Windows <br />Switch</b></p>
</td>
<td valign="top">
<p><b>Purpose</b></p>
</td>
</tr>
<tr>
<td width="90" valign="top">
<p>-fmudflap<br />-ftrapuv</p>
</td>
<td width="90" valign="top">
<p>/RTC[su1]</p>
</td>
<td valign="top">
<p>Not Recommended.</p>
<p>Options on the compiler which add functionality similar to Intel Inspector XE can cause Intel Inspector XE to have false positives and false negatives.</p>
<p>The -fmudflap switch is known to cause false positives and false negatives with Intel Inspector XE. <br />-ftrapuv is known to cause false negatives.  <br />/RTC[us] initializes uninitialized memory with a bit pattern, which prevents memory error analysis in Intel® Inspector XE from identifying uninitialized memory errors in your code.  Some of the errors this switch detects duplicate those that memory error analysis finds at Level 4.</p>
<p>Switches such as this may impact performance without adding additional functionality.</p>
<p>These switches may be useful outside Intel Inspector XE - and may potentially catch additional issues that Intel Inspector XE does not find.</p>
<p>-fstack-security-check, which adds functionality similar to Intel Inspector XE, is not known to cause false positives or false negatives with Intel Inspector XE.</p>
</td>
</tr>
<tr>
<td width="90" valign="top">
<p>-tprofile</p>
</td>
<td width="90" valign="top">/Qtprofile</td>
<td valign="top">
<p>Do not use.</p>
<p>This Intel Compiler setting is an alternative method of instrumentation for Intel® Thread Profiler. The instrumentation added by -tprofile is not supported by Intel Inspector XE.</p>
</td>
</tr>
<tr>
<td width="90" valign="top">
<p>-tcheck</p>
</td>
<td width="90" valign="top">/Qtcheck</td>
<td valign="top">
<p>Do not use.</p>
<p>This Intel Compiler setting is an alternative method of instrumentation for Intel® Thread Checker. The instrumentation added by -tcheck is not supported by Intel Inspector XE.</p>
</td>
</tr>
<tr>
<td width="90" valign="top">
<p>-msse4a, <br />-m3dnow</p>
</td>
<td width="90" valign="top">N/A</td>
<td valign="top">
<p>Do not use.</p>
<p>Binaries which use instructions not supported by Intel Processors may cause unknown behaviors in Intel Inspector XE.</p>
</td>
</tr>
<tr>
<td width="90" valign="top">
<p>-debug [keyword]</p>
</td>
<td width="90" valign="top">/debug<br />[keyword]</td>
<td valign="top">
<p>Not Recommended.</p>
<p>Intel Inspector XE works best with -debug full (the default). Other options including parallel, extended, emit-column, expr-source-pos, inline-debug-info, semantic-stepping, &amp; variable-locations are not supported by Intel Inspector XE.</p>
</td>
</tr>
</tbody>
</table>
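<p>A build script can check for the switches in the table above before an analysis run. The Python sketch below is one hypothetical way to do that; it covers only the Linux switches listed in the table, matches them by exact spelling, and the function name and messages are illustrative.</p>

```python
# Linux switches from the table above, with a short reason for each.
NOT_RECOMMENDED = {
    "-fmudflap": "can cause false positives and false negatives",
    "-ftrapuv": "can cause false negatives",
    "-tprofile": "Thread Profiler instrumentation is not supported",
    "-tcheck": "Thread Checker instrumentation is not supported",
    "-msse4a": "uses instructions not supported by Intel processors",
    "-m3dnow": "uses instructions not supported by Intel processors",
}


def flag_warnings(compiler_args):
    """Return (switch, reason) pairs for every listed switch found in the args."""
    return [(arg, NOT_RECOMMENDED[arg]) for arg in compiler_args
            if arg in NOT_RECOMMENDED]


# Hypothetical command line for a debug build.
args = ["-g", "-O0", "-ftrapuv", "-tcheck", "main.c"]
for switch, reason in flag_warnings(args):
    print(f"warning: {switch}: {reason}")
```

<p>Switches that take arguments, such as -debug [keyword], would need extra parsing and are left out of this sketch.</p>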
<p><b></b></p>
<p><b>Settings which have no impact on memory error analysis of Intel Inspector XE:</b></p>
<table cellpadding="0" cellspacing="0" border="1">
<tbody>
<tr>
<td width="250" valign="top">
<p><b>Linux Switch</b></p>
</td>
<td width="250" valign="top"><strong>Windows Switch</strong></td>
<td valign="top">
<p><b>Purpose</b></p>
</td>
</tr>
<tr>
<td width="250" valign="top">
<p>-static<br />-static-libgcc<br />-static-intel<br />-shared-libgcc<br />-openmp-link</p>
</td>
<td width="250" valign="top">
<p>/MDd, /MD, /MT, /MTd, /Qopenmp-link</p>
</td>
<td valign="top">
<p>These settings direct the compiler to link in various libraries statically or dynamically. These switches impact Intel® Amplifier XE and threading error analysis for Intel Inspector XE. Memory error analysis in Intel Inspector XE works with statically linked libraries.</p>
</td>
</tr>
<tr>
<td width="250" valign="top">
<p>-DTBB_USE_THREADING_TOOLS</p>
</td>
<td width="250" valign="top">
<p>/DTBB_USE_THREADING_TOOLS</p>
</td>
<td valign="top">
<p>Setting TBB_USE_THREADING_TOOLS causes Intel Threading Building Blocks (TBB) to be instrumented. This switch impacts Intel® Amplifier XE and threading error analysis for Intel® Inspector XE. Setting _DEBUG or TBB_USE_DEBUG will in turn set TBB_USE_THREADING_TOOLS</p>
</td>
</tr>
<tr>
<td width="250" valign="top">N/A</td>
<td width="250" valign="top">
<p>/FIXED[:NO]</p>
</td>
<td valign="top">
<p>This setting allows binaries to be instrumented and is not required for Intel Inspector XE.</p>
</td>
</tr>
</tbody>
</table>
<p><b>Notes:</b> <br />1) Memory Error Analysis Level 1 (Memory Leak Detection) requires information in the executable and all shared libs in your application to properly walk the call stack:</p>
<p>a) Frame pointers: Use -fno-omit-frame-pointer.</p>
<p>b) Exception handling information: enabled via -fasynchronous-unwind-tables, -fexceptions, or -O0<br /><br />2) Using Debug versions of the Microsoft C Runtime Libraries (/MDd and /MTd) enables the Microsoft* debug heap manager. See: <a href="http://software.intel.com/en-us/articles/using-the-microsoft-debug-heap-manager-with-memory-error-analysis-of-intel-parallel-inspector/">Using the Microsoft* debug heap manager with memory error analysis of Intel® Parallel Inspector</a>.<br /><br />Note: Other options may add frame pointers or exception handling information to your binary as a side effect, for example -fexceptions (which is the default for C++) or -O0. To make sure the executable (and shared libs) have this information, use the objdump -h &lt;binary&gt; command. You should see an .eh_frame_hdr section in the output.</p>
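<p>The objdump check described in the note can be automated. The Python sketch below scans the text printed by objdump -h for a section name; the sample output is abbreviated and hypothetical, and the function name is illustrative.</p>

```python
def has_section(objdump_headers_text, section_name):
    """True if the section table printed by objdump -h lists section_name."""
    for line in objdump_headers_text.splitlines():
        fields = line.split()
        # Section rows start with an index number followed by the section name.
        if len(fields) >= 2 and fields[0].isdigit() and fields[1] == section_name:
            return True
    return False


# Abbreviated, hypothetical output of: objdump -h myapp
sample_output = """
Sections:
Idx Name          Size      VMA               LMA               File off  Algn
 12 .text         000001a2  0000000000400430  0000000000400430  00000430  2**4
 15 .eh_frame_hdr 00000034  00000000004005f0  00000000004005f0  000005f0  2**2
 16 .eh_frame     000000f4  0000000000400628  0000000000400628  00000628  2**3
"""

print(has_section(sample_output, ".eh_frame_hdr"))
```

<p>On a real system the text would come from running objdump, for example with subprocess.run(["objdump", "-h", path], capture_output=True, text=True).stdout, once for the executable and once for each shared library.</p>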
<p><b>More Information:</b></p>
<p>This article addressed the most obvious switches that developers would have concerns over. Most switches will work with Intel® Parallel Inspector and/or Intel Inspector XE, but not every switch combination is tested. If you have information regarding other switches, please add a comment to this article. If you have a question regarding a particular switch, please submit an issue to the <a href="http://software.intel.com/en-us/forums/intel-inspector-xe/">Intel Inspector XE Forum</a>.</p>
<p><b>Versions:</b><br />Intel® Inspector XE 2011<br />Intel® C++ Compiler 11.X,12.X<br />GCC Compiler 3.4.6 – GCC 4.5.0<br />MS Visual Studio 2005, 2008, 2010</p> ]]></description>
      <link>http://software.intel.com/en-us/articles/compiler-settings-for-memory-error-analysis-in-intel-inspector-xe/</link>
      <pubDate>Tue, 05 Apr 2011 00:00:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/compiler-settings-for-memory-error-analysis-in-intel-inspector-xe/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/compiler-settings-for-memory-error-analysis-in-intel-inspector-xe/</guid>
      <category>Tools</category>
      <category>Intel SW Partner program</category>
      <category>Game Development</category>
    </item>
    <item>
      <title>How-to Analyze Linux* Applications with the Intel® Thread Profiler for Windows* </title>
      <description><![CDATA[ <p><b>Summary:</b></p>
<p class="sectionBody"><span class="sectionBody">The Intel<sup>®</sup> Thread Profiler feature of VTune™ Performance Analyzer for Windows* offers a powerful browser to examine data collected during instrumented runs of application code but unfortunately it only runs in the Microsoft Windows* environment.  However, the "Enabling collector for Linux*" enables users to do native data collection of their applications on their Linux machine and view those results on a Windows machine. This article describes the process.</span><br /><br /><b>Tools Needed:</b></p>
<ul>
<li>Intel VTune Performance Analyzer 9.1 for Windows* which includes 
<ul>
<li>Intel Thread Profiler for Windows*</li>
<li>Enabling collector for Linux* (tprofile_cl - which you get by installing  Tprofile3.1_XXXrdc_lin.tar.gz)</li>
</ul>
</li>
</ul>
<p >This collector is available in the VTune Performance Analyzer product download area on <a href="http://registrationcenter.intel.com">http://registrationcenter.intel.com</a></p>
<ul>
<li>Intel<sup>®</sup> Compiler for Linux* (optional)</li>
</ul>
<p>For information on purchasing or evaluating these products please visit <a href="http://www.intel.com/software/products">http://www.intel.com/software/products</a>.</p>
<p><b>Binary Instrumentation:</b></p>
<p>What: Use for most IA-32 and Intel<sup>®</sup> 64 architecture based POSIX*, Intel<sup>®</sup> Threading Building Blocks, and OpenMP*<sup>§</sup> applications.</p>
<p>How:</p>
<p>1)     Setup the Intel Thread Profiler Environment for IA-32 architecture by executing:</p>
<p># source /opt/intel/itt/tprofile/bin/32/tprofilevars.sh</p>
<p>or for Intel 64 architecture:<br /><br /># source /opt/intel/itt/tprofile/bin/32e/tprofilevars.sh</p>
<p>depending on your environment.</p>
<p>2)     Run the application using the Intel Thread Profiler Command Line <br /><br /># tprofile_cl <i>myapp</i></p>
<p>3)     Copy<sup>† </sup>the newly created results directory (default = threadprofiler) containing the .tp files back to the Windows system.</p>
<p >a.      Optional: Copy<sup>†</sup> the binaries (including any .so files) and source to the Windows system</p>
<p>4)     Open<sup>†</sup> (File:Open) bistro.tp in Intel Thread Profiler for Windows.  If your application creates multiple processes, there will be a tprofile.<i>pid</i>.tp file for each process created by the main process.  You will have to open the tprofile.<i>pid</i>.tp file for the process you want to analyze.</p>
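<p>The Linux side of the steps above can be scripted. The Python sketch below builds the two shell commands for a given architecture and lists the .tp files found in a results directory so they can be copied to the Windows system. The architecture labels "ia32" and "intel64" and the function names are hypothetical; the script paths, the tprofile_cl command, and the default directory name are the ones given in the steps.</p>

```python
from pathlib import Path

# Setup scripts named in step 1; the dictionary keys are illustrative labels.
SETUP_SCRIPTS = {
    "ia32": "/opt/intel/itt/tprofile/bin/32/tprofilevars.sh",
    "intel64": "/opt/intel/itt/tprofile/bin/32e/tprofilevars.sh",
}


def collection_steps(architecture, application):
    """Shell commands for steps 1 and 2, to be run in the same shell session."""
    return [
        f"source {SETUP_SCRIPTS[architecture]}",
        f"tprofile_cl {application}",
    ]


def result_files(results_dir="threadprofiler"):
    """Names of the .tp files to copy back (one per process when there are several)."""
    return sorted(path.name for path in Path(results_dir).glob("*.tp"))


for command in collection_steps("intel64", "myapp"):
    print(command)
```

<p>After the run, result_files() returns the files in the default results directory, which helps when the application spawned several processes and only one of them is of interest.</p>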
<p><sup>§</sup> Binary Instrumentation for Intel Thread Profiler works better with the OpenMP* Compatibility Libraries (dynamic version: libiomp5.so or libguide40.so) available via an Intel Compiler. This library has been instrumented for Intel Thread Profiler with the User-Level Synchronization APIs. This library is used by default with the Intel Compiler, and can be used with an OpenMP* GCC* compiled application. If a 3rd party OpenMP* library is used, Thread Profiler can still collect data, but Intel Thread Profiler will not comprehend the OpenMP calls, and the application will be analyzed as a POSIX* application.<br /><br />Note: The OpenMP* compatibility library in Intel<sup>®</sup> C++ and Fortran Compilers 11.1 Update 1 (11.1.046) through 11.1 Update 3 (11.1.059) does not work with Intel Thread Profiler. You need an OpenMP library from either an earlier version or the OpenMP library which ships with Intel C++ and Fortran Compilers 11.1 Update 4 (11.1.061) and later.<br /><br /><b>Source Instrumentation:</b></p>
<p>What: Use for POSIX* applications when Binary Instrumentation is not practical (Servers for example) - or on Intel<sup>®</sup> Itanium<sup>®</sup> architecture systems.</p>
<p>How:</p>
<p>1)     Setup the Intel Thread Profiler Environment for IA-32 architecture by executing either:</p>
<p># source /opt/intel/itt/tprofile/bin/32/tprofilevars.sh</p>
<p>or for Intel 64 architecture:</p>
<p># source /opt/intel/itt/tprofile/bin/32e/tprofilevars.sh</p>
<p>depending on your environment.</p>
<p>2)     Compile the Application with the Intel<sup>®</sup> Compiler using the switch:<br /><br />-tprofile</p>
<p>3)     Run the Binary</p>
<p>4)     Copy<sup>†</sup> the generated .tp file and the .tpd file to the Windows* system.</p>
<p>5)     Copy<sup>†</sup> the binaries (including any .so files)</p>
<p >a.      Optional: Copy the source to the Windows* system</p>
<p>6)     Open<sup>†</sup> (File:Open) .tp in Intel Thread Profiler for Windows*</p>
<p> </p>
<p><b>OpenMP*-Specific: </b></p>
<p>What: This mode is specific to OpenMP* applications.</p>
<p>How:</p>
<p>1)     Compile the application with an Intel Compiler using the switch:<br /><br />-openmp_profile</p>
<p>2)     Run the application</p>
<p>3)     Copy<sup>†</sup> the generated guide.gvs file back to the Windows* system</p>
<p >a.      Optional: Copy<sup>†</sup> the binaries (including any .so files) and source to the Windows* system</p>
<p>4)     Open<sup>†</sup> (File:Open) guide.gvs in Intel Thread Profiler for Windows*. When it loads the data it will ask you the MHz of the system.  Enter the MHz of the Linux* system on which you ran the binary.</p>
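<p>The MHz value requested in the last step can be read from /proc/cpuinfo on the Linux* system. The Python sketch below extracts the "cpu MHz" entries from that file's text; the sample text and function name are hypothetical. Note that this field reports the current frequency, which may be lower than the nominal frequency when frequency scaling is active.</p>

```python
def cpu_mhz_values(cpuinfo_text):
    """Return the 'cpu MHz' value of every logical processor listed."""
    values = []
    for line in cpuinfo_text.splitlines():
        key, separator, value = line.partition(":")
        if separator and key.strip() == "cpu MHz":
            values.append(float(value))
    return values


# Abbreviated, hypothetical contents of /proc/cpuinfo for two logical processors.
sample_cpuinfo = """processor\t: 0
model name\t: Example CPU
cpu MHz\t\t: 2394.000
processor\t: 1
model name\t: Example CPU
cpu MHz\t\t: 2394.000
"""

print(cpu_mhz_values(sample_cpuinfo))
```

<p>On the machine where the binary ran, pass the real file contents instead, for example cpu_mhz_values(open("/proc/cpuinfo").read()).</p>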
<p><br /><sup>†</sup> Cross Mounting the Windows* system to the Linux* system is another perhaps easier way to copy/access files to the Windows* system.<b><br /><br />Useful Compiler Settings for Intel Thread Profiler:</b></p>
<p>
<table cellpadding="0" cellspacing="0" border="1">
<tbody>
<tr>
<td width="300" valign="top">
<p><b>Switch</b></p>
</td>
<td width="511" valign="top">
<p><b>Purpose</b></p>
</td>
</tr>
<tr>
<td width="300" valign="top">
<p>-g<br />(highly recommended)</p>
</td>
<td width="511" valign="top">
<p>Intel Thread Profiler uses the symbols to associate addresses to source lines.</p>
</td>
</tr>
<tr>
<td width="300" valign="top">
<p>"Release" Build (-O2)   <br />(highly recommended)</p>
</td>
<td width="511" valign="top">
<p>The time to execute a section of code may change if you don't use your normal production switches (not -O0), potentially causing you to analyze and attempt to optimize a section of code that is not a performance problem.</p>
</td>
</tr>
<tr>
<td width="300" valign="top">
<p>-tprofile<br />(optional)</p>
</td>
<td width="511" valign="top">
<p>Use this setting to do Source Instrumentation.  Note: Only the Intel Compiler supports this setting</p>
</td>
</tr>
</tbody>
</table>
</p>
<p><b>Useful Settings for OpenMP* Applications compiled with the Intel Compiler for Intel Thread Profiler:</b></p>
<p>
<table cellpadding="0" cellspacing="0" border="1">
<tbody>
<tr>
<td width="300" valign="top">
<p><b>Switch</b></p>
</td>
<td width="537" valign="top">
<p><b>Purpose</b></p>
</td>
</tr>
<tr>
<td width="300" valign="top">
<p>-openmp<br />(required)</p>
</td>
<td width="537" valign="top">
<p>Use this setting if you are using Binary instrumentation analysis</p>
</td>
</tr>
<tr>
<td width="300" valign="top">
<p>-openmp_profile<br />(optional)</p>
</td>
<td width="537" valign="top">
<p>Use this setting if you are using OpenMP Specific Analysis</p>
</td>
</tr>
</tbody>
</table>
</p>
<p><br /><b>Useful Setting for applications using Intel Threading Building Blocks for Intel Thread Profiler:</b></p>
<p>
<table cellpadding="0" cellspacing="0" border="1">
<tbody>
<tr>
<td width="300" valign="top">
<p><b>Switch</b></p>
</td>
<td width="537" valign="top">
<p><b>Purpose</b></p>
</td>
</tr>
<tr>
<td width="300" valign="top">
<p>-D <br />"TBB_USE_THREADING_TOOLS"<br />(highly recommended)</p>
</td>
<td width="537" valign="top">
<p>This setting adds the User-Level Synchronization APIs, allowing Intel Thread Profiler to properly identify Intel Threading Building Blocks.</p>
</td>
</tr>
</tbody>
</table>
</p>
<p> </p> ]]></description>
      <link>http://software.intel.com/en-us/articles/how-to-analyze-linux-applications-with-the-intel-thread-profiler-for-windows/</link>
      <pubDate>Fri, 08 Jan 2010 00:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/how-to-analyze-linux-applications-with-the-intel-thread-profiler-for-windows/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/how-to-analyze-linux-applications-with-the-intel-thread-profiler-for-windows/</guid>
      <category>Parallel Programming</category>
      <category>Tools</category>
      <category>Intel SW Partner program</category>
      <category>Intel® Thread Profiler for Windows* Knowledge Base</category>
      <category>Code &amp; Downloads</category>
    </item>
  </channel></rss>
