<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated on Tue, 24 Nov 2009 23:59:27 -0800 -->
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <atom:link href="http://software.intel.com/en-us/articles/multi-core/type/code/feed/" rel="self" type="application/rss+xml" />
    <title>Intel Software Network articles feed</title>
    <link>http://software.intel.com/en-us/articles/multi-core/code//all</link>
    <description></description>
    <language>en-us</language>
    <item>
      <title>Threading Challenge 2009 - Phase 2 - #5:  3-D Convex Hull</title>
      <description><![CDATA[ <img src="http://software.intel.com/file/15368" alt="746_125.jpg" title="746_125.jpg" /><br /><br />Below you will find many of the entries received for our <strong>5th problem of phase 2 - 3-D Convex Hull</strong>.  Please feel free to review and join us in the <strong><a href="http://software.intel.com/en-us/forums/3-d-convex-hull/">forum</a></strong> dedicated to this problem to discuss.<br /><br /><br /><span class="sectionBodyText"><span class="sectionHeading">Winning Submission:<br /></span><br /><br /><span class="sectionHeadingText"><strong>*BradleyKuszmaul:</strong>  Code / Write-up</span> (to be posted soon)<br /><br /><br /><span class="sectionHeading">Other Entries:</span><br /><br />*<br /><br /></span> ]]></description>
      <link>http://software.intel.com/en-us/articles/threading-challenge-2009-phase-2-5-3-d-convex-hull</link>
      <pubDate>Thu, 12 Nov 2009 14:15:29 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/threading-challenge-2009-phase-2-5-3-d-convex-hull#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/threading-challenge-2009-phase-2-5-3-d-convex-hull</guid>
      <category>Parallel Programming</category>
    </item>
    <item>
      <title>Threading Challenge 2009 - Phase 2 - #4: The Travelling Baseball Fans</title>
      <description><![CDATA[ <img src="http://software.intel.com/file/15368" alt="746_125.jpg" title="746_125.jpg" /><br /><br />Below you will find many of the entries received for our <strong>4th problem of phase 2 - The Travelling Baseball Fans</strong>.  Please feel free to review and join us in the <strong><a href="http://software.intel.com/en-us/forums/travelling-baseball-fan/">forum</a></strong> dedicated to this problem to discuss.<br /><br /><br /><span class="sectionHeading">Winning Submission:<br /></span><br /><br /><span class="sectionHeadingText">*akki:  <a href="http://software.intel.com/file/23513">Code</a> / <a href="http://software.intel.com/file/23512">Write-up</a></span> <br /><br /><br /><span class="sectionHeading">Other Submissions:</span><br /><br /><br />*alinac:  Code / <a href='http://software.intel.com/file/23679'>Write-up</a><br /><br />*BradleyKuszmaul:  <a href='http://software.intel.com/file/23680'>Code</a> / <a href='http://software.intel.com/file/23681'>Write-up</a><br /><br />*avparate:  <a href='http://software.intel.com/file/23682'>Code</a>  / <a href='http://software.intel.com/file/23683'>Write-up</a><br /><br />*mdm100:  Code / Write-up<br /><br />*shikantaza:  Code / Write-up  ]]></description>
      <link>http://software.intel.com/en-us/articles/threading-challenge-2009-phase-2-4-the-traveling-baseball-fans</link>
      <pubDate>Mon, 02 Nov 2009 09:51:46 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/threading-challenge-2009-phase-2-4-the-traveling-baseball-fans#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/threading-challenge-2009-phase-2-4-the-traveling-baseball-fans</guid>
      <category>Parallel Programming</category>
    </item>
    <item>
      <title>Threading Challenge 2009 - Phase 2 - #3:  Graph Coloring</title>
      <description><![CDATA[ <img src="http://software.intel.com/file/15368" alt="746_125.jpg" title="746_125.jpg" /><br /><br />Below you will find many of the entries received for our <strong>3rd problem of phase 2 - Graph Coloring</strong>.  Please feel free to review and join us in the <strong><a href="http://software.intel.com/en-us/forums/graph-coloring/">forum</a></strong> dedicated to this problem to discuss.<br /><br /><br /><span class="sectionHeading">Winning Submission:<br /></span><br /><br /><span class="sectionHeadingText">*akki:  <a href="http://software.intel.com/file/23129">Code</a> / <a href="http://software.intel.com/file/23130">write-up</a> <br /></span><br /><br /><span class="sectionHeading">Other Submissions:</span><br /><br /><br />*mdm100:  Code / Write-up<br /><br />*BradleyKuszmaul:  <a href='http://software.intel.com/file/23684'>Code</a> / <a href='http://software.intel.com/file/23685'>Write-up</a><br /><br />*avparate:  Code / Write-up<br /> ]]></description>
      <link>http://software.intel.com/en-us/articles/threading-challenge-2009-phase-2-3-graph-coloring</link>
      <pubDate>Fri, 16 Oct 2009 13:46:34 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/threading-challenge-2009-phase-2-3-graph-coloring#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/threading-challenge-2009-phase-2-3-graph-coloring</guid>
      <category>Parallel Programming</category>
    </item>
    <item>
      <title>Threading Challenge 2009 - Phase 2 - #2: Knights Tour</title>
      <description><![CDATA[ <img src="http://software.intel.com/file/15368" alt="746_125.jpg" title="746_125.jpg" /><br /><br />Below you will find many of the entries received for our <strong>2nd problem of phase 2 - Knights Tour</strong>.  Please feel free to review and join us in the <a href="http://software.intel.com/en-us/forums/strassens-algorithm/"><strong>forum</strong></a> dedicated to this problem to discuss.<br /><br /><br /><span class="sectionHeading">Winning Submission:</span><br /><br /><br /><span class="sectionHeadingText">*mdm100: </span><a href="http://software.intel.com/file/22720"> <strong>Code</strong> </a>/ <a href="http://software.intel.com/file/22721"><strong>Write-up</strong></a><br /><br /><br /><br /><span class="sectionHeading">Other Submissions:<br /></span><br /><br />*Still to come! ]]></description>
      <link>http://software.intel.com/en-us/articles/threading-challenge-2009-phase-2-2-knights-tour</link>
      <pubDate>Thu, 01 Oct 2009 15:48:43 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/threading-challenge-2009-phase-2-2-knights-tour#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/threading-challenge-2009-phase-2-2-knights-tour</guid>
      <category>Parallel Programming</category>
    </item>
    <item>
      <title>Threading Challenge 2009 - Problem 1: Radix Sort Entries</title>
      <description><![CDATA[ <p><img src="http://software.intel.com/file/15359" /><br /><br />Below you will find many of the entries received for our <strong>1st problem - Radix Sort</strong>.  Please feel free to review and join us in the <a href="http://software.intel.com/en-us/forums/radix-sort/"><strong><span style="text-decoration: underline;">forum</span></strong></a> dedicated to this problem to discuss.<br /><br /><br /><span class="sectionHeading">Winning Submission:</span>  <br /><br /><strong class="sectionHeadingText">*denghui0185:  </strong><a href="http://software.intel.com/file/22036" class="sectionHeadingText">Code</a><span class="sectionHeadingText"> / </span><a href="http://software.intel.com/file/21777" class="sectionHeadingText">English Write-up</a><span class="sectionHeadingText"> /</span><a href="http://software.intel.com/file/21776" class="sectionHeadingText"> Mandarin Write-up</a><br /><br /><br /><span class="sectionHeading">Other Submissions: </span><br /><br /><strong>*akki:  </strong><a href="http://software.intel.com/file/21775">Code</a> / <a href="http://software.intel.com/file/21774">Write-up</a><br /><br /><strong>*ikipou:</strong>  <a href="http://software.intel.com/file/22388">Code</a> / <a href="http://software.intel.com/file/22389">Write-up</a><br /><br /><strong>*jne100:</strong>  <a href="http://software.intel.com/file/22390">Code</a> / <a href="http://software.intel.com/file/22391">Write-up</a><br /><br />*andreyryabov:  <a href='http://software.intel.com/file/22395'>Code</a> / <a href='http://software.intel.com/file/22396'>Write-up</a><br /><br />*Dmitriy Vyukov:  <a href='http://software.intel.com/file/22397'>Code</a> / <a href='http://software.intel.com/file/22398'>Write-up</a><br /><br />*pfrey:  Code / Write-up<br /><br />*licstar:  Code / Write-up<br /><br />*emacswu:  Code / Write-up<br /><br />*dweeberlyloom:  Code / Write-up<br /><br />*hoajn:  Code / Write-up<br /><br />*nickbes:  Code / Write-up<br /><br />*adrcto:  Code / Write-up<br /><br />*m_kirov:  Code / Write-up</p> ]]></description>
      <link>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-1-radix-sort-entries</link>
      <pubDate>Tue, 11 Aug 2009 16:30:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-1-radix-sort-entries#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/threading-challenge-2009-problem-1-radix-sort-entries</guid>
      <category>Parallel Programming</category>
      <category>Intel Software Network communities</category>
    </item>
    <item>
      <title>Threading Challenge 2009 - Problem 6:  Line Segment Intersection Entries</title>
      <description><![CDATA[ <img src="http://software.intel.com/file/15359" /><br /><br />Below you will find many of the entries received for our <strong>6th problem - Line Segment Intersection.</strong>  Please feel free to review and join us in the <a href="http://software.intel.com/en-us/forums/line-segments/"><strong><span style="text-decoration: underline;">forum</span></strong></a> dedicated to this problem to discuss.<br /><br /><br /><br /><span class="sectionHeading">Winning Submission:</span><strong><span class="sectionHeading">  </span><br /><br /><br />*BradleyKuszmaul:  </strong><a href="javascript:void(0)" onclick="ndownload('http://software.intel.com/file/21625')">Code</a> / <a href="javascript:void(0)" onclick="ndownload('http://software.intel.com/file/21626')">Write-up</a><br /><br /><br /><br /><br /><span class="sectionHeading">Other Submissions (more to be added soon): <br /></span><br /><br /><strong>*akki:</strong>  <a href="javascript:void(0)" onclick="ndownload('http://software.intel.com/file/21627')">Code</a> / <a href="javascript:void(0)" onclick="ndownload('http://software.intel.com/file/21628')">Write-up</a><br /><br /><strong>*denghui0815:</strong>  <a>Code</a> / <a href="http://software.intel.com/file/22472">Write-up</a> (Mandarin)<br /><br /><strong>*Dmitriy Vyukov:</strong>  <a href="http://software.intel.com/file/22473">Code</a> / <a href="http://software.intel.com/file/22470">Write-up</a><br /><br /><strong>*mikhailsemenov:</strong>  <a href="http://software.intel.com/file/22474">Code</a> / <a href="http://software.intel.com/file/22471">Write-up</a> ]]></description>
      <link>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-6-line-segment-intersection-entries</link>
      <pubDate>Tue, 11 Aug 2009 16:24:39 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-6-line-segment-intersection-entries#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/threading-challenge-2009-problem-6-line-segment-intersection-entries</guid>
      <category>Parallel Programming</category>
      <category>Intel Software Network communities</category>
    </item>
    <item>
      <title>Threading Challenge 2009 - Problem 5:  Knapsack Entries</title>
      <description><![CDATA[ <img src="http://software.intel.com/file/15359" /><br /><br />Below you will find many of the entries received for our <strong>5th problem - Knapsack.</strong>  Please feel free to review and join us in the <a href="http://software.intel.com/en-us/forums/knapsack-problem/"><strong><span style="text-decoration: underline;">forum</span></strong></a> dedicated to this problem to discuss.<br /><br /><br /><br /><span class="sectionHeading">Winning Submission:</span><strong><span class="sectionHeading"> </span> <br /><br /><br /><span class="sectionHeadingText">*matteocilk:  </span></strong><a href="http://software.intel.com/file/21779" class="sectionHeadingText">Code / </a><a href="http://software.intel.com/file/21780" class="sectionHeadingText">Write-up</a><br /><br /><br /><br /><span class="sectionHeading">Other Submissions: </span><br /><br /><br /><strong>*denghui0815:</strong>  <a href="http://software.intel.com/file/22468">Code</a> / <a href="http://software.intel.com/file/22465">Write-up</a> (Mandarin)<br /><br /><strong>*haojn:</strong>  <a href="http://software.intel.com/file/22467">Code</a> / <a href="http://software.intel.com/file/22464">Write-up</a><br /><br /><strong>*Dmitriy Vyukov:</strong>  <a href="http://software.intel.com/file/22469">Code</a> / <a href="http://software.intel.com/file/22466">Write-up</a> ]]></description>
      <link>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-5-knapsack-entries</link>
      <pubDate>Tue, 11 Aug 2009 16:11:21 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-5-knapsack-entries#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/threading-challenge-2009-problem-5-knapsack-entries</guid>
      <category>Parallel Programming</category>
      <category>Intel Software Network communities</category>
    </item>
    <item>
      <title>Threading Challenge 2009 - Problem 3:  Searching Entries</title>
      <description><![CDATA[ <img src="http://software.intel.com/file/15359" /><br /><br /><span class="sectionBodyText">Below you will find many of the entries received for our <strong>3rd problem - Searching.</strong>  Please feel free to review and join us in the </span><a href="http://software.intel.com/en-us/forums/searching/"><strong class="sectionBodyText"><span style="text-decoration: underline;">forum</span></strong></a><span class="sectionBodyText"> dedicated to this problem to discuss.</span><br /><br /><br /><span class="sectionHeading">Winning Submission:</span><strong> <br /><br /><br /><span class="sectionHeadingText">*denghui0185:  </span></strong><a href="http://software.intel.com/file/22037" class="sectionHeadingText">Code </a><span class="sectionHeadingText">/ </span><a href="http://software.intel.com/file/21783" class="sectionHeadingText">Write-up Mandarin</a><br /><br /><br /><span class="sectionBodyText"><span class="sectionHeading">Other Submissions:<br /></span><br /><br /><strong>*akki: </strong><a href="http://software.intel.com/file/21781">Code</a> / <a href="http://software.intel.com/file/21782">Write-up</a><br /><br />*guzheng2000:  Code / Write-up<br /><br />*andreyryabov:  Code / Write-up<br /><br />*pfrey:  Code / Write-up<br /><br />*calebe:  Code / Write-up<br /><br />*hpc_2002:  Code / Write-up<br /><br />*Dmitriy Vyukov:  Code / Write-up<br /><br />*jne100:  Code / Write-up<br /><br />*dweeberlyloom:  Code / Write-up<br /><br />*iarchitect:  Code / Write-up</span> ]]></description>
      <link>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-3-searching-entries</link>
      <pubDate>Tue, 11 Aug 2009 16:04:21 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-3-searching-entries#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/threading-challenge-2009-problem-3-searching-entries</guid>
      <category>Parallel Programming</category>
      <category>Intel Software Network communities</category>
    </item>
    <item>
      <title>Threading Challenge 2009 - Problem 4:  String Matching Entries</title>
      <description><![CDATA[ <img src="http://software.intel.com/file/15359" /><br /><br />Below you will find many of the entries received for our <strong>4th problem - String Matching.</strong>  Please feel free to review and join us in the <a href="http://software.intel.com/en-us/forums/string-matching/"><strong><span style="text-decoration: underline;">forum</span></strong></a> dedicated to this problem to discuss.<br /><br /><br /><br /><span class="sectionHeading">Winning Submission:</span><strong>  <br /><br /><br /><span class="sectionHeadingText">*BradleyKuszmaul:  </span></strong><span class="sectionHeadingText"> </span><a href="javascript:void(0)" onclick="ndownload('http://software.intel.com/file/21632')" class="sectionHeadingText">Code</a><span class="sectionHeadingText"> / </span><a href="javascript:void(0)" onclick="ndownload('http://software.intel.com/file/21631')" class="sectionHeadingText">Write-up</a><br /><br /><br /><br /><span class="sectionBodyText"><span class="sectionHeading">Other Submissions:<br /></span><br /><br /><strong>*akki:</strong>  <a href="javascript:void(0)" onclick="ndownload('http://software.intel.com/file/21630')">Code</a> / <a href="javascript:void(0)" onclick="ndownload('http://software.intel.com/file/21629')">Write-up</a><br /><br /><strong>*jne100:</strong>  <a href="http://software.intel.com/file/22461">Code</a> / <a href="http://software.intel.com/file/22460">Write-up</a><br /><br /><strong>*haojn:</strong>  <a href="http://software.intel.com/file/22462">Code</a> / <a href="http://software.intel.com/file/22463">Write-up</a><br /><br /><strong>*denghui0815:</strong>  Code / <a href="http://software.intel.com/file/22459">Write-up</a><br /><br /><strong>*Sergii Biloshytskyi:</strong>  Code / Write-up<br /><br /><strong>*javadude:</strong>  Code / Write-up</span> ]]></description>
      <link>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-4-string-matching-entries</link>
      <pubDate>Tue, 11 Aug 2009 15:46:03 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/threading-challenge-2009-problem-4-string-matching-entries#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/threading-challenge-2009-problem-4-string-matching-entries</guid>
      <category>Parallel Programming</category>
      <category>Intel Software Network communities</category>
    </item>
    <item>
      <title>Using CPUID to Detect the presence of SSE 4.1 and SSE 4.2 Instruction Sets</title>
      <description><![CDATA[ <p class="sectionHeading">Download PDF</p>
<p><a href="http://software.intel.com/file/21560">Using CPUID to Detect the presence of SSE 4.1 and SSE 4.2 Instruction Sets</a> [PDF | 132kb]</p>
<p class="sectionHeading">Introduction</p>
<p>Intel has published several application notes to help customers determine which processor an application is running on and which features that processor supports.  This information may then be used to choose appropriate code paths for processor-specific optimizations, or to selectively enable features based on processing power.</p>
<p>This application note shows a set of code sequences that determine whether the processor being queried supports the SSE 4.1 and SSE 4.2 instruction sets.  The code was designed to run on Intel 64 Architecture processors running a 32-bit or 64-bit Windows or Linux operating system.  The code as shown is designed to be compiled with the Intel compiler, although only minor changes would be required to compile it with other compilers.</p>
<p>At least two prior reference articles cover or touch on the CPUID topic.  The two referenced for this application note are described below.</p>
<p>App Note 485, “Intel® Processor Identification and the CPUID Instruction<a href="#_edn1" name="_ednref1" id="_ednref1"> </a>”, explains in depth how to distinguish the various Intel Architecture processors, starting with the original 8086. Several customers have requested assistance with CPUID code sequences that operate under more constrained circumstances and can therefore be simplified substantially compared with the general assumptions made in App Note 485.</p>
<p>Another Intel reference is the article titled “Intel® 64 Architecture Processor Topology Enumeration.”  This article covers much more than CPUID.  However, it contains code for the CPUID sequence that is much simpler for our usage, so it is also listed as a reference.</p>
<p>It should be noted that the Intel® compiler also supports functionality that removes the burden of CPUID coding from the user and may be preferable. The Intel compiler can automatically generate multiple code paths, together with the appropriate CPUID code sequence and the runtime code path selection code, on a per-function basis.  The user can specify which functions should have specific code paths and for which target processors these code paths should be generated.  The user may write the code for each code path or rely on the compiler's auto-vectorization capability.  This topic is beyond the scope of this app note and is not covered further here.</p>
<p class="sectionHeading">Usage Guidelines</p>
<p>This code illustrates how to determine whether a particular processor is an Intel processor that supports the SSE 4.1 and SSE 4.2 instruction sets, operating under the following conditions:</p>
<ol>
<li>The target processor must be a 32 bit capable processor.  The presence of the CPUID instruction is  determined by checking the ability to toggle bit 21 of the EFLAGS register as  specified in the section “Detecting the CPUID Instruction” in Application note  485. </li>
<li>Running a 32-bit or 64-bit Windows* or Linux* operating system.  (The general principles apply to other operating systems, but the code may require modification in order to compile and function correctly, due to potential differences in the Application Binary Interfaces (ABIs) of other operating systems.)</li>
<li>Compiled with the Intel® Compiler for the desired target.</li>
<li>Use of the -use-msasm switch with the Intel® compiler when the target is Linux.  This switch allows the use of Microsoft assembly syntax, which avoids the need for different versions of the source code for the two operating systems.  This may not be generally possible because of the differences in ABI (application binary interface) between Linux and Windows, but it is a successful strategy when applicable.  Note that the Intel® Compiler is also capable of compiling GNU-style assembly code for Windows targets, though all assembly code in this application note is Windows style.</li>
</ol>
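<p>The conditions above assume the Intel compiler and Microsoft-style inline assembly. For comparison, the following is a minimal sketch, not part of the original application note, of the same check for GCC or Clang on an x86 target. It assumes the <code>__get_cpuid</code> helper from those compilers' <code>&lt;cpuid.h&gt;</code> header in place of inline assembly. The function name <code>isSSE41andSSE42SupportedSketch</code> and the macro <code>SKETCH_HAVE_CPUID</code> are hypothetical, and on any other compiler or architecture the sketch simply reports 0.</p>

```cpp
#include <cstring>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>  // GCC/Clang header that provides __get_cpuid
#define SKETCH_HAVE_CPUID 1
#else
#define SKETCH_HAVE_CPUID 0
#endif

// Hypothetical name: returns 1 on a GenuineIntel processor that reports both
// SSE 4.1 and SSE 4.2, and 0 otherwise (including non-x86 builds).
int isSSE41andSSE42SupportedSketch(void)
{
#if SKETCH_HAVE_CPUID
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    // Leaf 0: the vendor string is returned in EBX, EDX, ECX (in that order).
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 0;
    char vendor[13];
    std::memcpy(vendor + 0, &ebx, 4);
    std::memcpy(vendor + 4, &edx, 4);
    std::memcpy(vendor + 8, &ecx, 4);
    vendor[12] = '\0';
    if (std::strcmp(vendor, "GenuineIntel") != 0)
        return 0;

    // Leaf 1: ECX bit 19 is SSE 4.1 and ECX bit 20 is SSE 4.2.
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    const unsigned int checkBits = (1u << 19) | (1u << 20);
    return ((ecx & checkBits) == checkBits) ? 1 : 0;
#else
    return 0;  // no CPUID helper available in this build
#endif
}
```

<p>Because <code>__get_cpuid</code> returns 0 when the requested leaf is not supported, the sketch needs no separate EFLAGS bit 21 test on the targets it covers.</p>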
<p class="sectionHeading">Performance</p>
<p>This code is not itself designed for high performance.  CPUID is not a fast-executing instruction, so it should not be called repeatedly to choose between code paths when more than one path is provided for optimization purposes.  Instead, this code should be called once at initialization time, and the result should be stored and then used either to load the correct shared library or to set a global variable that is checked when a code path is chosen.</p>
<p class="sectionHeading">Conclusion</p>
The source code provided illustrates that it is fairly simple to determine whether a processor supports the SSE 4.1 and SSE 4.2 instruction sets.  The code can easily be modified to detect other features, designated by other CPUID feature bits, by referring to the Intel Software Developer's Manual.    <br /><br />
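<p>As a hedged illustration of such a modification, the sketch below generalizes the mask test so that any combination of feature bits can be checked against a register value returned by CPUID. The constant and function names are hypothetical. Bits 19 and 20 match the flags used in this application note; the other bit positions are taken from the CPUID leaf 1 ECX definition in the Software Developer's Manual and should be confirmed there before use.</p>

```cpp
// Selected feature bits reported in ECX by CPUID with EAX = 1.
const unsigned int ECX_SSE3   = 1u << 0;
const unsigned int ECX_SSSE3  = 1u << 9;
const unsigned int ECX_SSE4_1 = 1u << 19;  // same value as SSE4_1_FLAG
const unsigned int ECX_SSE4_2 = 1u << 20;  // same value as SSE4_2_FLAG
const unsigned int ECX_POPCNT = 1u << 23;

// Returns 1 only if every bit set in mask is also set in reg, 0 otherwise.
int hasAllFeatureBits(unsigned int reg, unsigned int mask)
{
    return ((reg & mask) == mask) ? 1 : 0;
}
```

<p>With the structure used in this note, a call such as <code>hasAllFeatureBits(Info.ECX, ECX_SSSE3 | ECX_POPCNT)</code> after <code>get_cpuid_info(&amp;Info, 0x1, 0x0)</code> would test for both features at once.</p>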
<p class="sectionHeading">Source Code</p>
<pre name="code" class="cpp"> <br /><br />/************ Beginning of source file sse41andsse42detection.cpp ********************/<br />/*	Copyright 2009 Intel Corporation <br /> *	sse41andsse42detection.cpp <br /> *	This file uses code first published by Intel as part of the processor enumeration<br /> *	article available on the internet at:<br /> *	http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/<br /> *	Some of the original code from cpu_topo.c<br /> *	has been removed, while other code has been added to illustrate the CPUID usage<br /> * 	to determine if the processor supports the SSE 4.1 and SSE 4.2 instruction sets.<br /> *	The reference code provided in this file is for demonstration purposes only. It assumes<br /> *	the hardware topology configuration within a coherent domain does not change during<br /> *	the life of an OS session. If an OS supports advanced features that can change <br /> *	hardware topology configurations, more sophisticated adaptation may be necessary<br /> *	to account for hardware configuration changes that might have increased or reduced <br /> *  	the number of logical processors being managed by the OS.<br /> *<br /> *	Users of this code should be aware that the provided code<br /> *	relies on the CPUID instruction providing raw data reflecting the native hardware <br /> *	configuration. When an application runs inside a virtual machine hosted by a <br /> *	Virtual Machine Monitor (VMM), any CPUID instructions issued by an app (or a guest OS) <br /> *	are trapped by the VMM and it is the VMM's responsibility and decision to emulate  <br /> *	CPUID return data to the virtual machines. 
 *	When deploying topology enumeration code based<br /> *	on CPUID inside a VM environment, the user must consult with the VMM vendor on how a VMM<br /> *	will emulate the CPUID instruction relating to topology enumeration.<br /> *<br /> *	Original code written by Patrick Fay, Ronen Zohar and Shihjong Kuo.<br /> * 	Modified by Garrett Drysdale for current application note.<br /> */<br /><br />#include "sse41andsse42detection.h"<br /><br />#define SSE4_1_FLAG		0x080000<br />#define SSE4_2_FLAG		0x100000<br /><br />int isSSE41andSSE42Supported (void)<br />{<br />	// returns 1 if this is an Intel processor that supports both SSE 4.1 and SSE 4.2<br />	// (Nehalem or later), 0 otherwise<br /><br />	CPUIDinfo Info;<br />	int rVal = 0;<br />	// The code first determines if the processor is an Intel Processor.  If it is, then <br />	// feature flags bit 19 (SSE 4.1) and 20 (SSE 4.2) in ECX after CPUID call with EAX = 0x1<br />	// are checked.<br />	// If both bits are 1 (indicating both SSE 4.1 and SSE 4.2 exist) then <br />	// the function returns 1 <br />	const int CHECKBITS = SSE4_1_FLAG | SSE4_2_FLAG;<br /><br />	if (isGenuineIntel() &gt;= 1)<br />	{<br />	   	// execute CPUID with eax (leaf) = 1 to get feature bits, <br />		// subleaf doesn't matter so set it to zero<br />		get_cpuid_info(&amp;Info, 0x1, 0x0);<br />		if ((Info.ECX &amp; CHECKBITS) == CHECKBITS)<br />		{<br />			rVal = 1;<br />		}<br />	}<br />	return(rVal);<br />}<br /><br />int isGenuineIntel (void)<br />{<br />	// returns largest function # supported by CPUID if it is a Genuine Intel processor AND it supports<br />	// the CPUID instruction, 0 if not<br />	CPUIDinfo Info;<br />	int rVal = 0;<br />	char procString[] = "GenuineIntel";<br />	<br />	if (isCPUIDsupported())<br />	{<br />		// execute CPUID with eax = 0, subleaf doesn't matter so set it to zero<br />		get_cpuid_info(&amp;Info, 0x0, 0x0);<br />		if ((Info.EBX == ((int *)procString)[0]) &amp;&amp; \<br />(Info.EDX == ((int *)procString)[1]) &amp;&amp; (Info.ECX == ((int *)procString)[2]))<br />		{<br />	
		rVal = Info.EAX;<br />		}<br />	}<br />	return(rVal);<br />}<br /><br />#if (defined(__x86_64__) || defined(_M_X64))<br />// This code is assembly for 64 bit target OS.<br />// Assembly code must be compiled with the -use-msasm switch for Linux targets with the <br />// Intel compiler. <br />int isCPUIDsupported (void)<br />{<br />	// returns 1 if CPUID instruction supported on this processor, zero otherwise<br />	// This isn't necessary on 64 bit processors because all 64 bit processors support CPUID<br />	return((int) 1);<br />}<br /><br />void get_cpuid_info (CPUIDinfo *Info, const unsigned int leaf, const unsigned int subleaf)<br />{<br />	// Stores CPUID return Info in the CPUIDinfo structure.<br />	// leaf and subleaf used as parameters to the CPUID instruction<br />	// parameters and register usage designed to be safe for both Windows and Linux<br />	// Use the Intel compiler option -use-msasm when the target is Linux<br />	__asm <br />	{<br />		mov r10d, subleaf	; arg2, subleaf (in R8 on WIN, in RDX on Linux)<br />		mov r8, Info		; arg0, array addr (in RCX on WIN, in RDI on Linux)<br />		mov r9d, leaf		; arg1, leaf (in RDX on WIN, in RSI on Linux)<br />		push rax<br />		push rbx<br />		push rcx<br />		push rdx<br />		mov eax, r9d<br />		mov ecx, r10d<br />		cpuid<br />		mov	DWORD PTR [r8], eax<br />		mov	DWORD PTR [r8+4], ebx<br />		mov	DWORD PTR [r8+8], ecx<br />		mov	DWORD PTR [r8+12], edx<br />		pop rdx<br />		pop rcx<br />		pop rbx<br />		pop rax<br />	}<br />}<br /><br />#else	// 32 bit<br />// Note: make sure the -use-msasm switch is used with the Intel compiler for Linux to get the<br />// ASM code to compile for both Windows and Linux from one version of the source<br /><br />int isCPUIDsupported (void)<br />{<br />	// returns 1 if CPUID instruction supported on this processor, zero otherwise<br />	// This isn't necessary on 64 bit processors because all 64 bit Intel processors support CPUID<br />	// The result is left in EAX, which is the return value register<br />	__asm <br />	{<br />		push ecx ; save ecx<br />		pushfd ; push original EFLAGS<br />		pop eax ; get original EFLAGS<br />		mov ecx, eax ; save original EFLAGS<br />		xor eax, 200000h ; flip bit 21 in EFLAGS<br />		push eax ; save new EFLAGS value on stack<br />		popfd ; replace current EFLAGS value<br />		pushfd ; get new EFLAGS<br />		pop eax ; store new EFLAGS in EAX<br />		xor eax, ecx ; Bit 21 of flags at 200000h will be 1 if CPUID exists<br />		shr eax, 21	 ; Shift bit 21 to bit 0 and return it<br />		push ecx<br />		popfd ; restore bit 21 in EFLAGS first<br />		pop ecx	; restore ecx<br />	}<br />}<br /><br />// Note: make sure the -use-msasm switch is used with the Intel compiler for Linux to get the<br />// ASM code to compile for both Windows and Linux from one version of the source<br />void get_cpuid_info (CPUIDinfo *Info, const unsigned int leaf, const unsigned int subleaf)<br />{<br />	// Stores CPUID return Info in the CPUIDinfo structure.<br />	// leaf and subleaf used as parameters to the CPUID instruction<br />	// parameters and register usage designed to be safe for both Win and Linux<br />	// when using -use-msasm<br />	// No explicit ret is used: the compiler-generated epilogue returns to the caller<br />	__asm <br />	{<br />		mov	edx, Info   ; addr of start of output array<br />		mov	eax, leaf  ; leaf<br />		mov	ecx, subleaf  ; subleaf<br />		push edi<br />		push ebx<br />		mov  edi, edx                      ; edi has output addr<br />		cpuid<br />		mov	DWORD PTR [edi], eax<br />		mov	DWORD PTR [edi+4], ebx<br />		mov	DWORD PTR [edi+8], ecx<br />		mov	DWORD PTR [edi+12], edx<br />		pop ebx<br />		pop edi<br />	}<br />}<br />#endif<br />/************ End of source file sse41andsse42detection.cpp *******************************/<br />/************ Beginning of source file sse41andsse42detection.h ***************************/<br />/*		Copyright 2008 Intel Corporation <br /> *	The source code contained or described herein and all documents related <br /> *  	to the source code ("Material") are owned by Intel Corporation or <br /> *	its suppliers or licensors. 
Use of this material must comply with the <br /> *	rights and restrictions in the accompanying license terms set<br /> *  	forth in file "license.rtf".<br /> *<br /> *	Original code contained in cputopology.h.<br /> * 	This file has been renamed to sse41andsse42detection.h for this app note, code removed, and some <br /> * 	code added.<br /> *<br /> *  	This is the header file that contains type definitions <br /> *  	and prototypes of functions in the file sse41andsse42detection.cpp<br /> *	The source files can be compiled under 32-bit and 64-bit Windows and Linux.<br /> *  <br /> *	Original code written by Patrick Fay and Shihjong Kuo <br /> * 	Modified by Garrett Drysdale for this application note.<br /> */<br /><br />typedef struct <br />{<br />	unsigned __int32 EAX,EBX,ECX,EDX;<br />} CPUIDinfo;<br /><br />void get_cpuid_info (CPUIDinfo *, const unsigned int, const unsigned int);<br />int isCPUIDsupported (void);<br />int isGenuineIntel (void);<br />int isSSE41andSSE42Supported (void);<br />/************ End of source file sse41andsse42detection.h ****************************/</pre>
<br />
<p class="sectionHeading">References</p>
<ul>
<li>App Note 485, “<a href="http://www.intel.com/Assets/PDF/appnote/241618.pdf">Intel® Processor Identification and the CPUID Instruction</a>”.</li>
<li>Intel article titled “<a href="http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/">Intel® 64 Architecture Processor Topology Enumeration</a>”.</li>
</ul>
<br />
<p class="sectionHeading">License  Agreement</p>
<p align="center"><b>Intel®  Source Code License Agreement</b></p>
<p><b>This  license governs use of the accompanying software. By installing or copying all  or any part of the software components in this package, you (“you” or  “Licensee”) agree to the terms of this agreement.  Do not install or copy the software until you  have carefully read and agreed to the following terms and conditions.  If you do not agree to the terms of this  agreement, promptly return the software to Intel Corporation (“Intel”).</b></p>
<p>1. <b>Definitions:</b></p>
<ol type="A">
<li>“Materials” are defined as the software (including the Redistributables and Source as defined herein), documentation, and other materials, including any updates and upgrades thereto, that are provided to you under this Agreement.</li>
<li>"Redistributables"  are the binary files listed in the "redist.txt" file that is included  in the Materials or are otherwise clearly identified as redistributable files  by Intel.</li>
<li>“Source” is the source code file(s)  that: (i) demonstrate(s) certain functions for particular purposes; (ii) are  identified as source code; and (iii) are provided hereunder in source code  form. </li>
<li>“Intel’s  Licensed Patent Claims” means those claims of Intel’s patents that (a) are  infringed by the Source or Redistributables, alone and not in combination, in  their unmodified form, as furnished by Intel to Licensee and (b) Intel has the  right to license.</li>
</ol>
<p>2. <b>License Grant:</b> Subject to all of the terms and conditions of this Agreement:</p>
<ol type="A">
<li>Intel grants to you a non-exclusive, non-assignable copyright license to use the Materials for your internal development purposes only.</li>
<li>Intel  grants to you a non-exclusive, non-assignable copyright license to reproduce the   Source, prepare derivative works of the   Source and distribute the  Source  or any derivative works thereof that you create, as part of the product  or application you develop using the Materials.</li>
<li>Intel grants to you a non-exclusive, non-assignable  copyright license to distribute the Redistributables in binary form, or any  portions thereof, as part of the product or application you develop using the  Materials.</li>
<li>Intel grants Licensee a non-transferable, non-exclusive, worldwide, non-sublicensable license under Intel’s Licensed Patent Claims to make, use, sell, and import the Source and the Redistributables.</li>
</ol>
<p>3. <b>Conditions and Limitations:</b></p>
<ol type="A">
<li>This license does not grant you any rights  to use Intel’s name, logo or trademarks.</li>
<li>Title to the Materials and all copies  thereof remain with Intel.  The Materials  are copyrighted and are protected by United States copyright laws.  You will not remove any copyright notice from  the Materials.  You agree to prevent any  unauthorized copying of the Materials.   Except as expressly provided herein, Intel does not grant any express or  implied right to you under Intel patents, copyrights, trademarks, or trade  secret information.</li>
<li>You may NOT: (i) use or copy the Materials except as provided in this Agreement; (ii) rent or lease the Materials to any third party; (iii) assign this Agreement or transfer the Materials without the express written consent of Intel; (iv) modify, adapt, or translate the Materials in whole or in part except as provided in this Agreement; (v) reverse engineer, decompile, or disassemble the Materials not provided to you in source code form; or (vi) distribute, sublicense or transfer the source code form of any components of the Materials and derivatives thereof to any third party except as provided in this Agreement.</li>
</ol>
<p>4. <b>No Warranty:</b></p>
<p><b>THE  MATERIALS ARE PROVIDED “AS IS”.  INTEL  DISCLAIMS ALL EXPRESS OR IMPLIED WARRANTIES WITH RESPECT TO THEM, INCLUDING ANY  IMPLIED WARRANTIES OF MERCHANTABILITY, NON-INFRINGEMENT, AND FITNESS FOR ANY  PARTICULAR PURPOSE.</b></p>
<p>5. <b>Limitation of  Liability:  NEITHER INTEL NOR ITS  SUPPLIERS SHALL BE LIABLE FOR ANY DAMAGES WHATSOEVER (INCLUDING, WITHOUT  LIMITATION, DAMAGES FOR LOSS OF BUSINESS PROFITS, BUSINESS INTERRUPTION, LOSS  OF BUSINESS INFORMATION, OR OTHER LOSS) ARISING OUT OF THE USE OF OR INABILITY  TO USE THE SOFTWARE, EVEN IF INTEL HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH  DAMAGES.  BECAUSE SOME JURISDICTIONS  PROHIBIT THE EXCLUSION OR LIMITATION OF LIABILITY FOR CONSEQUENTIAL OR  INCIDENTAL DAMAGES, THE ABOVE LIMITATION MAY NOT APPLY TO YOU.</b></p>
<p>6. <b>USER SUBMISSIONS</b>: You agree that any material, information or other communication, including all data, images, sounds, text, and other things embodied therein, you transmit or post to an Intel website or provide to Intel under this Agreement will be considered non-confidential ("Communications"). Intel will have no confidentiality obligations with respect to the Communications. You agree that Intel and its designees will be free to copy, modify, create derivative works, publicly display, disclose, distribute, license and sublicense through multiple tiers of distribution and licensees, incorporate and otherwise use the Communications, including derivative works thereto, for any and all commercial or non-commercial purposes.</p>
<p>7. <b>TERMINATION OF THIS LICENSE</b>: This  Agreement becomes effective on the date you accept this Agreement and will  continue until terminated as provided for in this Agreement.  Intel may terminate this license at any time  if you are in breach of any of its terms and conditions.  Upon termination, you will immediately return  to Intel or destroy the Materials and all copies thereof.</p>
<p>8. <b>U.S. GOVERNMENT RESTRICTED RIGHTS</b>: The Materials are provided with "RESTRICTED RIGHTS". Use, duplication or disclosure by the Government is subject to restrictions set forth in FAR 52.227-14 and DFAR 252.227-7013 et seq. or its successor. Use of the Materials by the Government constitutes acknowledgment of Intel's rights in them.</p>
<p>9. <b>APPLICABLE LAWS</b>: Any claim arising under or  relating to this Agreement shall be governed by the internal substantive laws  of the State of Delaware,  without regard to principles of conflict of laws.  You may not export the Materials in violation  of applicable export laws.</p> ]]></description>
      <link>http://software.intel.com/en-us/articles/using-cpuid-to-detect-the-presence-of-sse-41-and-sse-42-instruction-sets</link>
      <pubDate>Wed, 05 Aug 2009 17:35:24 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/using-cpuid-to-detect-the-presence-of-sse-41-and-sse-42-instruction-sets#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/using-cpuid-to-detect-the-presence-of-sse-41-and-sse-42-instruction-sets</guid>
      <category>Parallel Programming</category>
    </item>
  </channel></rss>