Detecting Nehalem CPU

dark_shikari
October 13, 2008 8:48 PM PDT
#4 Reply to #3

Hi

I assume your interest in incorporating microarchitecture-specific optimization is the motivation for seeking family/model information. If I understand correctly, you might be pursuing SSE4 code in your encoding application, with different code paths optimized for the Penryn and Nehalem microarchitectures.

In Appendix C of the Optimization manual, Table C-1 lists SIMD instruction set support for different processor families. However, Table C-1 covers only processor families that have launched. The Nehalem family will be rolling out soon, so we will be updating our documentation in the near future, but at the current time that still falls under unreleased product information.

Please note that unreleased product detail is generally not covered in public docs, but it may be available through a non-disclosure channel. The channel for access to unreleased product information is the same one you used to obtain your Nehalem prototype system.

Additionally, some newer processor families may span more than one value of the "enhanced model"/"model" encoding. For example, the six-core Dunnington processor, known as the Intel Xeon processor 7400 series, has a different "enhanced model"/"model" encoding than the rest of the Penryn family (the Intel Xeon processor 5400 series and several product lines in the Intel Core 2 processor family). The Penryn family has a signature of DisplayFamily = 6 and DisplayModel = 17H (where the enhanced model encoding is 1 and the model encoding is 7). Dunnington has the same DisplayFamily encoding, but its DisplayModel encoding is 1DH.

For the Nehalem processor family, which will also span multiple values of the DisplayFamily/DisplayModel encodings, please follow up through your contact for the Nehalem prototype system. The same channel can also provide you with access to tuning information on SSE4.2 and the Nehalem microarchitecture.
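For reference, the DisplayFamily/DisplayModel arithmetic described in the quoted reply can be sketched in C roughly as follows. This is only an illustration of how the "enhanced model" and "model" fields combine, following the field layout of CPUID leaf 1 EAX; the function names are hypothetical, and the two signature values in the comments (0x10676 and 0x106D1) are example Penryn and Dunnington signatures, not an exhaustive list.

```c
#include <stdint.h>

/* Hypothetical helpers: decode the processor signature returned in EAX
 * by CPUID leaf 1. Field layout:
 *   bits  7:4   model
 *   bits 11:8   family
 *   bits 19:16  extended ("enhanced") model
 *   bits 27:20  extended family
 */
static unsigned display_family(uint32_t sig)
{
    unsigned family     = (sig >> 8)  & 0xFu;
    unsigned ext_family = (sig >> 20) & 0xFFu;

    /* The extended family is only added when the base family is 0FH. */
    return (family == 0xFu) ? family + ext_family : family;
}

static unsigned display_model(uint32_t sig)
{
    unsigned family    = (sig >> 8)  & 0xFu;
    unsigned model     = (sig >> 4)  & 0xFu;
    unsigned ext_model = (sig >> 16) & 0xFu;

    /* The extended model is only used for family 06H and 0FH. */
    if (family == 0x6u || family == 0xFu)
        return (ext_model << 4) + model;
    return model;
}

/* Example signatures (assumed for illustration):
 *   0x10676 -> DisplayFamily 6, DisplayModel 17H (Penryn)
 *   0x106D1 -> DisplayFamily 6, DisplayModel 1DH (Dunnington)
 */
```

With this decoding, enhanced model 1 and model 7 combine to 17H, and enhanced model 1 and model DH combine to 1DH, matching the values given in the quote.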

 

Thanks for the detailed information! I will potentially be doing some SSE4 optimizations, and I will also be changing which assembly functions are loaded to account for the different load latencies and instruction timings on Nehalem.

It sounds like SSE4.2 detection is the best way to go here, since, as you said, there will be a large number of models in the Nehalem series with rather complex model/family encodings.
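A minimal sketch of what that check might look like, assuming GCC or Clang on x86 (the `<cpuid.h>` header and `__get_cpuid` are compiler-specific, and the function names here are made up for illustration). SSE4.1 and SSE4.2 are reported in ECX bits 19 and 20 of CPUID leaf 1. Note that this is feature detection rather than microarchitecture identification: it separates Nehalem from Penryn, which has SSE4.1 only, but any later processor with SSE4.2 will take the same path.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define HAVE_GNU_CPUID 1
#else
#define HAVE_GNU_CPUID 0
#endif

/* Hypothetical helpers: test feature bits in ECX from CPUID leaf 1. */
static int ecx_has_sse41(uint32_t ecx) { return (int)((ecx >> 19) & 1u); }
static int ecx_has_sse42(uint32_t ecx) { return (int)((ecx >> 20) & 1u); }

/* Returns 1 if the running CPU reports SSE4.2, 0 otherwise
 * (including when CPUID cannot be queried on this build target). */
static int cpu_has_sse42(void)
{
#if HAVE_GNU_CPUID
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 if the requested leaf is unsupported. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return ecx_has_sse42(ecx);
#else
    return 0;
#endif
}
```

An encoder could call `cpu_has_sse42()` once at startup and use the result to choose between the Penryn-tuned and Nehalem-tuned function tables.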

I asked here because information on the prototype system has been rather slow in coming through my usual channel, most likely because I have to go through a number of people to reach the person who has direct contact with Intel (organizational issues on my side, I suspect).

You mentioned "tuning information"; what kind of information falls under this category?  I have already done in-depth analysis using mubench (a full run of --pairs for analysis of execution unit relationships, among other things); what other information is available under that category--uop breakdowns?


