Finding Number of Cores/Threads


David DiLaura

Colleagues, While experimenting with upgrading some commercial engineering software for multi-core/thread operation, it has become clear that I need to detect the number of cores, NOT including hyperthreads. This is true if I use local threading with OpenMP or multiple images with Coarray. Is there a (simple) way to detect the number of cores only? Or at least recognize that hyperthreading is 'on' and account for it? I found some old material on this topic in this forum from 2006. Is there more recent information/methods? Clients will be using XP SP3 and Win7. David

Steve Lionel (Intel)

I think that the Windows API routine GetLogicalProcessorInformation will do what you want. There's even a C++ example there which should guide you.

Steve
Steve Lionel (Intel)

FWIW, I've started to translate that example into Fortran, so if you can wait a few days, I may have a worked example for you.

Steve
David DiLaura

Steve, A translation would be very, very helpful! I'm not proficient enough in C++ to do it without much painful stumbling around. BTW: what/who governs the use of cores and hyperthreads? That is, are hyperthreads used only after all physical cores have been used?  If I have 4 cores and limit my OpenMP code to 4 threads or limit my Coarray process to 4 images, are the cores used 'first'?  Or (if hyperthreading is on) are 2 cores and 2 hyperthreads used? David

Steve Lionel (Intel)

The OS schedules threads on the Hyperthread execution units as if they were separate cores. However, Windows 7 and newer versions of Windows do recognize the distinction and use it when deciding on which logical processor to schedule a thread on.  It knows that it is "cheaper" to keep a thread on the same physical core than to migrate a thread to another core.  But in general, a processor with two cores and two threads per core is treated as a four-core processor.  The idea of Hyperthreads is that there are typically unused functional units in each core that can be used by a second thread.  Initial implementations of this gave mixed results, but more recent Intel processors do a better job and keeping Hyperthreading on is usually a benefit. It should be turned off when you are running single-thread CPU-intensive applications.

Steve
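As a rough illustration of the point that a processor with two cores and two threads per core shows up as four logical processors, the sketch below compares a logical-processor count with a physical-core count to decide whether hyperthreading appears to be on. The helper names are hypothetical, and the physical-core count is assumed to come from another source such as GetLogicalProcessorInformation; Python's os.cpu_count() reports logical processors only.

```python
import os


def threads_per_core(logical: int, cores: int) -> int:
    """Return logical processors per physical core (hypothetical helper)."""
    if cores <= 0 or logical < cores:
        raise ValueError("expected 0 < cores <= logical")
    return logical // cores


def hyperthreading_appears_on(logical: int, cores: int) -> bool:
    """True when the OS exposes more logical processors than physical cores."""
    return threads_per_core(logical, cores) > 1


if __name__ == "__main__":
    # os.cpu_count() counts logical processors, so it includes hyperthreads.
    print("logical processors seen by the OS:", os.cpu_count())
    # Two cores with two threads each, the case described in the post above.
    print(threads_per_core(4, 2), hyperthreading_appears_on(4, 2))
```

With four cores and hyperthreading off, the two counts match and the second helper returns False, so a thread limit based on the core count would need no adjustment.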
Tim Prince

Intel MPI runtime recognizes hyperthreads and distributes processes across cores so that you should not normally need to adjust your BIOS settings to disable hyperthreading.
OpenMP by itself depends on your setting of affinity (e.g. KMP_AFFINITY environment variable) to make efficient use of multiple CPUs or hyperthreading.
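One way to act on this, sketched under assumptions: build the environment for an OpenMP program so that it runs one thread per physical core and lets the Intel OpenMP runtime spread those threads across cores. The helper name is hypothetical, and the value "granularity=core,scatter" is only one plausible KMP_AFFINITY setting, not a recommendation made in this thread; check it against the documentation for your compiler version.

```python
import os


def openmp_env(physical_cores: int, base_env=None) -> dict:
    """Return a copy of the environment with OpenMP thread settings added.

    Hypothetical helper: physical_cores is assumed to have been detected
    elsewhere, for example with GetLogicalProcessorInformation.
    """
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")
    env = dict(os.environ if base_env is None else base_env)
    # Standard OpenMP variable: one thread per physical core.
    env["OMP_NUM_THREADS"] = str(physical_cores)
    # Intel-runtime variable (assumed setting): bind at core granularity
    # and distribute threads across cores rather than packing them.
    env["KMP_AFFINITY"] = "granularity=core,scatter"
    return env


if __name__ == "__main__":
    env = openmp_env(4, base_env={})
    print(env)
    # The result could be passed as env=... to subprocess.run() when
    # launching the OpenMP executable.
```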

Steve Lionel (Intel)

Ok, here's a Fortran version of that MSDN sample. I think it looks a lot cleaner in Fortran!  Some things to note...

First, I use a Fortran 2008 feature, the module intrinsic function C_SIZEOF.  This is not implemented in the "2011" (12.1) compiler, but is in the next version (due out next week.)  You can substitute SIZEOF for now.  I wanted this to be as pure standard Fortran as I could make it.

Second, the declarations for the Windows APIs used are not in the supplied modules.  I have attached a module that defines these and will see about getting them into the product.

Please try it out and let me know how it works for you.  As an example, when I ran it on my "Nehalem" system, I got:

GetLogicalProcessorInformation results:
  Number of NUMA nodes: 1
  Number of physical processor packages: 1
  Number of processor cores: 4
  Number of logical processors: 8
  Number of processor L1/L2/L3 caches: 8/4/1


Steve
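For readers without the attachment, here is a minimal sketch of the tallying step only, not the posted Fortran program. It assumes each record returned by GetLogicalProcessorInformation has been reduced to a relationship code, a processor mask, and a cache level; the ProcRecord type and the sample data are hypothetical, made up to mirror the 4-core, 8-thread output shown above. The relationship codes follow the LOGICAL_PROCESSOR_RELATIONSHIP enumeration in the Windows headers.

```python
from collections import Counter
from typing import Iterable, NamedTuple

# LOGICAL_PROCESSOR_RELATIONSHIP values from the Windows SDK headers.
RELATION_PROCESSOR_CORE = 0
RELATION_NUMA_NODE = 1
RELATION_CACHE = 2
RELATION_PROCESSOR_PACKAGE = 3


class ProcRecord(NamedTuple):
    """Hypothetical stand-in for one SYSTEM_LOGICAL_PROCESSOR_INFORMATION entry."""
    relationship: int
    processor_mask: int   # bit i set => logical processor i belongs to this entry
    cache_level: int = 0  # only meaningful for RELATION_CACHE entries


def tally(records: Iterable[ProcRecord]) -> dict:
    """Count NUMA nodes, packages, cores, logical processors, and caches."""
    counts = {"numa_nodes": 0, "packages": 0, "cores": 0,
              "logical": 0, "caches": Counter()}
    for rec in records:
        if rec.relationship == RELATION_PROCESSOR_CORE:
            counts["cores"] += 1
            # Each set bit in a core's mask is one logical processor.
            counts["logical"] += bin(rec.processor_mask).count("1")
        elif rec.relationship == RELATION_NUMA_NODE:
            counts["numa_nodes"] += 1
        elif rec.relationship == RELATION_CACHE:
            counts["caches"][rec.cache_level] += 1
        elif rec.relationship == RELATION_PROCESSOR_PACKAGE:
            counts["packages"] += 1
    return counts


def sample_records() -> list:
    """Synthetic records for one package with 4 cores and 2 threads per core."""
    records = [ProcRecord(RELATION_NUMA_NODE, 0xFF),
               ProcRecord(RELATION_PROCESSOR_PACKAGE, 0xFF),
               ProcRecord(RELATION_CACHE, 0xFF, 3)]        # shared L3
    for core in range(4):
        mask = 0b11 << (2 * core)                          # two logical CPUs per core
        records.append(ProcRecord(RELATION_PROCESSOR_CORE, mask))
        records.append(ProcRecord(RELATION_CACHE, mask, 1))  # L1 data
        records.append(ProcRecord(RELATION_CACHE, mask, 1))  # L1 instruction
        records.append(ProcRecord(RELATION_CACHE, mask, 2))  # L2
    return records


if __name__ == "__main__":
    print(tally(sample_records()))
```

The number to use for sizing an OpenMP team or a set of coarray images would then be the "cores" entry rather than the "logical" one.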
David DiLaura

Steve, Many thanks; the code works fine. Running on my Dell laptop workstation, it reports:

GetLogicalProcessorInformation results:
  Number of NUMA nodes: 1
  Number of physical processor packages: 1
  Number of processor cores: 4
  Number of logical processors: 8
  Number of processor L1/L2/L3 caches: 8/4/1

I'm sure this will help a lot of folks at this forum. David

Pawel Matuszyk

I cannot download these files... Where could I find them?

Steve Lionel (Intel)

Looks as if they got lost in the transition to the new forum software. I'll repost them on Monday. The additions to KERNEL32 were put into 2013 Update 1 (I am pretty sure about this; if not, then Update 2), and I'll add the example to the Samples for Update 2.

Steve
Steve Lionel (Intel)

I have replaced the attachments. The "kernel32_additions" module is not needed as of update 1 (13.0.119) - I checked.

Steve
Andrew Smith

That works a treat. Attached is a subroutine version, which I am calling from Smalltalk.

Andrew Smith

Now it's attached!

Attachments:

Attachment: getprocessorinfo.f90 (5.66 KB)
Steve Lionel (Intel)

Very good. You pass the character length separately from Smalltalk?

Steve
Andrew Smith

Yes, I pass the integer array by reference and the character length by value. It worked the first time, so I have not seen any errors yet.

Steve Lionel (Intel)

Ok, just checking, as the code you posted requires the length to be passed separately by value. I assume that your Smalltalk is 32-bit only.

Steve
