MIC Core Frequency

DubitoCogito

I have been trying different methods of timing subroutines in order to measure performance. Since I am calling an MKL routine, I decided to try mkl_get_cpu_clocks() which, as the name suggests, returns elapsed CPU clock ticks. Of course, I also need to know the core frequency to convert the measurement to seconds, so I call the mkl_get_cpu_frequency() function. However, I noticed that although the initial speed reported is the expected 1.05 GHz, the value drops once the computational kernel begins executing. Here are the approximate frequencies reported for a given number of threads per core: 2 threads 1.0 GHz, 3 threads 0.9 GHz, and 4 threads 0.8 GHz. Are the core frequencies in fact being dynamically adjusted?
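A minimal sketch of the tick-to-seconds conversion described above. The helper name ticks_to_seconds is hypothetical and nothing here calls MKL; the tick count is assumed to come from a counter such as the one read by mkl_get_cpu_clocks(), and the frequency is assumed to be given in GHz, as the MKL frequency functions report it.

```c
#include <stdint.h>

/* Convert a tick count to seconds.
   ticks    : elapsed counter ticks (assumed to come from a tick counter
              such as the one mkl_get_cpu_clocks() reads)
   freq_ghz : rate at which that counter advances, in GHz
   Hypothetical helper for illustration; not part of MKL. */
static double ticks_to_seconds(uint64_t ticks, double freq_ghz)
{
    return (double)ticks / (freq_ghz * 1.0e9);
}
```

For example, 1,050,000,000 ticks at 1.05 GHz is 1 second, but the same tick count divided by a reported 0.8 GHz gives about 1.31 seconds. The conversion is only as good as the frequency value, which is why the changing reading matters.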

Sumedh Naik (Intel)

We have noticed this type of behaviour as well and are currently investigating the cause. Will get back to you when I have some more information. 

Sumedh Naik (Intel)

There are two possible reasons for this behaviour:

1) Power Overload

The Intel Xeon Phi coprocessor has two power thresholds that protect the coprocessor and the system from a power overload. The coprocessor constantly measures its power draw. If the power exceeds 105% of design (the levels are programmable), the coprocessor throttles down by about 100 MHz. If it detects power over 125%, it goes to the minimum supported frequency, which is 800 MHz. What you are seeing could be this; however, there aren't any known workloads that will take a coprocessor to 125% of design power levels.

2) Thermal heating

It is more likely that the coprocessor is overheating and when the junction temperature hits 104 degrees C, the frequency drops to 800 MHz in an attempt to 1) preserve data by not crashing, and 2) cool by running slower.
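The two protection rules above can be sketched as a small decision function. This is only a model of the description in this post, not the actual firmware logic; the function name throttled_mhz and its parameters are hypothetical, and the thresholds are the default values quoted above (the post notes they are programmable).

```c
/* Hypothetical model of the protection rules described in this post.
   nominal_mhz : normal operating frequency
   power_pct   : measured power as a percentage of design power
   junction_c  : junction temperature in degrees C
   Returns the frequency the rules would select. */
static double throttled_mhz(double nominal_mhz, double power_pct,
                            double junction_c)
{
    const double min_mhz = 800.0;          /* minimum supported frequency */

    if (junction_c >= 104.0 || power_pct > 125.0)
        return min_mhz;                    /* thermal event or major overload */
    if (power_pct > 105.0)
        return nominal_mhz - 100.0;        /* mild overload: about 100 MHz lower */
    return nominal_mhz;                    /* no throttling */
}
```

With a nominal 1050 MHz, this model yields 950 MHz for a mild overload and 800 MHz for a thermal event or major overload.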

 I'll try to get back with more information on how to control this behaviour. 

-Sumedh

Sumedh Naik (Intel)

It would help us root cause the problem if you could post the output of /opt/intel/mic/bin/micinfo and the version of your compiler. 

DubitoCogito

I am using Intel Composer XE v2013.1.117 which includes ICC v13.0.1 Build 20121010. I have attached a copy of the output from micinfo.

Attachment: mic-info.txt (3.93 KB)
Sumedh Naik (Intel)

After talking to the experts, this is what I learnt: 

The 5000 series coprocessor you have has two power options. If you plug in both the 2x4 and the 2x3 power connectors, the coprocessor can run at up to 245W and should be able to run any workload without throttling. If only the 2x4 power connector is plugged in, the coprocessor configures itself for a maximum of 225W, and one or two highly tuned benchmarks can cause it to throttle. Also, if the 5000 series coprocessor begins to power throttle, the speed will reduce to 947 MHz. It will then drop to 632 MHz if there is a thermal event or a major power overload. However, your numbers don't match up with these expectations.
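To make the mismatch concrete, here is a rough sketch that checks an observed frequency against the operating points mentioned in this thread: the nominal 1.05 GHz from the original question, plus the 947 MHz and 632 MHz throttle points. The names k_states_mhz and match_state, and the idea of a tolerance, are assumptions made for illustration.

```c
#include <stddef.h>

/* Operating points taken from this thread, in MHz:
   nominal, power throttle, thermal event or major power overload. */
static const double k_states_mhz[] = { 1050.0, 947.0, 632.0 };

/* Return the index of the first operating point within tol_mhz of the
   observed frequency, or -1 if none matches. Hypothetical helper. */
static int match_state(double observed_mhz, double tol_mhz)
{
    size_t n = sizeof k_states_mhz / sizeof k_states_mhz[0];
    for (size_t i = 0; i < n; ++i) {
        double diff = observed_mhz - k_states_mhz[i];
        if (diff < 0.0)
            diff = -diff;                  /* absolute difference */
        if (diff <= tol_mhz)
            return (int)i;
    }
    return -1;
}
```

A reading of roughly 800 MHz with a 10 MHz tolerance matches none of the three points, which is the sense in which the reported numbers do not fit the expected throttling behaviour.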

Could you please verify that you are checking the frequency on the coprocessor? Also, could you try another method to check the speed: When logged into the coprocessor, cat /proc/cpuinfo 
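If you would rather read the value from a program than by eye, the check can be sketched as below. It assumes that /proc/cpuinfo on the coprocessor exposes a "cpu MHz" field in the usual x86 Linux layout; the helper names parse_cpu_mhz and first_cpu_mhz are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* If line looks like "cpu MHz : 1052.630", store the value in *mhz_out
   and return 1. Otherwise return 0 and leave *mhz_out untouched. */
static int parse_cpu_mhz(const char *line, double *mhz_out)
{
    if (strncmp(line, "cpu MHz", 7) != 0)
        return 0;
    const char *colon = strchr(line, ':');
    if (colon == NULL)
        return 0;
    char *end;
    double value = strtod(colon + 1, &end);
    if (end == colon + 1)                  /* no number after the colon */
        return 0;
    *mhz_out = value;
    return 1;
}

/* Return the first "cpu MHz" value found in the file at path,
   or -1.0 if the file cannot be opened or has no such field. */
static double first_cpu_mhz(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1.0;
    char line[256];
    double mhz = -1.0;
    while (fgets(line, sizeof line, f) != NULL) {
        if (parse_cpu_mhz(line, &mhz))
            break;
    }
    fclose(f);
    return mhz;
}
```

Calling first_cpu_mhz("/proc/cpuinfo") from code running natively on the coprocessor would give the first core's reported frequency. Sampling it during the run, rather than only afterwards, is the more useful comparison.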

DubitoCogito

I am using the mkl_get_cpu_frequency() function to query the current CPU frequency, which ranges from 0.80 to 0.83 GHz when running DGEMM with 240 threads. Before running the solver loop the various available MKL functions reported the following:

   1.05 (mkl_get_max_cpu_frequency) : 1.05 (mkl_get_cpu_frequency) : 1.05 (mkl_get_clocks_frequency)

I then called two of the functions after each loop iteration and got something like the following:

   0.81 (mkl_get_cpu_frequency) : 1.05 (mkl_get_clocks_frequency)

I also checked /proc/cpuinfo immediately after the run and it showed the expected frequency of 1052.63 MHz for all 60 cores. Am I misusing mkl_get_cpu_frequency() or is it returning the wrong number?

Thank you for your help.

DubitoCogito

As an additional sanity check I built cpufrequtils for the MIC and it also reported an operating frequency of 1.05 GHz for all cores.

Sumedh Naik (Intel)

MKL provides a different function, mkl_get_clocks_frequency(), which returns a constant value even if the CPU frequency changes. This function queries the constant-rate time stamp counter. Could you try mkl_get_clocks_frequency() instead? You can find the detailed documentation here: http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mklman/hh_goto.htm#GUID-B87943A0-5706-457E-AFE1-0E35E8E6376B.htm

You can also use the MKL dsecnd() function (which returns elapsed time in seconds) for timing. This is more convenient because it does the conversion from ticks to seconds for you. Note that dsecnd() calls mkl_get_clocks_frequency() internally.
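A rough sketch of the timing pattern, written so it compiles without MKL. The clock is passed in as a function pointer with the same shape as dsecnd(), that is, double (*)(void); time_region, fake_clock and no_work are hypothetical names, and fake_clock is only a stand-in so the example is self-contained.

```c
typedef double (*clock_fn)(void);          /* same shape as dsecnd() */
typedef void (*work_fn)(void);

/* Average seconds per call of work() over reps repetitions, measured
   with the supplied clock. Returns 0.0 if reps is not positive. */
static double time_region(clock_fn now, work_fn work, int reps)
{
    if (reps <= 0)
        return 0.0;
    (void)now();                           /* warm-up call, result discarded */
    double t0 = now();
    for (int i = 0; i < reps; ++i)
        work();
    return (now() - t0) / reps;
}

/* Stand-in clock for the sketch: advances 0.5 s on every call. */
static double fake_clock(void)
{
    static double t = 0.0;
    t += 0.5;
    return t;
}

/* Stand-in workload; a real run would call the kernel being timed. */
static void no_work(void)
{
}
```

In a program linked against MKL, passing dsecnd as the clock and a wrapper around the DGEMM call as the workload would give seconds directly, with no need to query a frequency yourself. The warm-up call is a precaution in case the first timer call is slower than later ones.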

Sumedh Naik (Intel)

I just got word from the MKL developers. This is a bug in mkl_get_cpu_frequency(). The developers are working on resolving this issue.

DubitoCogito

Thanks for your help and the update.
