machine balance in turboboost

Hi all,
This may be a stupid question, but please tell me: when the core frequency increases in Turbo Boost mode, does the uncore frequency (or perhaps the memory controller frequency, or simply the memory bandwidth) increase too?
In other words, does the balance between CPU performance and memory subsystem bandwidth get worse as the core frequency rises, or does the memory side scale with the core frequency?

If it scales, what makes it scale?

Thanks for your time.

Not a stupid question. Turbo Boost is implemented as a change in the clock speed multiplier and doesn't directly affect memory performance. So Turbo Boost is likely to be disabled where performance is memory limited and one doesn't wish to waste power. On the other hand, Turbo Boost could save power in the long run by allowing short bursts of enhanced single-thread performance while cutting back on idle power consumption.

You may have independent control over RAM clock speed in your BIOS setup, which usually defaults to an auto setting which picks the best all-around clock speed for the installed RAM, and this doesn't change with turbo boost.
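
To make the point above concrete, here is a rough sketch of how machine balance (peak FLOP/s divided by memory bandwidth) moves when only the core clock changes. The core count, FLOPs per cycle, and bandwidth numbers are illustrative assumptions, not measurements from this thread, and the function name is made up for the example.

```python
def machine_balance(freq_ghz, cores, flops_per_cycle, bandwidth_gbs):
    """Peak FLOPs the cores can issue per byte the memory system can deliver."""
    peak_gflops = freq_ghz * cores * flops_per_cycle
    return peak_gflops / bandwidth_gbs


# Assumed, illustrative values (not taken from the thread).
CORES = 8
FLOPS_PER_CYCLE = 8
BANDWIDTH_GBS = 50.0  # held constant: the clock multiplier does not touch DRAM

if __name__ == "__main__":
    for freq in (2.0, 2.5, 3.0):
        balance = machine_balance(freq, CORES, FLOPS_PER_CYCLE, BANDWIDTH_GBS)
        print(f"{freq:.1f} GHz: {balance:.2f} FLOP/byte")
```

If the bandwidth really stays fixed, the FLOP/byte figure grows in proportion to the core clock, which is the "worsening balance" case from the question.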

Thanks for the interesting answer.

I had long wanted to clear up this question for myself.

I think the internal implementation of Turbo Boost (at the hardware and microcode level) can use performance counter data to monitor a thread's performance. When a thread is CPU bound rather than memory bound, Turbo Boost can probably raise the CPU frequency for a short period of time.

That is clear. In real life, however, CPU-bound applications are probably no more common than applications suitable for accelerators (fine-grained parallelism).

In my opinion, the bulk of applications are memory bound.

For parallel applications, at least partially memory bound performance is expected, as we can add CPU parallelism less expensively than memory parallelism.

The details of how the uncore behaves under Turbo Boost are product-dependent.

On my Xeon E5-2680 processors I ran a variety of tests to measure the uncore frequency as a function of the frequency of the cores.  From these tests, it appears that the frequency of the uncore is set to match the frequency of the fastest core, except that the uncore only runs at Turbo frequencies when *all* of the cores are running at the maximum Turbo frequency.

The DRAM frequency is not changed in any of these cases, but the memory throughput decreases as the uncore frequency decreases.

On the other hand, I don't see a significant decrease in memory throughput on my Xeon E3-1270 processors as I decrease the CPU core frequencies.  My current interpretation is that the uncore frequency remains fixed on that processor when I change the core frequencies (though I have not checked this explicitly).

John D. McCalpin, PhD
"Dr. Bandwidth"

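One way to read the E5-2680 observation above is as a small rule, sketched below. Capping the uncore at the nominal frequency whenever not every core is at maximum Turbo is an interpretation of that description, not a documented algorithm, and the function and parameter names are hypothetical.

```python
def modeled_uncore_mhz(core_mhz, nominal_mhz, max_turbo_mhz):
    """Toy model: uncore follows the fastest core, but only enters the
    Turbo range when every core is at the maximum Turbo frequency."""
    fastest = max(core_mhz)
    if all(f >= max_turbo_mhz for f in core_mhz):
        return max_turbo_mhz
    # Assumption: otherwise the uncore never exceeds the nominal frequency.
    return min(fastest, nominal_mhz)


if __name__ == "__main__":
    # Example frequencies in MHz; chosen for illustration only.
    print(modeled_uncore_mhz([1200, 1200], 2700, 3100))
    print(modeled_uncore_mhz([3100, 2000], 2700, 3100))
    print(modeled_uncore_mhz([3100, 3100], 2700, 3100))
```

Under this reading, lowering the core frequencies also lowers the uncore frequency, which would fit the reported drop in memory throughput even though the DRAM clock is unchanged.
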
I.e., the balance between memory performance and CPU performance does nevertheless scale with increasing frequency under Turbo Boost?

>>> thread is not memory bound and it is cpu bound in such a case >>>

It should have been written that application performance scales linearly with the increase in CPU frequency.

It's clear.

The interesting case is when the application is memory bound, or a mixture of CPU bound and memory bound.

For example, if a program uses objects in place of primitive data types and performs simple mathematical calculations on them, it will spend more time dereferencing pointers and walking heap-allocated objects than performing the math computations.

Hello all

Maybe this no longer makes much sense, but I would like to share my results and close this question for myself.

I got access to a server that supports Turbo Boost.

Benchmarks: Linpack, the NAS Parallel Benchmarks, and STREAM. I checked clock speeds from 1.2 to 3.1 GHz (E5-2680) in 100 MHz increments. The frequency was set through /sys/devices/system/cpu/cpu*/cpufreq/ .
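
For anyone repeating this, a minimal sketch of reading back the per-CPU frequency from that sysfs directory may be useful as a sanity check before each run. The post does not say which files were written to set the frequency, so this only reads `scaling_cur_freq`; the `base` parameter is an addition so the function can be pointed at another directory.

```python
import glob
import os


def read_cpufreq(base="/sys/devices/system/cpu", field="scaling_cur_freq"):
    """Return {cpu_name: value_in_kHz} for every CPU exposing the field."""
    result = {}
    pattern = os.path.join(base, "cpu[0-9]*", "cpufreq", field)
    for path in sorted(glob.glob(pattern)):
        cpu = path.split(os.sep)[-3]  # .../cpuN/cpufreq/<field>
        try:
            with open(path) as fh:
                result[cpu] = int(fh.read().strip())
        except (OSError, ValueError):
            continue  # CPU offline or file unreadable: skip it
    return result


if __name__ == "__main__":
    for cpu, khz in read_cpufreq().items():
        print(f"{cpu}: {khz / 1000:.0f} MHz")
```

The cpufreq files report values in kHz. Passing `field="scaling_max_freq"` or `field="scaling_min_freq"` shows the limits currently in force.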

The benchmarks show performance scaling linearly with the increase in clock frequency. The performance step (performance at Freq2 / performance at Freq1) is larger at low frequencies (around 1.2 GHz), probably because the prefetcher is more effective there. For a given frequency increase, the performance step is about the same across most of the benchmarks.

All of this may indicate that all (or most) of the processor and memory subsystems scale the same way as the frequency increases (from 1.2 to 3.1 GHz).
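
The performance step described above can be computed like this. It is only a sketch: the sample numbers are hypothetical placeholders, not the measured results, and the "efficiency" column (performance step divided by frequency step) is an extra added here to make linear scaling easy to spot.

```python
def scaling_steps(samples):
    """samples: list of (freq_ghz, perf) pairs sorted by frequency.

    Returns (f1, f2, perf_step, freq_step, efficiency) for each adjacent
    pair, where efficiency == 1.0 means performance scaled exactly with
    the clock and < 1.0 means something else was the limit.
    """
    rows = []
    for (f1, p1), (f2, p2) in zip(samples, samples[1:]):
        perf_step = p2 / p1
        freq_step = f2 / f1
        rows.append((f1, f2, perf_step, freq_step, perf_step / freq_step))
    return rows


if __name__ == "__main__":
    # Hypothetical (frequency in GHz, benchmark score) pairs.
    data = [(1.2, 10.0), (1.3, 10.9), (1.4, 11.7)]
    for f1, f2, ps, fs, eff in scaling_steps(data):
        print(f"{f1:.1f} -> {f2:.1f} GHz: perf x{ps:.3f}, freq x{fs:.3f}, eff {eff:.3f}")
```

An efficiency that stays near 1.0 for STREAM as well as Linpack would support the conclusion that the memory side is scaling along with the cores.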

In conclusion, the thing to aim for is steady Turbo Boost operation.

Thanks all


To check the current frequency, use the turbostat utility and read the MSRs.
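
As a rough illustration of the MSR route: on Linux with the msr kernel module loaded, the registers can be read as root from /dev/cpu/N/msr, and the average running frequency over an interval is approximately the base frequency times the ratio of the IA32_APERF and IA32_MPERF deltas. The helper names and the `path_template` parameter below are made up for this sketch, and `base_mhz` has to be supplied by the caller.

```python
import os
import struct

IA32_MPERF = 0xE7  # counts at a fixed reference rate while the core is active
IA32_APERF = 0xE8  # counts at the actual core clock while the core is active


def read_msr(cpu, reg, path_template="/dev/cpu/{}/msr"):
    """Read one 64-bit MSR; the register number is used as the file offset."""
    fd = os.open(path_template.format(cpu), os.O_RDONLY)
    try:
        raw = os.pread(fd, 8, reg)
    finally:
        os.close(fd)
    return struct.unpack("<Q", raw)[0]


def effective_mhz(base_mhz, delta_aperf, delta_mperf):
    """Average frequency over the interval in which the deltas were taken."""
    return base_mhz * delta_aperf / delta_mperf
```

In practice one would read both registers, sleep for a second or so while the benchmark runs, read them again, and pass the differences to `effective_mhz`.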
