Most benchmark service results apparently only use 1 core - how to fix?

We have an issue with the benchmark service: in most benchmark reports, our code was apparently run on only one core, even though it is parallelized (and does run in parallel on our own workstations).

We are using TBB's parallel_for with a grainsize chosen to divide the input range into exactly as many subranges as there are worker threads. All reports come back with 99% CPU usage. (The result is the same when we increase the thread:task ratio to 1:50.)

However, I also submitted the same code this morning, and that run came back with 1000% CPU usage.

Are multiple benchmarks run at the same time? And if so, why are we losing out against the others?



Okay, what follows is a bit embarrassing, but nonetheless: ^^

Everything works alright with the benchmarking service. My grainsize calculation caused the problem.

Our test data uses big input sequences and small reference sequences, and I hadn't yet written anything to swap the order when the sequence lengths are the other way around.

The human genome example uses a huge reference sequence and small input sequences, so other benchmarks probably do, too.

I've adapted the code and now it works just fine.
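
For anyone running into the same thing, here is a minimal sketch of that kind of swap. The post does not show the actual code, so everything here is an assumption for illustration: the `Job` struct and the helper names `choose_axes`, `grainsize_for`, and `chunk_count` are hypothetical. The idea is simply to always split the longer of the two sequences across threads and to derive the grainsize from that length, so the range cannot collapse into a single chunk when the benchmark data has the lengths the other way around.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical description of one job: the lengths of the two sequences.
struct Job {
    std::size_t input_len;
    std::size_t reference_len;
};

// Returns {parallel_len, serial_len}. The longer sequence becomes the range
// that is split across threads, whichever of the two it happens to be.
inline std::pair<std::size_t, std::size_t> choose_axes(const Job& job) {
    return {std::max(job.input_len, job.reference_len),
            std::min(job.input_len, job.reference_len)};
}

// Grainsize that yields at most `workers` chunks: ceiling division,
// clamped so that it is never zero.
inline std::size_t grainsize_for(std::size_t range_len, std::size_t workers) {
    if (workers == 0) workers = 1;
    return std::max<std::size_t>(1, (range_len + workers - 1) / workers);
}

// Number of consecutive chunks of size `grainsize` needed to cover the range.
inline std::size_t chunk_count(std::size_t range_len, std::size_t grainsize) {
    return (range_len + grainsize - 1) / grainsize;
}
```

The resulting length and grainsize would then be used to build the range handed to parallel_for. Note that this only computes an upper bound on chunk size; how many subranges the scheduler actually creates also depends on the partitioner in use.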

