Recently I ran into a workload that would not scale beyond a few cores. The application uses one thread per user, so in theory 8 concurrent users on an 8-core machine should fully utilize it, giving an 8x speedup over a sequential run.
It did not happen. At most two cores were utilized, and the speedup in query throughput was even smaller.
