What's the key difference between ArBB1 and ArBB2 in monte-carlo

ken.kawamoto

Hi,

The finance/monte-carlo sample gave me the results below on my Fedora 13, Core2 Duo E7400 machine.

----------------------------------------
Version    Time(s)     Speed Up
C          1.440421    1.000
ArBB1      0.122123    11.795
ArBB2      0.517115    2.785
----------------------------------------

Overall performance of ArBB is excellent and surprising. However, I cannot see why ArBB1 is so much faster than ArBB2. If I understand correctly, the difference between the two is that ArBB1 uses the _for construct, whereas ArBB2 manipulates dense vectors containing the data for all trials at once. That doesn't seem like it should make such a big difference in performance. If anything, ArBB2 seems to have more opportunity for parallelization, since _for is not executed in parallel.

Can you give any hints about this difference? Or is there a way to see the instructions actually executed after JIT compilation?

Thanks,
Ken

Zhang Z (Intel)

Ken,

It took us some time to investigate this issue. Our conclusion is that this is a performance bug. There may be a bottleneck in the vectorized code generated by dynamic compilation, and it prevents the ArBB2 version from running faster than the ArBB1 version. We are working on the issue and will get it resolved.

Please note that this is an isolated case. In general, you should still expect vector operations to offer higher performance than _for loops. You are correct that _for loops are not executed in parallel. It is good practice to replace _for loops with vector operations or map functions whenever possible.
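For readers who have not seen the sample, the two formulations discussed in this thread can be sketched roughly as follows. This is plain standard C++, not ArBB code, and it does not reproduce the finance/monte-carlo sample: the Option struct, the function names, and the single-step European call payoff are all assumptions made for illustration. It only shows the structural difference between a per-trial loop and whole-array passes over all trials, and says nothing about why the JIT-compiled ArBB2 version was slower.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical option parameters; not taken from the ArBB sample.
struct Option {
    double s0, strike, rate, vol, years;
};

// Draw n standard normal samples from a fixed seed so both versions see the same input.
std::vector<double> draw_normals(std::size_t n, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> dist(0.0, 1.0);
    std::vector<double> z(n);
    for (double& x : z) x = dist(gen);
    return z;
}

// Loop formulation: one trial at a time, all stages fused in the loop body.
// Assumes z is non-empty.
double price_per_trial(const Option& o, const std::vector<double>& z) {
    const double drift = (o.rate - 0.5 * o.vol * o.vol) * o.years;
    const double diffusion = o.vol * std::sqrt(o.years);
    double sum = 0.0;
    for (double zi : z) {
        const double terminal = o.s0 * std::exp(drift + diffusion * zi);
        sum += std::max(terminal - o.strike, 0.0);
    }
    return std::exp(-o.rate * o.years) * sum / static_cast<double>(z.size());
}

// Whole-array formulation: each stage is applied to the data for all trials at once.
// Assumes z is non-empty.
double price_whole_array(const Option& o, const std::vector<double>& z) {
    const double drift = (o.rate - 0.5 * o.vol * o.vol) * o.years;
    const double diffusion = o.vol * std::sqrt(o.years);

    // Stage 1: terminal prices for every trial.
    std::vector<double> terminal(z.size());
    std::transform(z.begin(), z.end(), terminal.begin(),
                   [&](double zi) { return o.s0 * std::exp(drift + diffusion * zi); });

    // Stage 2: payoffs for every trial.
    std::vector<double> payoff(z.size());
    std::transform(terminal.begin(), terminal.end(), payoff.begin(),
                   [&](double st) { return std::max(st - o.strike, 0.0); });

    // Stage 3: reduce to the discounted mean.
    const double sum = std::accumulate(payoff.begin(), payoff.end(), 0.0);
    return std::exp(-o.rate * o.years) * sum / static_cast<double>(z.size());
}
```

Calling both functions with the same Option and the same vector from draw_normals should give matching prices up to floating-point rounding. Note that the whole-array version materializes an intermediate array per stage, so in a sketch like this its performance depends heavily on how well those passes are fused and vectorized.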
