pack() seems relatively expensive

Compared to

pack( source, mask )

... I get better timings for

split( source, select( mask, isize(1), isize(0) )).segment( usize(1) )

... i.e. it is faster to sort "in place" and keep only what you need. And it is much faster if you want both parts! For example

t1 = pack( source, mask );
t2 = pack( source, !mask);

... versus

t0 = split( source, select( mask, isize(1), isize(0) ));
t1 = t0.segment( usize(1) );
t2 = t0.segment( usize(0) );
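
To make the comparison concrete, here is a minimal sketch in plain standard C++ (not ArBB) of what the two approaches compute. The helper names pack_like and split_like are hypothetical, made up for this illustration, and the sketch only mirrors the semantics as described above: it says nothing about how ArBB implements pack() or split(), and it does not reproduce the timings.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for pack( source, mask ):
// keep only the elements whose mask entry is true, in their original order.
std::vector<int> pack_like(const std::vector<int>& source,
                           const std::vector<bool>& mask) {
    std::vector<int> out;
    for (std::size_t i = 0; i < source.size(); ++i) {
        if (mask[i]) out.push_back(source[i]);
    }
    return out;
}

// Hypothetical stand-in for split( ... ) followed by segment( 1 ) and segment( 0 ):
// a single pass routes every element into one of two parts.
// first  = elements where mask is true  (the "segment 1" part)
// second = elements where mask is false (the "segment 0" part)
std::pair<std::vector<int>, std::vector<int>>
split_like(const std::vector<int>& source, const std::vector<bool>& mask) {
    std::pair<std::vector<int>, std::vector<int>> parts;
    for (std::size_t i = 0; i < source.size(); ++i) {
        (mask[i] ? parts.first : parts.second).push_back(source[i]);
    }
    return parts;
}
```

With these helpers, getting both parts via pack_like takes two traversals of source (one with mask, one with the negated mask), while split_like produces both in one traversal. That is at least consistent with the "both parts" case showing the larger difference.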

Just FYI,
- paul


Thank you very much for figuring this out! This is really useful, especially your comment on "both parts". We are taking steps within this Beta program to continue making Intel ArBB a productive, portable, and performant programming model. Addressing experiences like this one is a top priority for us.

Unexpectedly, I'm seeing some commonality or analogy between ArBB and IBM punch card (aka Hollerith) computing from the 1950s, '60s, and '70s.
See: http://en.wikipedia.org/wiki/Unit_record_equipment

I find it poetic (and/or ironic) that leading-edge, scientific, efficient and relatively fine-grained parallel processing (ArBB) should share many of the same qualities with card sorters and tabulators. Is it time to review algorithms published in JACM?

- paul
