Iterating over pages

Hello all,
I am moving forward with my first test application. I finally got it running, but my test benchmark is still too slow.
The current code runs in 150 ms (measured with arbb::scoped_timer, using an arbb::auto_closure created outside the timed scope), and I expected it to run at least three times faster (~40 ms), based on other "high efficiency" implementations.
My guess is that I am still using ArBB incorrectly.

My current test code looks like this:

void compute_cost_volume(image_data_t input_a, image_data_t input_b, cost_volume_t &cost_volume)
{
    _for(arbb::usize d=0, d < cost_volume.num_pages(), d+=1 )
    {
        cost_slice_2d_t cost_page = cost_volume.page(d);
        image_data_t shifted_b = arbb::shift_col(input_b, d);
        arbb::map(compute_cost)(input_a, shifted_b, cost_page);
        cost_volume = arbb::replace_page(cost_volume, d, cost_page);
    } _end_for;

    return;
}

where compute_cost is a small operation, image_data_t is a 2D arbb::dense container, and cost_volume_t is a 3D arbb::dense container.

My guess is that the "replace_page trick" is a bad idea. So my questions are:

1) How can I measure what is killing my performance in this example? What is the suggested profiling procedure to follow (under Linux)?

2) I first tried to access cost_page by reference, and then to replace the _for loop with an arbb::map, but none of these attempts compiled.
The access to each cost_volume page is fully parallel, so I would expect to be able to formulate it as some kind of map construct (or similar). What is the proper way of doing this?
In boost::multi_array it is possible to define all kinds of views and iterators over a given data volume. I could not find the equivalent in arbb, so I guess another strategy is to be used.

I guess this example would shed some light on the recurring "to loop or not to loop inside arbb::call code" discussion.

Thanks for the answers and the community support.
Best regards,
rodrigob.

Zhang Z (Intel)

Hi Rodrigob,

Profiling tools for ArBB are still being worked on. But you've already guessed correctly that "replace_page", when used inside a _for loop on large 3D containers, is a performance killer.

The _for loop is not a parallel loop. Its iterations run sequentially. All the pages in your example are independent and can be updated in parallel, but a _for loop does not exploit that parallelism.
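To make the cost of the loop concrete, here is a minimal sketch in plain standard C++ (not ArBB). It assumes, purely for illustration, that a value-returning replace operation produces a fresh copy of the whole volume on every call; the thread does not confirm how arbb::replace_page is implemented. The Volume type and all function names below are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Plain C++ model of a 3D volume stored page by page in one flat buffer.
// This is NOT an ArBB type; it only illustrates the copying cost.
struct Volume {
    std::size_t pages, rows, cols;
    std::vector<float> data;
    Volume(std::size_t p, std::size_t r, std::size_t c)
        : pages(p), rows(r), cols(c), data(p * r * c, 0.0f) {}
    std::size_t page_size() const { return rows * cols; }
};

// Assumed value semantics: copy the whole volume, then overwrite page d.
Volume replace_page_by_copy(const Volume& v, std::size_t d,
                            const std::vector<float>& page, std::size_t& writes) {
    assert(d < v.pages && page.size() == v.page_size());
    Volume out = v;                      // full copy of every page
    writes += out.data.size();
    std::copy(page.begin(), page.end(),
              out.data.begin() + static_cast<std::ptrdiff_t>(d * v.page_size()));
    writes += page.size();
    return out;
}

// Alternative: write page d directly into the existing buffer.
void write_page_in_place(Volume& v, std::size_t d,
                         const std::vector<float>& page, std::size_t& writes) {
    assert(d < v.pages && page.size() == v.page_size());
    std::copy(page.begin(), page.end(),
              v.data.begin() + static_cast<std::ptrdiff_t>(d * v.page_size()));
    writes += page.size();
}

// Count element writes when every page is replaced through a full copy.
std::size_t count_writes_by_copy(std::size_t pages, std::size_t rows, std::size_t cols) {
    Volume v(pages, rows, cols);
    const std::vector<float> page(v.page_size(), 1.0f);
    std::size_t writes = 0;
    for (std::size_t d = 0; d < pages; ++d) v = replace_page_by_copy(v, d, page, writes);
    return writes;
}

// Count element writes when every page is written in place.
std::size_t count_writes_in_place(std::size_t pages, std::size_t rows, std::size_t cols) {
    Volume v(pages, rows, cols);
    const std::vector<float> page(v.page_size(), 1.0f);
    std::size_t writes = 0;
    for (std::size_t d = 0; d < pages; ++d) write_page_in_place(v, d, page, writes);
    return writes;
}
```

Under that assumption, a volume of 4 pages of 2 x 3 elements costs 120 element writes through the copying path and 24 in place, and the gap grows with the number of pages.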

Currently, arbb::map only supports map-over-elements. In other words, arbb::map only takes parameters of scalar types (or array types). So unless all individual elements in the 3D dense can be updated independently, arbb::map does not help in this case. If ArBB provided something that could do map-over-pages, things would be much simpler here. In fact, other customers have asked for this before. Your example once again makes us think that map-over-pages (and similarly map-over-cols and map-over-rows) would be good things to have. We've been taking notes and will consider providing such functionality in the future.
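As an illustration of what "map-over-elements" means for a 3D volume, the following plain C++ sketch (not ArBB) computes each output element from a single flat index. It assumes that compute_cost is an absolute difference and that shifting by d moves columns right with zero fill; neither is stated in the thread, and cost_element and cost_volume_elementwise are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Compute one element of the cost volume from its flat index.
// Layout assumption: page-major, then row-major within each page.
float cost_element(std::size_t flat,
                   const std::vector<float>& a, const std::vector<float>& b,
                   std::size_t rows, std::size_t cols) {
    const std::size_t c = flat % cols;
    const std::size_t r = (flat / cols) % rows;
    const std::size_t d = flat / (rows * cols);   // page index, used as the shift
    // Assumed shift semantics: b moved right by d columns, zero where undefined.
    const float shifted = (c >= d) ? b[r * cols + (c - d)] : 0.0f;
    // Assumed compute_cost: absolute difference.
    return std::fabs(a[r * cols + c] - shifted);
}

// Every element depends only on the inputs, so each call is independent.
std::vector<float> cost_volume_elementwise(const std::vector<float>& a,
                                           const std::vector<float>& b,
                                           std::size_t rows, std::size_t cols,
                                           std::size_t num_pages) {
    assert(a.size() == rows * cols && b.size() == rows * cols);
    std::vector<float> volume(num_pages * rows * cols);
    for (std::size_t i = 0; i < volume.size(); ++i)
        volume[i] = cost_element(i, a, b, rows, cols);
    return volume;
}
```

Whether an equivalent per-element kernel can be expressed with arbb::map depends on what the kernel is allowed to read, which the thread does not settle.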

I'll discuss with other ArBB engineers to see what we can do at this time to improve the performance of this particular case. I'll keep you updated.

Thanks,
Zhang

Quoting Zhang Z (Intel)
Currently, arbb::map only supports map-over-elements. In other words, arbb::map only takes parameters of scalar types (or array types).

I'll discuss with other ArBB engineers to see what we can do at this time to improve the performance of this particular case. I'll keep you updated.

Any update on this?

Regards,
rodrigob.

Zhang Z (Intel)

Rodrigob,

I am exploring the idea of using a 1D index container. Each index in the container represents a page in the 3D dense. Then we can map across these indices. Inside the map kernel function, we use the current index to figure out which page it corresponds to, and then update that page.
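The index-container idea can be sketched in plain standard C++ (not ArBB) as follows. This is only a model of the approach: the kernel receives a page index and writes into its own slice of one flat buffer, so no page replacement step is needed. It assumes compute_cost is an absolute difference and that the shift moves columns right with zero fill; all names here are hypothetical, and whether an ArBB map kernel may write to a page this way is not established in the thread.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

using Image = std::vector<float>;   // row-major, rows * cols elements

// Hypothetical stand-in for the thread's compute_cost.
inline float compute_cost_model(float a, float b) { return std::fabs(a - b); }

// Kernel for one page index d: reads both inputs, writes only page d.
void cost_page_kernel(std::size_t d, const Image& a, const Image& b,
                      std::size_t rows, std::size_t cols,
                      std::vector<float>& volume) {
    float* page = volume.data() + d * rows * cols;
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            // Assumed shift semantics: b moved right by d columns, zero fill.
            const float shifted = (c >= d) ? b[r * cols + (c - d)] : 0.0f;
            page[r * cols + c] = compute_cost_model(a[r * cols + c], shifted);
        }
    }
}

std::vector<float> cost_volume_by_page_index(const Image& a, const Image& b,
                                             std::size_t rows, std::size_t cols,
                                             std::size_t num_pages) {
    assert(a.size() == rows * cols && b.size() == rows * cols);
    std::vector<float> volume(num_pages * rows * cols);

    // The 1D index container: one entry per page.
    std::vector<std::size_t> page_indices(num_pages);
    std::iota(page_indices.begin(), page_indices.end(), std::size_t{0});

    // Pages do not overlap, so these iterations are independent and could be
    // handed to a parallel runtime; here they simply run in order.
    for (std::size_t d : page_indices)
        cost_page_kernel(d, a, b, rows, cols, volume);
    return volume;
}
```

The design point is that each index owns a disjoint slice of the output, which is what makes the iterations safe to run concurrently.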

I'll try to post a code sample later on.

Thanks,
Zhang

I had thought of something similar. This will get rid of the _for loop, but how can we avoid the "performance killer" arbb::replace_page? What is the alternative method to update a page of the cost volume without calling arbb::replace_page?

Regards,
rodrigob.
