Subset of dense container

I have written a function that should operate on only a subset of a dense container.
To do that, I just added an _if condition to it.
But it takes a little more time than the non-optimized "linear" (serial) version.
Is there any way to optimize or parallelize this function?

void computemap(f32& result,f32 c1, f32 u1){
	// i is the linear index of the element this invocation works on
	usize i;
	position(i);

	// update the element only if its index is off the border in every dimension
	_if ( i / x_shift >= 1 && i / x_shift < (padded_y_size - 1)
		&& (i % x_shift) > y_shift && (i % x_shift) < y_shift * (padded_y_size - 1)
		&& (i % x_shift) % y_shift >= 1 && (i % x_shift) % y_shift < (padded_y_size - 1))
	{
		result += c1 * u1;
	}_end_if
}

void compute(dense& result, f32 c1, dense u1){
	map(computemap)(result,c1,u1);
}

Hi, I have a few pieces of advice.

First, take a look at the many Knowledge Base articles we have for code tips.

Next, take a look at my article about Things to Consider.

It is understandable that your code would be slower than the serial version, for the following reasons:

- The _if is inlined and is not parallelized.
- There is overhead in your first invocation of the ArBB function due to the JIT compile.
- Even for runs 2 through X, this ArBB function really isn't doing enough computation to justify scaling across multiple cores.

I just asked one of my colleagues about specific code tips for your situation, and luckily we have an entire presentation you can look through for exact code snippets. Go to http://software.intel.com/file/34410; starting with slide #22, it addresses the subset of a dense container.
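
To make the cost of the per-element test a bit more concrete, here is a rough sketch in plain C++ (not ArBB, and not taken from the slides). It assumes a cubic container of n*n*n floats stored flat, with x_shift = n*n and y_shift = n; that layout and the function names update_masked and update_interior are only assumptions for illustration. The first variant tests every element, as the posted _if does; the second visits only the interior, so no division, modulo, or branch is needed per element.

```cpp
#include <cstddef>
#include <vector>

// Assumed layout (hypothetical): n*n*n floats stored flat,
// plane stride x_shift = n*n, row stride y_shift = n.

// Variant 1: visit every element and test whether it is interior.
// Each element costs several divisions/modulos and a branch.
void update_masked(std::vector<float>& result, const std::vector<float>& u1,
                   float c1, std::size_t n) {
    const std::size_t x_shift = n * n;
    const std::size_t y_shift = n;
    for (std::size_t i = 0; i < result.size(); ++i) {
        const std::size_t p = i / x_shift;              // plane index
        const std::size_t r = (i % x_shift) / y_shift;  // row index
        const std::size_t c = i % y_shift;              // column index
        if (p >= 1 && p + 1 < n && r >= 1 && r + 1 < n && c >= 1 && c + 1 < n) {
            result[i] += c1 * u1[i];
        }
    }
}

// Variant 2: loop over the interior only. The border is skipped by the
// loop bounds, so the inner loop is a plain multiply-add.
void update_interior(std::vector<float>& result, const std::vector<float>& u1,
                     float c1, std::size_t n) {
    const std::size_t x_shift = n * n;
    const std::size_t y_shift = n;
    for (std::size_t p = 1; p + 1 < n; ++p) {
        for (std::size_t r = 1; r + 1 < n; ++r) {
            const std::size_t base = p * x_shift + r * y_shift;
            for (std::size_t c = 1; c + 1 < n; ++c) {
                result[base + c] += c1 * u1[base + c];
            }
        }
    }
}
```

Both variants should produce the same result; whether the second is actually faster for your sizes is something you would have to measure.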
