Optimal vector length

I have a problem that uses a lot of basic vector operations between two vectors, say x and y (4-byte or 8-byte reals). Given that I can choose the length of those two vectors, I would like to know what the most efficient length would be. My guess is one that is evenly divisible by the cache length, but I am not really sure.



Now that the site is accepting replies (grudgingly): the considerations depend on your target architecture. On P4/NetBurst systems, the most important thing is to have the vectors start at 16-byte-aligned addresses. Next, make your loop lengths a multiple of 8, in case the loops are unrolled by that factor. Beyond that, you could search for the greatest length that does not begin to incur more L2 cache misses (this will be smaller with HyperThreading in use, since the logical processors share the cache). Also look for a data placement that reduces cache mapping conflicts (e.g. 64K aliasing); this probably contradicts your idea about a relation to cache length.
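To make the first two points concrete, here is a minimal sketch in C (the thread itself shows no code, so the language and every name here are my own choices). It assumes a C11 compiler with `aligned_alloc`, 8-byte reals, and an unroll factor of 8; `round_up_len`, `alloc_vec` and `differs_by_64k_multiple` are hypothetical helpers, and the last one is only a rough check for the kind of address spacing associated with 64K aliasing, not a full model of the hardware.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ALIGNMENT 16  /* bytes: alignment recommended above for the vectors */
#define UNROLL 8      /* assumed unroll factor; check what your compiler does */

/* Round a requested length up to the next multiple of UNROLL. */
static size_t round_up_len(size_t n) {
    return (n + UNROLL - 1) / UNROLL * UNROLL;
}

/* Allocate a zero-filled vector of n doubles starting on a 16-byte boundary.
   Returns NULL on failure; release the memory with free(). */
static double *alloc_vec(size_t n) {
    size_t bytes = n * sizeof(double);
    /* C11 aligned_alloc wants the size to be a multiple of the alignment. */
    bytes = (bytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    double *p = aligned_alloc(ALIGNMENT, bytes);
    if (p != NULL) {
        memset(p, 0, bytes);
    }
    return p;
}

/* Rough check (an assumption, not a hardware model): returns 1 when the two
   addresses are an exact multiple of 64 KiB apart. */
static int differs_by_64k_multiple(const void *a, const void *b) {
    uintptr_t diff = (uintptr_t)a - (uintptr_t)b;
    return diff % 65536u == 0;
}
```

For example, a requested length of 1003 would be padded to `round_up_len(1003)`, which is 1008, and both x and y would come from `alloc_vec`. If `differs_by_64k_multiple(x, y)` reports a match, padding one of the allocations by a cache line or so is one way to shift the relative placement. Finding the length at which L2 misses start to climb still has to be done by measurement on the target machine.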
