Optimal vector length

I have a problem that uses a lot of basic vector operations between two vectors, say x and y (4-byte or 8-byte reals). Given that I can choose the length of those two vectors, I would like to know what the most efficient length would be. I guess it is one that is evenly divisible by the cache length, but I am not really sure.



Now that the site is accepting replies (grudgingly): the considerations depend on your choice of architecture. On P4/NetBurst systems, the most important thing is to have the vectors start at 16-byte-aligned addresses. Then you can look at making your loop lengths a multiple of 8, in case the loops are unrolled by that factor. Beyond that, you could search for the greatest length that does not start to incur more L2 cache misses (it will be smaller with HyperThreading in use). Also look for a data placement that reduces cache mapping conflicts (e.g. 64K aliasing); this probably contradicts your idea about a relation to the cache length.
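To make the first two points concrete, here is a minimal sketch in C (not the Fortran the question implies), assuming a C11 library that provides `aligned_alloc`. The helper names (`round_up`, `alloc_vector`, `is_aligned`, `aliases_64k`) and the constants are made up for illustration, and the aliasing check is a simplification that only tests whether two addresses are equal modulo 64 KiB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SIMD_ALIGN 16  /* bytes; alignment suggested above for P4/NetBurst */
#define UNROLL 8       /* assumed unroll factor, taken from the reply */

/* Round n up to the next multiple of m (m must be > 0). */
size_t round_up(size_t n, size_t m) {
    assert(m > 0);
    return (n + m - 1) / m * m;
}

/* Allocate at least n doubles, padded to a multiple of UNROLL elements,
 * starting at a 16-byte-aligned address. The padded length is returned
 * through padded_len. The caller releases the memory with free(). */
double *alloc_vector(size_t n, size_t *padded_len) {
    size_t len = round_up(n, UNROLL);
    *padded_len = len;
    /* aligned_alloc wants a size that is a multiple of the alignment;
     * len * sizeof(double) is a multiple of 64 here, so that holds. */
    return aligned_alloc(SIMD_ALIGN, len * sizeof(double));
}

/* Nonzero if p is aligned to a bytes. */
int is_aligned(const void *p, size_t a) {
    return ((uintptr_t)p % a) == 0;
}

/* Simplified check (an assumption, not an exact hardware model):
 * nonzero if the two addresses differ by a multiple of 64 KiB. */
int aliases_64k(const void *a, const void *b) {
    uintptr_t ua = (uintptr_t)a, ub = (uintptr_t)b;
    uintptr_t d = ua > ub ? ua - ub : ub - ua;
    return (d % 65536u) == 0;
}
```

One way to use this: allocate x and y with `alloc_vector`, and if `aliases_64k(x, y)` reports a conflict, shift one of the vectors by a few cache lines (for example by allocating it with some extra leading padding). The best length with respect to L2 misses still has to be found by timing on the target machine.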
