lfetch instruction is key to achieve good performance...

"lfetch instruction is key to achieve good performance...", say the Intel optimization manuals.

Some consequences (and problems?) follow, I think.

Are software design fundamentals such as algorithms and data structures now (in a way) out of date?

(1) Data structures: We are used to the idea that data elements can and
should be placed into structures for logical reasons only (application logic, not HW logic). Now, mixing floats and integers in one structure carries an efficiency penalty (lfetch) if we lfetch roughly by the number of bytes in the structure.
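
To make the layout question concrete, here is a minimal sketch in C. It is an illustration only, not something taken from the Intel manuals. The struct names, N_ITEMS and PREFETCH_AHEAD are hypothetical, and it assumes GCC's __builtin_prefetch, which GCC is expected to turn into an lfetch on IA-64 (worth checking in the generated assembly).

```c
#include <stddef.h>

#define N_ITEMS        256
#define PREFETCH_AHEAD 16   /* hypothetical distance in elements; tune by measurement */

/* Mixed layout: integer and float fields share every cache line. */
struct particle_mixed {
    int   id;
    int   flags;
    float x;
    float y;
};

/* Split layout: one sequential stream per field, so a loop that needs
 * only x fetches only x. */
struct particles_split {
    int   id[N_ITEMS];
    int   flags[N_ITEMS];
    float x[N_ITEMS];
    float y[N_ITEMS];
};

/* Mixed version: every prefetched line also drags in id, flags and y. */
float sum_x_mixed(const struct particle_mixed *p, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&p[i + PREFETCH_AHEAD], 0 /* read */, 3);
        sum += p[i].x;
    }
    return sum;
}

/* Split version: the prefetched lines contain nothing but x values. */
float sum_x_split(const struct particles_split *p, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&p->x[i + PREFETCH_AHEAD], 0 /* read */, 3);
        sum += p->x[i];
    }
    return sum;
}
```

Both functions return the same sum; the difference is only in how many useful bytes each prefetched cache line carries. Whether the split layout pays off on a given Itanium has to be measured.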

(2) Algorithms: Code based on linked lists and hashing is a nice idea, or at
least a popular one.
Itanium loves sequential order (lfetch efficiency). We can't easily change
algorithms to be efficient for the "Itanium way of working". Compilers can't change
algorithms, even if they can more easily (though not too easily) handle mixed data structures.
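
The contrast between pointer chasing and sequential order can be sketched as follows. This is a hedged illustration in C, assuming GCC's __builtin_prefetch; struct node, sum_list, sum_array and the distance AHEAD are hypothetical names chosen for the example.

```c
#include <stddef.h>

#define AHEAD 16   /* hypothetical prefetch distance in elements */

struct node {
    long value;
    struct node *next;
};

/* Linked list: the address of the next node is known only after the
 * current node has been loaded, so we can run just one node ahead. */
long sum_list(const struct node *head)
{
    long sum = 0;
    for (const struct node *n = head; n != NULL; n = n->next) {
        if (n->next != NULL)
            __builtin_prefetch(n->next, 0 /* read */, 1);
        sum += n->value;
    }
    return sum;
}

/* Array: every address is computable in advance, so the prefetch can
 * run AHEAD elements in front of the load that uses the data. */
long sum_array(const long *values, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + AHEAD < n)
            __builtin_prefetch(&values[i + AHEAD], 0 /* read */, 1);
        sum += values[i];
    }
    return sum;
}
```

In the list version the prefetch is issued only one node before the data is needed, which is usually too late to hide a memory access. That is the sense in which a compiler cannot fix the algorithm: the dependency is in the data structure itself.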

Are compilers ready to generate truly efficient lfetches now, and if not, when?
What is the status of gcc?
Is it good enough for the kernel's special efficiency needs?

What about porting old SW and its efficiency (data structures, algorithms)?
New SW is easier, but presents a human problem: changing an old way of
thinking. We have long been taught to believe that the HW serves us, and that we
are not supposed to change our way of thinking based on the current HW.

I searched my 2.4.19 Linux kernel ( find . -path "*ia64*" | xargs grep lfetch).
The "entry.S" has 6 lfetches (enough for the whole of Linux?).
(The "processor.h" file has an inline function for lfetch, but is it actually used?)

How should lfetches be organized, at a practical level, in a small, new, fast kernel?
Should some of the kernel's main control data be fetched on entry into the kernel? Should the data be organized for that in a certain way? In groups?
Fetching step by step: if we enter sem_wait, for example, do we fetch the semaphore
data, and so on...?
Data in sequential tables if possible?
Do we fetch backing store data before loadrs?
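
The "fetch step by step" idea for sem_wait could look roughly like this. It is a sketch under stated assumptions, not code from any real kernel: struct nano_sem, struct nano_task and nano_sem_try_wait are hypothetical, the code assumes a single CPU with interrupts disabled (no locking is shown), and it relies on GCC's __builtin_prefetch being lowered to lfetch on IA-64.

```c
#include <stddef.h>

struct nano_task {
    struct nano_task *next_waiter;
    int id;
};

struct nano_sem {
    int count;                   /* free slots */
    struct nano_task *waiters;   /* head of the wait list */
};

/* Hint that the line at addr will be written soon. */
static inline void prefetch_for_write(const void *addr)
{
    __builtin_prefetch(addr, 1 /* write */, 3 /* keep in cache */);
}

/* Returns 1 if the semaphore was taken, 0 if the caller was queued. */
int nano_sem_try_wait(struct nano_sem *sem, struct nano_task *self)
{
    /* Start fetching the semaphore as early as possible. */
    prefetch_for_write(sem);

    /* Unrelated entry work goes here so that it overlaps with the
     * memory access started above. */
    self->next_waiter = NULL;

    if (sem->count > 0) {
        sem->count--;
        return 1;
    }

    /* No free slot: put the caller at the head of the wait list. */
    self->next_waiter = sem->waiters;
    sem->waiters = self;
    return 0;
}
```

The prefetch only helps if there is enough independent work between it and the first real access to sem. Keeping count and waiters next to each other means one prefetch covers both.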

I ended up with these questions when I developed a simple tutorial "nano-sized" kernel for Itanium (http://www.evitech.fi/~tk/wbs/README.html).

e-mail: tk(at)evitech.fi
URL: http://www.evitech.fi/~tk/


Even though you found only 6 prefetch instructions, we should remember that this instruction is just a hint, so the number of prefetches actually performed is less than or equal to 6.
