non-optimization of STL iterators

I've found that icc does not match g++ when optimizing code that uses STL iterators. Here are the obstacles:
1). icc doesn't match the restrict scheme of g++. Simple workaround: use the options icc -D__restrict__=restrict -restrict so that pointers passed to iterators may be declared g++ style with __restrict__ qualifiers. icc optimizations of transform() depend on knowing that the objects do not overlap.
2). STL templates such as inner_product(), transform(), fill(), accumulate(), partial_sum() use != as the termination condition, which leaves the loop count potentially ambiguous. icc shuts off optimization for loops which aren't unambiguously countable. There are 2 possible hacks to get around this:
a) supply your own hacked STL, with the condition changed from != to <
b) apply the scheme used by Dinkumware, g++, and STLport to figure out when a copy() is countable. They determine when copy() may be replaced by memmove(). They also replace fill() by memset() when operating on char objects.
icc does distinguish the cases where operators are overloaded to mean something else, so it doesn't break the code by attempting to vectorize.
3). Effect of #pragma [ivdep|vector always|vector aligned] doesn't penetrate into iterators.
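
To illustrate point 1, here is a minimal sketch of declaring the pointers g++ style and handing them to transform(). The function name scale() and its parameters are made up for this example. With g++ the __restrict__ keyword is accepted as is; with icc the intent is to build with the -D__restrict__=restrict -restrict options mentioned above. The lambda needs a C++11 compiler mode; a small functor would do the same job on older ones.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of any library: writes src[i] * factor to dst[i].
// The __restrict__ qualifiers promise the compiler that src and dst do not
// overlap, which is the information the optimizer needs for transform().
void scale(const float* __restrict__ src, float* __restrict__ dst,
           std::size_t n, float factor) {
    assert(src != dst);  // cheap sanity check; it does not prove non-overlap
    std::transform(src, src + n, dst,
                   [factor](float x) { return x * factor; });
}
```

Calling scale(in, out, n, 2.0f) on two separate arrays doubles each element of in into out. Passing overlapping ranges would break the promise made by the qualifiers.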

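The two hacks under point 2 can be sketched roughly as follows. This is not the code of Dinkumware, g++, or STLport; the names sum_lt(), copy_impl() and copy_dispatch() are invented for the illustration, and the sketch is restricted to plain pointers. It assumes a C++11 compiler for <type_traits>.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// (a) accumulate()-style loop over pointers, terminated with < rather than !=,
//     so the trip count is simply last - first.
template <typename T>
T sum_lt(const T* first, const T* last, T init) {
    for (; first < last; ++first)
        init = init + *first;
    return init;
}

// (b) copy() that is handed to memmove() when the element type allows it.
//     The overload taking std::true_type is chosen for trivially copyable T.
template <typename T>
T* copy_impl(const T* first, const T* last, T* out, std::true_type) {
    const std::size_t n = static_cast<std::size_t>(last - first);
    if (n != 0)
        std::memmove(out, first, n * sizeof(T));
    return out + n;
}

// Fallback for types with non-trivial copy semantics: element-wise assignment.
template <typename T>
T* copy_impl(const T* first, const T* last, T* out, std::false_type) {
    for (; first < last; ++first, ++out)
        *out = *first;
    return out;
}

// Entry point: picks one of the two overloads at compile time.
template <typename T>
T* copy_dispatch(const T* first, const T* last, T* out) {
    assert(first <= last);
    return copy_impl(first, last, out,
                     typename std::is_trivially_copyable<T>::type());
}
```

For an int array, copy_dispatch() goes through memmove(), while a class with a user-defined copy assignment takes the element-wise loop, so overloaded operators keep their meaning.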