Getting icpc to inline std::string member functions

I wonder how to get icpc to inline member functions of std::string. It seems that when libstdc++ contains an out-of-line implementation of such a function, icpc calls that implementation instead of inlining it, even for the simplest functions such as std::string::begin().


The std::string class has a lot of member functions, and I'll check the following:
...
empty()
size()
insert()
operator[]
...

 

libstdc++ is designed (to speed up compile time)
to prevent users from doing any of the "common" template
instantiations themselves; instead, those instantiations are
supposed to be linked in from the library.

To accomplish this, you'll see this kind of extern template
directive in the GNU header files:

  // Inhibit implicit instantiations for required instantiations,
  // which are defined via explicit instantiations elsewhere.
#if _GLIBCXX_EXTERN_TEMPLATE > 0
  extern template class basic_string<char>;
...
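
As a rough illustration of the mechanism, here is a minimal sketch of how an extern template declaration pairs with an explicit instantiation definition. `Box` and `read_box` are hypothetical names invented for this example, not anything from libstdc++; in a real library the two directives would normally live in a header and in one library source file respectively, whereas here they share a single file to keep the sketch self-contained.

```cpp
// Hypothetical stand-in for a library template such as basic_string.
template <typename T>
class Box {
public:
    explicit Box(T value) : value_(value) {}

    // Defined in the class body, so implicitly inline.
    T get() const { return value_; }

private:
    T value_;
};

// Explicit instantiation declaration (normally in the header):
// asks the compiler not to emit out-of-line copies of Box<int>
// members in this translation unit.
extern template class Box<int>;

// Hypothetical user code that calls a member of Box<int>.
int read_box(const Box<int>& b) {
    return b.get();
}

// Explicit instantiation definition (normally in exactly one
// library source file): this is where the members are emitted.
template class Box<int>;
```

As far as I understand the C++11 rules, an extern template declaration does not forbid the compiler from instantiating an inline member just so that its body can be inlined, so whether a call such as `b.get()` ends up inlined is still up to the optimizer.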

If you want to instantiate and inline these template functions yourself,
you'll need to override the value of _GLIBCXX_EXTERN_TEMPLATE, so
you could do something like:

#include <bits/c++config.h>
#undef _GLIBCXX_EXTERN_TEMPLATE
#define _GLIBCXX_EXTERN_TEMPLATE 0 // override value
#include <string>

But don't be surprised if your compile time increases significantly.
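
For completeness, the override could be wrapped so that the file still builds with other standard libraries. This is only a sketch under a few assumptions: `_GLIBCXX_EXTERN_TEMPLATE` is an internal libstdc++ macro that may change between releases, the `__has_include` check needs a reasonably recent compiler, and `twice_length` is a hypothetical helper added just to have something that calls std::string members.

```cpp
// Only touch the macro when the libstdc++ configuration header exists.
#if defined(__has_include)
#  if __has_include(<bits/c++config.h>)
#    include <bits/c++config.h>
#    undef _GLIBCXX_EXTERN_TEMPLATE
#    define _GLIBCXX_EXTERN_TEMPLATE 0  // override value (internal macro, assumption)
#  endif
#endif

#include <cstddef>
#include <string>

// Hypothetical helper: with the override in place, length() and size()
// are instantiated in this translation unit instead of being declared
// extern, which gives the compiler the chance to inline them.
std::size_t twice_length(const std::string& s) {
    return s.length() + s.size();
}
```

The override has to appear before the first standard header that includes <string>, so it is probably easiest to keep it at the very top of the source file.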

Judy

Thank you, Judy. I think a plain #undef of _GLIBCXX_EXTERN_TEMPLATE should also work, because the #if directive checks whether the macro's value is greater than zero, and an undefined macro is treated as 0 in an #if expression:
...
#if _GLIBCXX_EXTERN_TEMPLATE > 0
...
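
The preprocessor behaviour behind this can be checked in isolation. The sketch below uses a made-up macro, `DEMO_EXTERN_TEMPLATE`, and a made-up constant, `kExternTemplateActive`, rather than the real libstdc++ macro: once the macro is undefined, the preprocessor replaces its name with 0 inside the #if expression, so the `> 0` test fails.

```cpp
// Hypothetical macro standing in for _GLIBCXX_EXTERN_TEMPLATE.
#define DEMO_EXTERN_TEMPLATE 1
#undef DEMO_EXTERN_TEMPLATE

// An identifier that is not defined as a macro evaluates to 0 in #if,
// so this condition is false after the #undef above.
#if DEMO_EXTERN_TEMPLATE > 0
constexpr bool kExternTemplateActive = true;
#else
constexpr bool kExternTemplateActive = false;
#endif

static_assert(!kExternTemplateActive,
              "an undefined macro should compare as 0 in #if");
```

Some compilers can warn about this pattern (for example with -Wundef in GCC), which is one reason to prefer an explicit definition to 0 over a bare #undef.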

I was comparing icc and gcc. gcc 4.7 inlines those small functions and does not leave separate instantiated copies of them in the object file. Maybe icc should just do the same. It makes little sense not to inline those functions.

 

Can you please give an example (a test case) of a routine that GNU 4.7 inlines but icpc does not? And what is your evidence that GNU is inlining it?

thanks,

Judy

Quote:

Judith Ward (Intel) wrote:

 

Can you please give an example (a test case) of a routine that GNU 4.7 inlines but icpc does not? And what is your evidence that GNU is inlining it?

thanks,

Judy

$ cat test.cc
#include <string>
#include <cstdio>
using namespace std;
int main() {
    string s("aaaa");
    printf("%lu\n", s.length() + s.size() + s[0]);
}
$ icpc -c test.cc -O2
$ nm -C test.o
                 U _Unwind_Resume
                 U std::string::size() const
                 U std::string::length() const
                 U std::allocator<char>::allocator()
                 U std::allocator<char>::~allocator()
                 U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)
                 U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()
                 U std::string::operator[](unsigned long)
                 U __gxx_personality_v0
                 U __intel_new_proc_init
0000000000000000 T main
0000000000000000 r main$$LSDA
                 U printf
$ g++ -c test.cc -O2
$ nm -C test.o
                 U _Unwind_Resume
                 U std::string::_M_leak()
                 U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)
                 U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()
                 U __gxx_personality_v0
0000000000000000 T main
                 U printf
$ objdump -S test.o        
test.o:     file format elf64-x86-64
Disassembly of section .text.startup:
0000000000000000 <main>:
   0:    53                       push   %rbx
   1:    be 00 00 00 00           mov    $0x0,%esi
   6:    48 83 ec 20              sub    $0x20,%rsp
   a:    48 8d 7c 24 10           lea    0x10(%rsp),%rdi
   f:    48 8d 54 24 0f           lea    0xf(%rsp),%rdx
  14:    e8 00 00 00 00           callq  19 <main+0x19>       <---- string ctor
  19:    48 8b 44 24 10           mov    0x10(%rsp),%rax
  1e:    48 8d 7c 24 10           lea    0x10(%rsp),%rdi
  23:    48 8b 58 e8              mov    -0x18(%rax),%rbx
  27:    48 01 db                 add    %rbx,%rbx
  2a:    e8 00 00 00 00           callq  2f <main+0x2f>       <---- string::_M_leak
  2f:    48 8b 44 24 10           mov    0x10(%rsp),%rax
  34:    bf 00 00 00 00           mov    $0x0,%edi
  39:    48 0f be 30              movsbq (%rax),%rsi
  3d:    31 c0                    xor    %eax,%eax
  3f:    48 01 de                 add    %rbx,%rsi
  42:    e8 00 00 00 00           callq  47 <main+0x47>       <---- printf
  47:    48 8d 7c 24 10           lea    0x10(%rsp),%rdi
  4c:    e8 00 00 00 00           callq  51 <main+0x51>       <---- string dtor
  51:    48 83 c4 20              add    $0x20,%rsp
  55:    31 c0                    xor    %eax,%eax
  57:    5b                       pop    %rbx
  58:    c3                       retq   
  59:    48 8d 7c 24 10           lea    0x10(%rsp),%rdi
  5e:    48 89 c3                 mov    %rax,%rbx
  61:    e8 00 00 00 00           callq  66 <main+0x66>       <---- string dtor
  66:    48 89 df                 mov    %rbx,%rdi
  69:    e8 00 00 00 00           callq  6e <main+0x6e>       <---- _Unwind_Resume
$
