Strong symbol in shared library overridden by weak symbol in static library

Hi,

I have a weird question about strong and weak symbols.

Generally, a weak symbol is ignored if a strong symbol with the same name is present.

But I found that in some cases the strong symbol gets overridden by a weak symbol!

The case is as follows (see my test sources below):

A strong symbol, foo(), is defined in a shared library, libstrong2.so. It calls a function, bar(), which is defined in a static library (libweak.a) that also contains a weak definition of foo().

If all libraries are shared (libweak.so instead of libweak.a), everything is OK: the foo() used by the final executable is the strong symbol.

But when linking libstrong2.so and libweak.a together, the foo() in the final executable is the weak symbol!

I'm confused because the weak symbol overrides the strong symbol!

The following are the source files and commands used in my test case.

strong1.c:

#include <stdio.h>

void foo ()
{
    printf("%s:%s\n", __FILE__, __func__);
}

%icc -fPIC -shared -o libstrong1.so strong1.c
%nm libstrong1.so | grep foo
0000000000000560 T foo

strong2.c:

#include <stdio.h>

extern void bar();

void foo ()
{
    bar();
    printf("%s:%s\n", __FILE__, __func__);
}

%icc -fPIC -shared -o libstrong2.so strong2.c
%nm libstrong2.so | grep foo
00000000000005b0 T foo
% nm libstrong2.so | grep bar
U bar

weak.c:

#include <stdio.h>

extern void foo() __attribute__((weak));
extern void bar();

void foo ()
{
    printf("%s:%s\n", __FILE__, __func__);
}

void bar()
{
    printf("%s:%s\n", __FILE__, __func__);
}

%icc -fPIC -c weak.c -o weak.o
%ar cr libweak.a weak.o
%nm libweak.a | grep foo
0000000000000000 W foo
%nm libweak.a | grep bar
0000000000000020 T bar

%icc -fPIC -shared -o libweak.so weak.c
%nm libweak.so | grep foo
0000000000000580 W foo
%nm libweak.so | grep bar
00000000000005a0 T bar

test.c:

extern void foo();

int main (int argc, char ** argv)
{
    foo();

    return 0;
}

Case 1: all shared libraries
Remove libweak.a, leaving just libstrong1.so, libstrong2.so and libweak.so.
% icc -L. -lstrong1 -lweak test.c
% nm a.out | grep foo
U foo
% ./a.out
strong1.c:foo

% icc -L. -lstrong2 -lweak test.c
% nm a.out | grep foo
U foo
%
% ./a.out
weak.c:bar
strong2.c:foo

Case 2: replace libweak.so with libweak.a

% icc -L. -lstrong1 -lweak test.c
%ldd a.out
libstrong1.so => ./libstrong1.so (0x00002b26b90e2000)
libm.so.6 => /lib64/libm.so.6 (0x00000031c4200000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00000031d3400000)
libc.so.6 => /lib64/libc.so.6 (0x00000031c3e00000)
libdl.so.2 => /lib64/libdl.so.2 (0x00000031c4600000)
libimf.so => /opt/intel/Compiler/11.1/059/lib/intel64/libimf.so (0x00002b26b92f7000)
libsvml.so => /opt/intel/Compiler/11.1/059/lib/intel64/libsvml.so (0x00002b26b968b000)
libintlc.so.5 => /opt/intel/Compiler/11.1/059/lib/intel64/libintlc.so.5 (0x00002b26b98a2000)

%nm a.out | grep foo
U foo
%./a.out
strong1.c:foo

% icc -L. -lstrong2 -lweak test.c
%ldd a.out
libstrong2.so => ./libstrong2.so (0x00002ba45d23a000)
libm.so.6 => /lib64/libm.so.6 (0x00000031c4200000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00000031d3400000)
libc.so.6 => /lib64/libc.so.6 (0x00000031c3e00000)
libdl.so.2 => /lib64/libdl.so.2 (0x00000031c4600000)
libimf.so => /opt/intel/Compiler/11.1/059/lib/intel64/libimf.so (0x00002ba45d44f000)
libsvml.so => /opt/intel/Compiler/11.1/059/lib/intel64/libsvml.so (0x00002ba45d7e3000)
libintlc.so.5 => /opt/intel/Compiler/11.1/059/lib/intel64/libintlc.so.5 (0x00002ba45d9fa000)
/lib64/ld-linux-x86-64.so.2 (0x00000031c3a00000)

%nm a.out | grep foo
00000000004015f0 W foo
%./a.out
weak.c:foo
<---- Here foo is the one defined in weak.c!

Any idea?

Regards,

Jie


Hi Jie,

Thank you for submitting your issue here. First, note that this is not actually an ICC issue: GCC behaves the same way. The "issue" is related to the linker, "ld".

Second, this is NOT really an "issue": the behaviour is correct, and you may have misunderstood "weak" symbols. The linker "ld" follows several rules, and some of them take precedence over others. You will need to run more tests.

Third, I am not familiar with all of the "ld" rules, but I have run some tests and tried to draw some conclusions to help you understand the issue. I'd like to discuss them with you.

I am using your test case. First of all, let's ignore strong1.c, since strong2.c alone is enough to show the behaviour. I will first build all the static and dynamic libraries as below (I am using GCC):

 $(cc) -fPIC -shared -o libstrong2.so strong2.c
 $(cc) -fPIC -c strong2.c -o strong2.o
 ar cr libstrong2.a strong2.o
 $(cc) -fPIC -c weak.c -o weak.o
 ar cr libweak.a weak.o
 $(cc) -fPIC -shared -o libweak.so weak.c

Now I have: libstrong2.a, libstrong2.so, libweak.a, libweak.so. Let's test them:

(1) case1: when both libraries are dynamic, the order of the libraries matters. The rule seems to be: the library (symbol) that comes first is always used.

#gcc test.c libweak.so libstrong2.so && ./a.out
weak.c:foo
#gcc test.c libstrong2.so libweak.so && ./a.out
weak.c:bar
strong2.c:foo

I hope it is easy to see what this test shows. When you use dynamic libraries together and both libraries define the same symbol, the symbol in the first library is always used.

Note that in this case there is no need for the "weak" attribute in weak.c: you get the SAME results without it, and without any link error. I will explain the reason in case #2 below.

(2) case2: when both libraries are static.
First, try to understand "weak" symbols: in fact, weak definitions only have a meaning for static archives.
Please read http://sourceware.org/bugzilla/show_bug.cgi?id=3946 for a reference on the above rule (weak definitions only and exclusively have a meaning for static archives). For a dynamic library, "weak" is not needed because the link will not fail with "multiple definition of same symbol". This explains why, in case #1, you get the same results with and without the "weak" attribute.

By the way, the sample in the Wikipedia article "http://en.wikipedia.org/wiki/Weak_symbol" uses a dynamic library, so I am not sure which one is correct. As "weak" symbols are not defined by the C or C++ standard, it may depend entirely on the implementation of GNU ld.

Let's first build without the "weak" attribute. The result is as below:
#gcc test.c libweak.a libstrong2.a && ./a.out
weak.c:foo
#gcc test.c libstrong2.a libweak.a && ./a.out
libweak.a(weak.o): In function `foo':
weak.c:(.text+0x0): multiple definition of `foo'
libstrong2.a(strong2.o):strong2.c:(.text+0x0): first defined here
collect2: ld returned 1 exit status
#

We can see that the order of the libraries STILL matters for static libraries. When libweak.a comes first, the linker uses the "foo" in libweak.a; since all symbols are then resolved, libstrong2.a has no effect. But if libstrong2.a comes first, the linker uses the "foo" in libstrong2.a; because strong2.o refers to the external symbol "bar", the linker goes on to look for it in the other libraries (here libweak.a), and then reports an error because the object that defines "bar" defines "foo" again (multiple definition). This is where the "weak" attribute helps. After adding the "weak" attribute in the code, we get the results below:

#gcc test.c libweak.a libstrong2.a && ./a.out
weak.c:foo
#gcc test.c libstrong2.a libweak.a && ./a.out
weak.c:bar
strong2.c:foo
#

From the results, we can see that it works now and the strong symbol is used. But it is not true that "if symbols are defined in multiple libraries, strong symbols will override weak symbols". In fact, it depends on the library order. A weak definition lets a strong definition take its place when both are linked into the same output (and an undefined weak reference simply evaluates to zero instead of causing a link error), but that does not mean a weak symbol will always be overridden by a strong symbol. If the weak symbol comes first and nothing else is needed from the later library, the strong symbol will still not be used. So rule #1 applies here too.
The purpose of "weak" here is to let the linker do its work without reporting the multiple definition error.
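
To make the two meanings of "weak" concrete, here is a minimal sketch, assuming GCC or Clang on an ELF system such as Linux. The names optional_hook, foo_origin and hook_is_defined are hypothetical and exist only for this illustration. It shows a weak reference, which may stay undefined, next to a weak definition, which acts as a default.

```c
#include <stddef.h>

/* Weak reference: declared weak and never defined in this file.
   optional_hook is a hypothetical name used only for this sketch. */
extern void optional_hook(void) __attribute__((weak));

/* Weak definition: a default that a strong foo_origin linked into the
   same executable would replace. foo_origin is also hypothetical. */
__attribute__((weak)) const char *foo_origin(void)
{
    return "weak default";
}

/* Returns 1 if some linked object defined optional_hook, else 0.
   An unresolved weak reference has address zero instead of causing
   an "undefined reference" link error. */
int hook_is_defined(void)
{
    return optional_hook != NULL;
}
```

If this file is linked on its own, hook_is_defined() returns 0 and foo_origin() returns the weak default. Linking in another object file with a strong foo_origin would select that one instead, without a multiple definition error.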

(3) case3: when static libraries and dynamic libraries are used together (have same symbols in them).

#gcc test.c libstrong2.a libweak.so && ./a.out
weak.c:bar
strong2.c:foo
#gcc test.c libstrong2.so libweak.a && ./a.out
weak.c:foo
#gcc test.c libweak.a libstrong2.so && ./a.out
weak.c:foo
#gcc test.c libweak.so libstrong2.a && ./a.out
weak.c:foo
#

Note again that in this case there is no need for the "weak" attribute in weak.c: you get the SAME results without it, and without any link error. So the issue you reported is not related to the "weak" attribute, as the results are the same with or without "weak" in the code.
My understanding of the above results is this: first, the linker tries to resolve the symbols in library order (rule #1). Once all the symbols are resolved, it does not take anything from the remaining libraries (static or dynamic). But if not all symbols are resolved, it continues to resolve them in the other libraries. And if an object pulled in from a static library to resolve those symbols also defines a symbol that was previously resolved from a dynamic library, the static definition overrides the dynamic one.
See below:
gcc test.c libstrong2.a libweak.so && ./a.out -> first try to resolve "foo" in libstrong2.a; as "bar" is then needed, it is found in libweak.so.
gcc test.c libstrong2.so libweak.a && ./a.out -> same as above, but as libweak.a is static, its "foo" overrides the one in the dynamic library.
gcc test.c libweak.a libstrong2.so && ./a.out -> first try to resolve "foo" in libweak.a; as it is resolved there, it is used.
gcc test.c libweak.so libstrong2.a && ./a.out -> first try to resolve "foo" in libweak.so; as it is resolved there, it is used and nothing is taken from libstrong2.a.
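
As a rough illustration, the reasoning above can be written down as a toy resolver in C. This is only a sketch of my understanding, not how ld is actually implemented: every type and function name is hypothetical, each archive is assumed to hold a single object, and the only rules modelled are left-to-right processing, extracting an archive member only when it defines a still-undefined symbol, and letting a definition that lands in the executable replace one that so far came only from a shared library.

```c
#include <string.h>

/* Toy model only: every name here is hypothetical, nothing below is
   taken from GNU ld. Tables are tiny and there are no bounds checks. */
enum kind  { OBJECT, ARCHIVE, SHARED };
enum state { UNDEF, EXE_WEAK, EXE_STRONG, DYNAMIC };

struct def { const char *name; int weak; };
struct input {               /* an ARCHIVE is modelled as one member */
    const char *file;
    enum kind   kind;
    struct def  defs[4];     /* unused slots have name == NULL */
    const char *undefs[4];   /* unused slots are NULL */
};
struct sym    { const char *name; enum state state; const char *origin; };
struct symtab { struct sym syms[16]; int count; };

/* Find a symbol, optionally entering it as undefined if it is new. */
static struct sym *lookup(struct symtab *t, const char *name, int create)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->syms[i].name, name) == 0)
            return &t->syms[i];
    if (!create)
        return NULL;
    t->syms[t->count] = (struct sym){ name, UNDEF, NULL };
    return &t->syms[t->count++];
}

/* Record the symbols an input refers to but does not define. */
static void add_undefs(struct symtab *t, const struct input *in)
{
    for (int i = 0; i < 4 && in->undefs[i]; i++)
        lookup(t, in->undefs[i], 1);
}

/* Link an object or extracted archive member; -1 on a strong clash. */
static int add_object(struct symtab *t, const struct input *in)
{
    for (int i = 0; i < 4 && in->defs[i].name; i++) {
        struct sym *s = lookup(t, in->defs[i].name, 1);
        int weak = in->defs[i].weak;
        if (s->state == EXE_STRONG && !weak)
            return -1;                      /* multiple definition */
        if (s->state == EXE_STRONG || (s->state == EXE_WEAK && weak))
            continue;                       /* keep the earlier definition */
        /* Undefined, weak, or only in a shared library so far. */
        s->state = weak ? EXE_WEAK : EXE_STRONG;
        s->origin = in->file;
    }
    add_undefs(t, in);
    return 0;
}

/* A member is extracted only if it defines a still-undefined symbol. */
static int member_needed(struct symtab *t, const struct input *in)
{
    for (int i = 0; i < 4 && in->defs[i].name; i++) {
        struct sym *s = lookup(t, in->defs[i].name, 0);
        if (s && s->state == UNDEF)
            return 1;
    }
    return 0;
}

/* A shared library supplies only symbols nothing earlier has defined. */
static void add_shared(struct symtab *t, const struct input *in)
{
    for (int i = 0; i < 4 && in->defs[i].name; i++) {
        struct sym *s = lookup(t, in->defs[i].name, 1);
        if (s->state == UNDEF)
            *s = (struct sym){ s->name, DYNAMIC, in->file };
    }
    add_undefs(t, in);
}

/* Process "test.o a b" left to right; report which file supplies foo. */
const char *link_foo(const struct input *a, const struct input *b)
{
    static const struct input test_o = { "test.o", OBJECT, {{0}}, {"foo"} };
    const struct input *cmd[3] = { &test_o, a, b };
    struct symtab t = { .count = 0 };
    for (int i = 0; i < 3; i++) {
        if (cmd[i]->kind == SHARED)
            add_shared(&t, cmd[i]);
        else if ((cmd[i]->kind == OBJECT || member_needed(&t, cmd[i]))
                 && add_object(&t, cmd[i]) < 0)
            return "(multiple definition)";
    }
    struct sym *s = lookup(&t, "foo", 0);
    return (s && s->state != UNDEF) ? s->origin : "(undefined)";
}

/* The four libraries from this thread (foo is weak in libweak). */
const struct input STRONG2_A  = { "libstrong2.a",  ARCHIVE, {{"foo", 0}}, {"bar"} };
const struct input STRONG2_SO = { "libstrong2.so", SHARED,  {{"foo", 0}}, {"bar"} };
const struct input WEAK_A  = { "libweak.a",  ARCHIVE, {{"foo", 1}, {"bar", 0}}, {0} };
const struct input WEAK_SO = { "libweak.so", SHARED,  {{"foo", 1}, {"bar", 0}}, {0} };
```

Under this model, link_foo(&STRONG2_SO, &WEAK_A) reports libweak.a, because weak.o is extracted to satisfy "bar" and brings its "foo" into the executable, while link_foo(&WEAK_SO, &STRONG2_A) reports libweak.so, because "foo" is no longer undefined when libstrong2.a is reached. Both agree with the outputs of the corresponding gcc commands in case 3.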

Summary:
This issue is related to how "ld" works. I am not an expert on the linker "ld"; the above are my observations and analysis for your reference. You may want to run more tests and, if possible, contact the GNU ld developers for the details.
