To explain what's going on here, let's talk first about your original source files, with
a.h (1):
void foo() __attribute__((weak));
and:
a.c (1):
#include "a.h"
#include <stdio.h>
void foo() { printf("%s\n", __FILE__); }
The mixture of .c and .cpp files in your sample code is irrelevant to the issues, and all the code is C, so we'll say that main.cpp is main.c and do all compiling and linking with gcc:
$ gcc -Wall -c main.c a.c b.c
$ ar rcs a.a a.o
$ ar rcs b.a b.o
First let's review the differences between a weakly declared symbol, like
your:
void foo() __attribute__((weak));
and a strongly declared symbol, like
void foo();
which is the default:
When a weak reference to foo (i.e. a reference to weakly declared foo) is linked in a program, the linker need not find a definition of foo anywhere in the linkage: it may remain undefined. If a strong reference to foo is linked in a program, the linker needs to find a definition of foo.
A linkage may contain at most one strong definition of foo (i.e. a definition of foo that declares it strongly). Otherwise a multiple-definition error results. But it may contain multiple weak definitions of foo without error.
If a linkage contains one or more weak definitions of foo and also a strong definition, then the linker chooses the strong definition and ignores the weak ones.
If a linkage contains just one weak definition of foo and no strong definition, inevitably the linker uses the one weak definition.
If a linkage contains multiple weak definitions of foo and no strong definition, then the linker chooses one of the weak definitions arbitrarily.
Next, let's review the differences between inputting an object file in a linkage
and inputting a static library.
A static library is merely an ar archive of object files that we may offer to the linker, from which it selects the ones it needs to carry on the linkage.
When an object file is input to a linkage, the linker unconditionally links it into the output file.
When a static library is input to a linkage, the linker examines the archive to find any object files within it that provide definitions it needs for unresolved symbol references that have accrued from input files already linked. If it finds any such object files in the archive, it extracts them and links them into the output file, exactly as if they were individually named input files and the static library were not mentioned at all.
With these observations in mind, consider the compile-and-link command:
gcc main.c a.o b.o
Behind the scenes gcc breaks it down, as it must, into a compile step and a link step, just as if you had run:
gcc -c main.c # compile
gcc main.o a.o b.o # link
All three object files are linked unconditionally into the (default) program ./a.out. a.o contains a weak definition of foo, as we can see:
$ nm --defined a.o
0000000000000000 W foo
Whereas b.o contains a strong definition:
$ nm --defined b.o
0000000000000000 T foo
The linker will find both definitions and choose the strong one from b.o, as we can also see:
$ gcc main.o a.o b.o -Wl,-trace-symbol=foo
main.o: reference to foo
a.o: definition of foo
b.o: definition of foo
$ ./a.out
b.c
Reversing the linkage order of a.o and b.o will make no difference: there's still exactly one strong definition of foo, the one in b.o.
By contrast consider the compile-and-link command:
gcc main.c a.a b.a
which breaks down into:
gcc -c main.c # compile
gcc main.o a.a b.a # link
Here, only main.o is linked unconditionally. That puts an undefined weak reference to foo into the linkage:
$ nm --undefined main.o
w foo
U _GLOBAL_OFFSET_TABLE_
U puts
That weak reference to foo does not need a definition. So the linker will not attempt to find a definition that resolves it in any of the object files in either a.a or b.a and will leave it undefined in the program, as we can see:
$ gcc main.o a.a b.a -Wl,-trace-symbol=foo
main.o: reference to foo
$ nm --undefined a.out
w __cxa_finalize@@GLIBC_2.2.5
w foo
w __gmon_start__
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
U __libc_start_main@@GLIBC_2.2.5
U puts@@GLIBC_2.2.5
Hence:
$ ./a.out
no foo
Again, it doesn't matter if you reverse the linkage order of a.a and b.a, but this time it is because neither of them contributes anything to the linkage.
Let's turn now to the different behavior you discovered by changing a.h and a.c to:
a.h (2):
void foo();
a.c (2):
#include "a.h"
#include <stdio.h>
void __attribute__((weak)) foo() { printf("%s\n", __FILE__); }
Once again:
$ gcc -Wall -c main.c a.c b.c
main.c: In function ‘main’:
main.c:4:18: warning: the address of ‘foo’ will always evaluate as ‘true’ [-Waddress]
int main() { if (foo) foo(); else printf("no foo\n"); }
See that warning? main.o now contains a strongly declared reference to foo:
$ nm --undefined main.o
U foo
U _GLOBAL_OFFSET_TABLE_
so the code (when linked) must have a non-null address for foo. Proceeding:
$ ar rcs a.a a.o
$ ar rcs b.a b.o
Then try the linkage:
$ gcc main.o a.o b.o
$ ./a.out
b.c
And with the object files reversed:
$ gcc main.o b.o a.o
$ ./a.out
b.c
As before, the order makes no difference. All the object files are linked. b.o provides a strong definition of foo, a.o provides a weak one, so b.o wins.
Next try the linkage:
$ gcc main.o a.a b.a
$ ./a.out
a.c
And with the order of the libraries reversed:
$ gcc main.o b.a a.a
$ ./a.out
b.c
That does make a difference. Why? Let's redo the linkages with diagnostics:
$ gcc main.o a.a b.a -Wl,-trace,-trace-symbol=foo
/usr/bin/x86_64-linux-gnu-ld: mode elf_x86_64
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/crti.o
/usr/lib/gcc/x86_64-linux-gnu/7/crtbeginS.o
main.o
(a.a)a.o
libgcc_s.so.1 (/usr/lib/gcc/x86_64-linux-gnu/7/libgcc_s.so.1)
/lib/x86_64-linux-gnu/libc.so.6
(/usr/lib/x86_64-linux-gnu/libc_nonshared.a)elf-init.oS
/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
libgcc_s.so.1 (/usr/lib/gcc/x86_64-linux-gnu/7/libgcc_s.so.1)
/usr/lib/gcc/x86_64-linux-gnu/7/crtendS.o
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/crtn.o
main.o: reference to foo
a.a(a.o): definition of foo
Ignoring the default libraries, the only object files of ours that got linked were:
main.o
(a.a)a.o
And the definition of foo was taken from the archive member a.o of a.a:
a.a(a.o): definition of foo
Reversing the library order:
$ gcc main.o b.a a.a -Wl,-trace,-trace-symbol=foo
/usr/bin/x86_64-linux-gnu-ld: mode elf_x86_64
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/crti.o
/usr/lib/gcc/x86_64-linux-gnu/7/crtbeginS.o
main.o
(b.a)b.o
libgcc_s.so.1 (/usr/lib/gcc/x86_64-linux-gnu/7/libgcc_s.so.1)
/lib/x86_64-linux-gnu/libc.so.6
(/usr/lib/x86_64-linux-gnu/libc_nonshared.a)elf-init.oS
/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
libgcc_s.so.1 (/usr/lib/gcc/x86_64-linux-gnu/7/libgcc_s.so.1)
/usr/lib/gcc/x86_64-linux-gnu/7/crtendS.o
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/crtn.o
main.o: reference to foo
b.a(b.o): definition of foo
This time the object files linked were:
main.o
(b.a)b.o
And the definition of foo was taken from b.o in b.a:
b.a(b.o): definition of foo
In the first linkage, the linker had an unresolved strong reference to foo for which it needed a definition when it reached a.a. So it looked in the archive for an object file that provides a definition, and found a.o. That definition was a weak one, but that didn't matter. No strong definition had been seen. a.o was extracted from a.a and linked, and the reference to foo was thus resolved. Next b.a was reached, where a strong definition of foo would have been found in b.o, if the linker still needed one and looked for it. But it didn't need one any more and didn't look. The linkage:
gcc main.o a.a b.a
is exactly the same as:
gcc main.o a.o
And likewise the linkage:
$ gcc main.o b.a a.a
is exactly the same as:
$ gcc main.o b.o
Your real problem...
... emerges in one of your comments to the post:
I want to override [the] original function implementation when linking with a testing framework.
You want to link a program inputting some static library lib1.a which has some member file1.o that defines a symbol foo, and you want to knock out that definition of foo and link a different one that is defined in some other object file file2.o.
__attribute__((weak)) isn't applicable to that problem. The solution is more elementary. You just make sure to input file2.o to the linkage before you input lib1.a (and before any other input that provides a definition of foo). Then the linker will resolve references to foo with the definition provided in file2.o and will not try to find any other definition when it reaches lib1.a. Provided the linkage needs nothing else that file1.o defines, the linker will not consume lib1.a(file1.o) at all. It might as well not exist.