The Basics: Single Inheritance
As we discussed in class, single inheritance leads to an object layout with base class data laid out before derived class data. So if classes A and B are defined thusly:
class A {
public:
    int a;
};
class B : public A {
public:
    int b;
};
then objects of type B are laid out like this (where "b" is a pointer to such an object):
b --> +-----------+
      |     a     |
      +-----------+
      |     b     |
      +-----------+
If you have virtual methods:
class A {
public:
    int a;
    virtual void v();
};
class B : public A {
public:
    int b;
};
then you'll have a vtable pointer as well:
                           +-----------------------+
                           |     0 (top_offset)    |
                           +-----------------------+
b --> +----------+         | ptr to typeinfo for B |
      |  vtable  |-------> +-----------------------+
      +----------+         |         A::v()        |
      |     a    |         +-----------------------+
      +----------+
      |     b    |
      +----------+
That is, top_offset and the typeinfo pointer live above the location to which the vtable pointer points.
Simple Multiple Inheritance
Now consider multiple inheritance:
class A {
public:
    int a;
    virtual void v();
};
class B {
public:
    int b;
    virtual void w();
};
class C : public A, public B {
public:
    int c;
};
In this case, objects of type C are laid out like this:
                           +-----------------------+
                           |     0 (top_offset)    |
                           +-----------------------+
c --> +----------+         | ptr to typeinfo for C |
      |  vtable  |-------> +-----------------------+
      +----------+         |         A::v()        |
      |     a    |         +-----------------------+
      +----------+         |    -8 (top_offset)    |
      |  vtable  |---+     +-----------------------+
      +----------+   |     | ptr to typeinfo for C |
      |     b    |   +---> +-----------------------+
      +----------+         |         B::w()        |
      |     c    |         +-----------------------+
      +----------+
...but why? Why two vtables in one? Well, think about type substitution. If I have a pointer-to-C, I can pass it to a function that expects a pointer-to-A or to a function that expects a pointer-to-B. If a function expects a pointer-to-A and I want to pass it the value of my variable c (of type pointer-to-C), I'm already set. Calls to A::v() can be made through the (first) vtable, and the called function can access the member a through the pointer I pass in the same way as it can through any pointer-to-A.
However, if I pass the value of my pointer variable c to a function that expects a pointer-to-B, we also need a subobject of type B in our C to refer it to. This is why we have the second vtable pointer. We can pass the pointer value (c + 8 bytes) to the function that expects a pointer-to-B, and it's all set: it can make calls to B::w() through the (second) vtable pointer, and access the member b through the pointer we pass in the same way as it can through any pointer-to-B.
Note that this "pointer-correction" needs to occur for called methods too. Class C inherits B::w() in this case. When w() is called through a pointer-to-C, the pointer (which becomes the this pointer inside of w()) needs to be adjusted. This is often called this pointer adjustment.
In some cases, the compiler will generate a thunk to fix up the address. Consider the same code as above but this time C overrides B's member function w():
class A {
public:
    int a;
    virtual void v();
};
class B {
public:
    int b;
    virtual void w();
};
class C : public A, public B {
public:
    int c;
    void w();
};
C's object layout and vtable now look like this:
                           +-----------------------+
                           |     0 (top_offset)    |
                           +-----------------------+
c --> +----------+         | ptr to typeinfo for C |
      |  vtable  |-------> +-----------------------+
      +----------+         |         A::v()        |
      |     a    |         +-----------------------+
      +----------+         |         C::w()        |
      |  vtable  |---+     +-----------------------+
      +----------+   |     |    -8 (top_offset)    |
      |     b    |   |     +-----------------------+
      +----------+   |     | ptr to typeinfo for C |
      |     c    |   +---> +-----------------------+
      +----------+         |    thunk to C::w()    |
                           +-----------------------+
Now, when w() is called on an instance of C through a pointer-to-B, the thunk is called. What does the thunk do? Let's disassemble it (here, with gdb):
0x0804860c <_ZThn8_N1C1wEv+0>: addl $0xfffffff8,0x4(%esp)
0x08048611 <_ZThn8_N1C1wEv+5>: jmp 0x804853c <_ZN1C1wEv>
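So it merely adjusts the this pointer (0xfffffff8 is -8 as a 32-bit two's-complement value, so the addl subtracts 8 from the this argument stored at 4(%esp)) and jumps to C::w(). All is well.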
But doesn't the above mean that B's vtable always points to this C::w() thunk? I mean, if we have a pointer-to-B that is legitimately a B (not a C), we don't want to invoke the thunk, right?
Right. The above embedded vtable for B in C is special to the B-in-C case. B's regular vtable is normal and points to B::w() directly.
The Diamond: Multiple Copies of Base Classes (non-virtual inheritance)
Okay. Now to tackle the really hard stuff. Recall the usual problem of multiple copies of base classes when forming an inheritance diamond:
class A {
public:
    int a;
    virtual void v();
};
class B : public A {
public:
    int b;
    virtual void w();
};
class C : public A {
public:
    int c;
    virtual void x();
};
class D : public B, public C {
public:
    int d;
    virtual void y();
};
Note that D inherits from both B and C, and B and C both inherit from A. This means that D has two copies of A in it. The object layout and vtable embedding are what we would expect from the previous sections:
                           +-----------------------+
                           |     0 (top_offset)    |
                           +-----------------------+
d --> +----------+         | ptr to typeinfo for D |
      |  vtable  |-------> +-----------------------+
      +----------+         |         A::v()        |
      |     a    |         +-----------------------+
      +----------+         |         B::w()        |
      |     b    |         +-----------------------+
      +----------+         |         D::y()        |
      |  vtable  |---+     +-----------------------+
      +----------+   |     |   -12 (top_offset)    |
      |     a    |   |     +-----------------------+
      +----------+   |     | ptr to typeinfo for D |
      |     c    |   +---> +-----------------------+
      +----------+         |         A::v()        |
      |     d    |         +-----------------------+
      +----------+         |         C::x()        |
                           +-----------------------+
Of course, we expect A's data (the member a) to exist twice in D's object layout (and it does), and we expect A's virtual member functions to be represented twice in the vtable (and A::v() is indeed there). Okay, nothing new here.
The Diamond: Single Copies of Virtual Bases
But what if we apply virtual inheritance? C++ virtual inheritance allows us to specify a diamond hierarchy but be guaranteed only one copy of virtually inherited bases. So let's write our code this way:
class A {
public:
    int a;
    virtual void v();
};
class B : public virtual A {
public:
    int b;
    virtual void w();
};
class C : public virtual A {
public:
    int c;
    virtual void x();
};
class D : public B, public C {
public:
    int d;
    virtual void y();
};
All of a sudden things get a lot more complicated. If we can only have one copy of A in our representation of D, then we can no longer get away with our "trick" of embedding a <