How classes are stored in memory
We only talk about member variables and member functions here. Other things, such as static functions or const members, make this topic much harder to discuss; I'll cover them in the future (maybe lol).
We know that at runtime an instance lives on the stack or on the heap (most local variables are on the stack), but where is a class before it is instantiated, and how does it get onto the stack?
We first define such a class and generate the corresponding assembly code. The code here only shows the key parts:
#include <cstdio>
#include <string>
using std::string;

class BaseA {
public:
    BaseA() {
        printf("Construction");
    }
    int size = 5;
    string hello = "Hello";
    void Function1() {
        printf("This is Function 1");
    }
};
.section .rodata
.LC0:
.string "Hello"
.LC1:
.string "Construction"
.section .text._ZN5BaseAC2Ev,"axG",@progbits,_ZN5BaseAC5Ev,comdat
.align 2
.weak _ZN5BaseAC2Ev
.type _ZN5BaseAC2Ev, @function
_ZN5BaseAC2Ev:
.LFB1732:
[...]
subq $40, %rsp
[...]
movq %rdi, -40(%rbp)
movq -40(%rbp), %rax
movl $5, (%rax)
movq -40(%rbp), %rax
leaq 8(%rax), %rbx
leaq -17(%rbp), %rax
movq %rax, %rdi
call _ZNSaIcEC1Ev
leaq -17(%rbp), %rax
movq %rax, %rdx
movl $.LC0, %esi
movq %rbx, %rdi
[...]
call printf
[...]
.section .rodata
.LC2:
.string "This is Function 1"
.section .text._ZN5BaseA9Function1Ev,"axG",@progbits,_ZN5BaseA9Function1Ev,comdat
.align 2
.weak _ZN5BaseA9Function1Ev
.type _ZN5BaseA9Function1Ev, @function
_ZN5BaseA9Function1Ev:
.LFB1734:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movq %rdi, -8(%rbp)
movl $.LC2, %edi
movl $0, %eax
call printf
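It is obvious that function _ZN5BaseAC2Ev is the constructor of BaseA. What it does is:
- Reserve 40 bytes on the stack for local variables with subq $40, %rsp. This is the constructor's own stack frame (the saved this pointer at -40(%rbp) and a temporary at -17(%rbp)), not the storage of the object: the object's memory is provided by whoever creates the instance, and its address arrives in %rdi. Inside the object, sizeof(int) is 4, but hello starts at offset 8 (see leaq 8(%rax), %rbx), so size effectively takes 8 bytes including padding, and sizeof(string) is 32 in this build.
- From the image above we know that rsp (stack pointer) is always lower than rbp (frame pointer), because the stack grows from higher addresses to lower addresses. So, provided nothing else is pushed in the elided lines, the address -40(%rbp) is the same as (%rsp), and 40(%rsp) is the same as (%rbp).
- We can directly see 5 being assigned to the member variable: movl $5, (%rax) stores it into size, the first member of the object. After that, the code takes the address of hello with leaq 8(%rax), %rbx, constructs a temporary allocator at -17(%rbp) by calling _ZNSaIcEC1Ev (the constructor of std::allocator<char>), and sets up the arguments for the string constructor: the address of hello in %rdi, the literal .LC0 ("Hello") in %esi, and the allocator in %rdx. So -17(%rbp) holds the temporary allocator, not the string value.
- The last line calls the printf function to print "Construction".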
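The offsets discussed above can be checked from C++ itself. The following is a minimal sketch, not the compiler's own view of the class: OffsetOf and PrintLayout are hypothetical helper names introduced here, and the constructor body from the original class is left out. The printed numbers depend on the compiler, standard library, and ABI; on x86-64 Linux with GCC one would typically expect sizeof(int) to be 4 and hello to start at offset 8, matching leaq 8(%rax), %rbx.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Same data members as BaseA in the post; the constructor body is omitted.
class BaseA {
public:
    int size = 5;
    std::string hello = "Hello";
};

// Hypothetical helper: byte offset of a member inside an object,
// measured at run time from the two addresses.
template <typename Object, typename Member>
std::ptrdiff_t OffsetOf(const Object& object, const Member& member) {
    return reinterpret_cast<const char*>(&member) -
           reinterpret_cast<const char*>(&object);
}

// Hypothetical helper: prints sizes and offsets.
// The concrete values are platform dependent.
void PrintLayout() {
    BaseA a;
    // size is the first member, so it sits at the start of the object
    // on the mainstream ABIs.
    assert(OffsetOf(a, a.size) == 0);
    std::printf("sizeof(int)         = %zu\n", sizeof(int));
    std::printf("sizeof(std::string) = %zu\n", sizeof(std::string));
    std::printf("sizeof(BaseA)       = %zu\n", sizeof(BaseA));
    std::printf("offset of size      = %td\n", OffsetOf(a, a.size));
    std::printf("offset of hello     = %td\n", OffsetOf(a, a.hello));
}
```

Calling PrintLayout() from main prints the layout for the current platform. The gap between sizeof(int) and the offset of hello is the padding that makes size look like it takes 8 bytes.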
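The second part, _ZN5BaseA9Function1Ev, is the member function Function1 of this class. Like the constructor, it receives the this pointer in %rdi, and then it simply calls printf.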
In conclusion, a class is stored as a set of functions (basically the constructor and the member functions) in the read-only code segment. Its member variables take up no memory until we instantiate it; the storage then comes from the stack or the heap, and the constructor is where the members get their initial values.
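A small experiment makes this conclusion concrete. This is a sketch under stated assumptions: DataOnly, WithFunctions, g_constructed, and CreateTwoObjects are hypothetical names introduced here. It compares a type that has only data with one that has the same data plus a constructor and a non-virtual member function, and it creates one object on the stack and one on the heap.

```cpp
#include <cassert>
#include <cstdio>
#include <memory>
#include <string>

// Counts constructor calls (hypothetical, for illustration only).
int g_constructed = 0;

// Data members only.
struct DataOnly {
    int size = 5;
    std::string hello = "Hello";
};

// Same data members plus a constructor and a non-virtual member function.
class WithFunctions {
public:
    WithFunctions() { ++g_constructed; }
    void Function1() const { std::printf("This is Function 1\n"); }

    int size = 5;
    std::string hello = "Hello";
};

// Creates one object on the stack and one on the heap and returns
// how many times the constructor ran while doing so.
int CreateTwoObjects() {
    const int before = g_constructed;
    WithFunctions on_stack;                                   // storage in this stack frame
    std::unique_ptr<WithFunctions> on_heap(new WithFunctions);  // storage from operator new
    assert(on_stack.size == 5 && on_heap->hello == "Hello");
    on_stack.Function1();  // both calls run the same code; only `this` differs
    on_heap->Function1();
    return g_constructed - before;
}
```

With the mainstream compilers, sizeof(DataOnly) equals sizeof(WithFunctions), because non-virtual member functions live in the code segment and add nothing to each object. The constructor runs once per instance, wherever the storage comes from.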
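Here comes the problem: what if we want to prevent the implementation of a member function from being replaced by a definition in another file? A member function defined inside the class body is implicitly inline, which is why it is emitted as a weak symbol in the listing above. An ordinary non-inline function is not declared weak by default, so we can define the member function outside the class body, the same way we define an ordinary function: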
class test {
public:
    void AddData();

private:
    int number = 0;
};

void test::AddData() {
    number = number + 20;
}
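To see both kinds of definition side by side, here is a minimal sketch. GetNumber and AddDataTwice are hypothetical additions that are not in the original class; the comments describe what GCC on an ELF target typically does, which is an assumption about the toolchain rather than something the C++ standard requires.

```cpp
#include <cassert>

class test {
public:
    // Defined inside the class body: implicitly inline.
    // Typically emitted as a weak symbol, and only if this file uses it.
    int GetNumber() const { return number; }

    // Declared here, defined below as an ordinary function.
    void AddData();

private:
    int number = 0;
};

// Out-of-class, non-inline definition: typically emitted as a
// strong (global) symbol in this translation unit.
void test::AddData() {
    number = number + 20;
}

// Hypothetical helper that exercises both functions.
int AddDataTwice() {
    test t;
    assert(t.GetNumber() == 0);
    t.AddData();
    t.AddData();
    return t.GetNumber();
}
```

Compiling this file with g++ -c and inspecting the object file with nm -C should, on a typical Linux toolchain, list test::AddData() with type T and test::GetNumber() const with type W.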
Another problem: different compilers have different optimization strategies, but most of them will drop a member function that is defined in the class body and never used in that file. What if my member function is actually needed by another file, but the compiler dropped it? The answer is the same as for the previous question: define it outside the class body. The compiler will not discard such a definition, because it has external linkage and might be called from other files. But be aware of the multiple-definition problem for strong symbols and the link-order problem when there are several weak symbols. See Strong and weak Symbol.