How classes are stored in memory

We only talk about non-static member variables and member functions here. Other things, such as static functions or const members, make this topic much harder to discuss; I may cover them in the future (maybe lol).

We know that at runtime an instance is stored on the stack or the heap (more precisely, most variables live on the stack). But before a class is instantiated, where is it, and how does it get onto the stack?

We first define the following class and generate the corresponding assembly code. Only the key parts of the assembly are shown here:

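#include <cstdio>
#include <string>
using std::string;

class BaseA {
public:
    // Constructor: runs after the default member initializers below.
    BaseA(){
        printf("Construction");
    }
    int size = 5;             // first data member, offset 0 in the object
    string hello = "Hello";   // second data member, placed after size
    // Defined inside the class body, so it is implicitly inline.
    void Function1() {
        printf("This is Function 1");
    }
};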

	.section	.rodata
.LC0:
	.string	"Hello"
.LC1:
	.string	"Construction"
	.section	.text._ZN5BaseAC2Ev,"axG",@progbits,_ZN5BaseAC5Ev,comdat
	.align 2
	.weak	_ZN5BaseAC2Ev
	.type	_ZN5BaseAC2Ev, @function
_ZN5BaseAC2Ev:
.LFB1732:
	[...]
	subq	$40, %rsp
	[...]
	movq	%rdi, -40(%rbp)
	movq	-40(%rbp), %rax
	movl	$5, (%rax)
	movq	-40(%rbp), %rax
	leaq	8(%rax), %rbx
	leaq	-17(%rbp), %rax
	movq	%rax, %rdi
	call	_ZNSaIcEC1Ev
	leaq	-17(%rbp), %rax
	movq	%rax, %rdx
	movl	$.LC0, %esi
	movq	%rbx, %rdi
	[...]
	call	printf
	[...]
	.section	.rodata
.LC2:
	.string	"hello"
	.section	.text._ZN5BaseA9Function1Ev,"axG",@progbits,_ZN5BaseA9Function1Ev,comdat
	.align 2
	.weak	_ZN5BaseA9Function1Ev
	.type	_ZN5BaseA9Function1Ev, @function
_ZN5BaseA9Function1Ev:
.LFB1734:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register 6
	subq	$16, %rsp
	movq	%rdi, -8(%rbp)
	movl	$.LC2, %edi
	movl	$0, %eax
	call	printf

Stack frame layout on x86-64 - Eli Bendersky’s website

The function _ZN5BaseAC2Ev is the constructor of BaseA (the mangled name stands for BaseA::BaseA()). What it does is:

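Reserve 40 bytes on the stack with subq $40, %rsp. This is scratch space for the constructor's own locals (the saved this pointer at -40(%rbp) and a temporary allocator at -17(%rbp)), not storage for the object itself: the object has already been allocated by the caller, and its address arrives in %rdi. The object is laid out as a 4-byte int at offset 0, padding up to offset 8, and then the string (32 bytes on this toolchain), which also adds up to 40 bytes, but that match is a coincidence.
From the diagram above we know that %rsp (stack pointer) is always at a lower address than %rbp (frame pointer), because on x86-64 the stack grows from higher addresses toward lower ones. Locals are therefore addressed at negative offsets from %rbp, and -40(%rbp) lies near the bottom of the frame. It would be exactly (%rsp) if nothing else had been pushed after %rbp was set up; the elided prologue presumably saves %rbx, which the function uses, and that moves %rsp down by another 8 bytes.
We can directly see the member variables being initialized. movl $5, (%rax) stores 5 into size at offset 0 of the object, and leaq 8(%rax), %rbx computes the address of hello at offset 8. The call to _ZNSaIcEC1Ev constructs a temporary std::allocator<char> at -17(%rbp). The string constructor (elided) is then called with the address of hello in %rdi, the address of the "Hello" literal (.LC0) in %esi, and the address of the allocator in %rdx.
The last call shown is printf, which prints "Construction".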

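The second part is the member function of this class, _ZN5BaseA9Function1Ev (that is, BaseA::Function1()). It receives the this pointer in %rdi and saves it at -8(%rbp), then loads the address of its string literal into %edi, clears %eax (required before calling a variadic function when no floating-point arguments are passed), and calls printf. Note that .LC2 in the listing reads "hello" rather than "This is Function 1", so the assembly appears to come from a slightly different revision of the source; the structure is the same.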

In conclusion, a class itself is stored as a set of functions (essentially its constructors and member functions) in the read-only code segment (.text), with its string literals in .rodata. No memory is set aside for its member variables until we instantiate it. At that point the storage is provided on the stack or the heap, and the constructor fills it in.

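This raises a question: what if we want to make sure the implementation of a member function cannot be silently replaced by a definition from another file? As the .weak directives in the listing show, member functions defined inside the class body are implicitly inline and are emitted as weak symbols. Ordinary non-inline functions are not declared weak by default, so we can define the member function outside the class body, the same way we define an ordinary function: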

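class test{
public:
    void AddData();   // declaration only; the definition is outside the class
    int number = 0;   // member updated by AddData()
};

// Defined outside the class body, so it is not implicitly inline
// and is emitted as a strong (non-weak) symbol.
void test::AddData() {
    number = number + 20;
}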

Another question: compilers differ in their optimization strategies, but most of them will not emit code for an inline member function that is never used in the current file. What if my member function is unused in the file that defines it but is needed in another file? The answer is the same as for the previous question: define it outside the class body. The compiler does not discard a non-inline function with external linkage even if nothing in the current file calls it, because it may be called from other files. But be aware of the multiple-definition error for strong symbols and the link-order issue when several weak definitions exist. See Strong and weak Symbol.