Fantastic Variables and Where to Find Them

To understand the scope and behavior of each kind of variable in C++, we need to look at the assembly the compiler generates and find out where each variable is stored, since assembly language is so close to machine language.

c - Why are instructions addresses on the top of the memory user space  contrary to the linux process memory layout? - Stack Overflow

Local Variables

int main() {
    int i = 123456;
    return 0;
}

Without a doubt, this kind of variable is stored on the stack, as the assembly shows:

	movl	$123456, -4(%rbp)

Here rbp is the frame pointer, which points to the base of the current stack frame. By the way, rsp points to the top of the current stack frame, and it holds a lower address than rbp because the stack starts at a high memory address and grows downwards.
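
To see the frames from the C++ side, here is a small sketch that records the address of one local in a caller and one in a callee. The names FrameAddresses, callee_local_address and probe_frames are made up for this illustration. The only thing the language guarantees is that the two addresses differ; which one is lower depends on the platform and on optimization.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: the address of one local in each frame.
struct FrameAddresses {
    std::uintptr_t caller;
    std::uintptr_t callee;
};

// Returns the address of a local that lives in this function's own frame.
std::uintptr_t callee_local_address() {
    volatile int callee_local = 0;  // volatile keeps the variable in memory
    return reinterpret_cast<std::uintptr_t>(&callee_local);
}

// Records where a caller local and a callee local were placed.
FrameAddresses probe_frames() {
    volatile int caller_local = 0;
    FrameAddresses result;
    result.caller = reinterpret_cast<std::uintptr_t>(&caller_local);
    result.callee = callee_local_address();
    caller_local = 1;  // still alive here, so both locals existed at once
    assert(result.caller != result.callee);
    return result;
}
```

On x86-64 Linux with an unoptimized GCC build I would expect result.callee to be the smaller number, matching the downward-growing stack described above, but treat that as an assumption about one toolchain rather than a rule.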

Const Local Variables

Generally, this kind of variable is stored on the stack, but different ways of defining it lead to different behavior in the assembly:

  1. Defined by a constant number or a const variable:

    const int i = 10;
    // or
    const int i = 10;
    const int j = i;
    

    The compiler reserves a chunk of memory on the stack, but since it is sure about the value of the const variable, it does not read that memory; it simply replaces every use of the variable with the actual number. We can overwrite the data in memory (formally undefined behavior in C++), but since the compiler never reads it back, the write has no visible effect.

    	movl	$1234567, -4(%rbp)
    	movl	$1234567, %esi
    	movl	$.LC0, %edi
    	movl	$0, %eax
    	call	printf
    

    Here the compiler reserves -4(%rbp) for the variable (initialized to 1234567 in this listing), but when calling printf it loads the literal value directly into the register %esi.

  2. Defined by a variable which is not const:

    int a = 10;
    const int i = a;
    

    Because the compiler does not treat the value of a as known, the value of i is unknown too. So every time the program needs the value of i, it has to read the memory. That raises a question: can we change the value of i? The answer is no and yes.

    • We cannot change the value of i directly by assigning to it; the compiler makes sure such an illegal operation never appears in the code.
    • We can change it by writing into the memory of i. To reach that memory, we create a pointer to the address &i (we might need to cast the type of &i), and after that we can overwrite the value of i. Note that the C++ standard treats writing to a const object as undefined behavior, so this only describes what the generated code happens to do.
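
The difference between the two cases can also be checked without touching assembly, and without the undefined write. In this sketch (function names are mine, not from any library) the first const is usable wherever the language demands a compile-time constant, while the second is only a read-only run-time value.

```cpp
#include <cassert>

// Case 1: initialized from a literal, or from another such const.
int folded_array_length() {
    const int i = 10;
    const int j = i;
    static_assert(j == 10, "j is known at compile time");
    int buffer[j] = {};  // legal: j is a constant expression
    return static_cast<int>(sizeof(buffer) / sizeof(buffer[0]));
}

// Case 2: initialized from a non-const variable.
int runtime_const(int a) {
    const int i = a;  // read-only, but the value is only known at run time
    // static_assert(i == 10, "");  // would not compile: i is not a constant expression
    assert(i == a);
    return i;
}
```

The commented-out static_assert is the interesting line: the compiler rejects it for exactly the reason given above, it cannot be sure about the value of i.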

Member Variables

A member variable is stored on the stack or the heap, depending on how the programmer creates the object that contains it.
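
A rough way to convince yourself that a member simply lives wherever its object lives: check that the member's address falls inside the object's own byte range, for objects created in different ways. Point, member_is_inside and check_all_storage_kinds are hypothetical names for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

struct Point {
    int x;
    int y;
};

// True when the member y is located inside the bytes occupied by p.
bool member_is_inside(const Point& p) {
    const std::uintptr_t begin = reinterpret_cast<std::uintptr_t>(&p);
    const std::uintptr_t member = reinterpret_cast<std::uintptr_t>(&p.y);
    return begin <= member && member < begin + sizeof(Point);
}

Point static_point = {1, 2};  // object with static storage

bool check_all_storage_kinds() {
    Point stack_point = {3, 4};                          // object on the stack
    std::unique_ptr<Point> heap_point(new Point{5, 6});  // object on the heap
    assert(heap_point->y == 6);
    return member_is_inside(static_point) &&
           member_is_inside(stack_point) &&
           member_is_inside(*heap_point);
}
```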

Const Member Variables

No matter how it is declared, a const member variable always holds a chunk of memory, and that memory is read every time the variable is accessed. The reason is that its value is unknown until the class or the structure is constructed.
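
A minimal sketch of why the value cannot be folded: two instances of the same class can carry different values in the same const member. Config and its functions are invented for the example.

```cpp
#include <cassert>

struct Config {
    const int limit;  // fixed per object, chosen at construction
    explicit Config(int value) : limit(value) {}
};

// Has to load limit from the object it is given.
int read_limit(const Config& c) {
    return c.limit;
}

int sum_of_two_limits(int first, int second) {
    Config a(first);
    Config b(second);
    assert(a.limit == first && b.limit == second);
    return read_limit(a) + read_limit(b);
}
```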

Global Variables

This kind of variable is stored in the read/write segment: if it is initialized, it is stored in .data, and if it is not, it is stored in .bss. In addition, a normal global variable is labeled .globl, which makes it visible to the linker ld, so it is also available to other object files linked with it.

This symbol also explains why we cannot define global variables with the same name in multiple files.

/* Without initialization*/	
	.globl	global
	.bss
	.align 4
	.type	global, @object
	.size	global, 4
global:
	.zero	4 /* Declare 4 bytes initialized to 0 */
	
/* With initialization*/	
	.globl	global
	.data
	.align 4
	.type	global, @object
	.size	global, 4
global:
	.long	10 /* Initialized to 10 */
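
For reference, source along these lines would produce listings of that shape. The variable names are mine, and the section placement in the comments is what I assume for GCC on Linux, not something the language promises. What the language does promise is that a global without an initializer starts at zero.

```cpp
#include <cassert>

int zeroed_global;            // no initializer: starts at 0, typically placed in .bss
int initialized_global = 10;  // explicit initializer: typically placed in .data

// Reads both globals before anything has modified them.
int sum_of_initial_values() {
    assert(zeroed_global == 0);
    return zeroed_global + initialized_global;
}
```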

Const Global Variables

Basically the same behavior as const local variables, but a const global is stored in the read-only code segment, specifically the .rodata section. This means that if we try to change data inside .rodata, we get an error at runtime. There are two ways to keep it out of .rodata:

  1. Initialize it with a non-const variable.
  2. Use volatile.

Good to know: the const keyword removes the .globl label from a variable, so a variable declared const is not visible to other files (unless it is also declared extern).
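
In C++ terms this is the internal linkage rule for const variables at namespace scope. A short sketch, with made-up names; the remarks about .globl in the comments are what I would expect from the assembly, not something I am quoting from a listing.

```cpp
#include <cassert>

const int file_local_limit = 5;     // const alone: internal linkage, no .globl expected
extern const int shared_limit = 7;  // extern const: external linkage, visible to the linker

static_assert(file_local_limit == 5, "usable as a compile-time constant");

int total_limit() {
    const int total = file_local_limit + shared_limit;
    assert(total > 0);
    return total;
}
```

Another file could then declare `extern const int shared_limit;` and link against it, while file_local_limit stays private to this file.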

Volatile

The volatile keyword forces the compiler to read the value of a variable from memory every time the variable is accessed. So the compiler will not simply replace the variable with its value, and it will not place a const global variable in the .rodata segment.
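
As a sketch of the source side (sensor_value and read_twice are hypothetical names): each read of a volatile object below has to be performed as a real memory access, even though the object is also const and initialized from a literal.

```cpp
#include <cassert>

const volatile int sensor_value = 10;  // const, but every read must go to memory

int read_twice() {
    int first = sensor_value;   // first load
    int second = sensor_value;  // second load, not merged with the first
    assert(first == second);    // holds here because nothing else writes to it
    return first + second;
}
```

Comparing the assembly of this function with and without the volatile keyword is a quick way to see the two loads appear and disappear.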

Static Variables

There is not much difference between a local static variable and a ‘global’ static variable (not really global) except for the way they are named in the assembly program. Both are stored in the read/write segment without .globl, which means they can be accessed anywhere in the file but not from other files. And yes, if you return the address of a local static variable, you can access it outside the function.
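
The last point is easy to try. In this sketch (names invented for the example) the function hands out the address of its local static, and the caller writes through it from outside the function.

```cpp
#include <cassert>

// Increments a function-local static and returns its address.
int* next_id() {
    static int id = 0;  // initialized once, keeps its value between calls
    ++id;
    return &id;         // safe: the variable outlives the call
}

int demo_static_local() {
    int* first = next_id();
    int* second = next_id();
    assert(first == second);  // both calls refer to the same variable
    *first = 100;             // modify the static from outside the function
    return *next_id();        // the function sees the write, then increments
}
```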

For static member variables, all instances share the same variable, so we want to make sure it has memory before any instance is constructed. To do this, C++ requires (notice, I didn’t say the compiler requires) programmers to define static member variables outside of the class:

class testClass{
    static int variableA;
};

int testClass::variableA = 123234;

And it will be stored as:

	.globl	_ZN9testClass9variableAE
	.data
	.align 4
	.type	_ZN9testClass9variableAE, @object
	.size	_ZN9testClass9variableAE, 4
_ZN9testClass9variableAE:
	.long	123234

As shown, it is labeled .globl. Here comes the question: can we initialize it in another file, since the linker links all the files together eventually? The answer is YES. As argued in the Previous Blog, a class is just a set of .weak functions to the assembly program, so we can redefine a class in many different files, for example:

// in main.cpp
class test{
public:
    static int number;
    //[...anything...]
};

As I emphasized before, the definition of the static member isn’t enforced by the compiler, so this file compiles without problems. But when we build the whole program and number is actually used, the linker complains that number is undefined. So we need a global definition of number, and we can put it in any other file:

// any other file
class  test{
public:
    static int number;
};
// We have to declare the class and the variable again here: the compiler handles each file on its own, so without this it would not know that test::number exists and compilation would fail.

int test::number = 10;

It works based on three properties of C++:

  1. The definition of static member variables isn’t checked by the compiler.
  2. Classes are stored as a set of .weak functions in the memory structure.
  3. Static member variables are visible to the linker.
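
Squeezed into a single listing, the arrangement looks roughly like this. Counter and bump_number are hypothetical names, and the comments mark which part would sit in which file in the multi-file setup described above.

```cpp
#include <cassert>

// Class declaration: repeated in every file that uses the class.
class Counter {
public:
    static int number;  // declaration only, no storage is reserved here
};

// Definition: appears in exactly one file, this is what the linker resolves.
int Counter::number = 10;

// Code in any file: every instance refers to the same variable.
int bump_number() {
    Counter a;
    Counter b;
    a.number += 1;
    assert(&a.number == &b.number);  // one shared object
    return b.number;
}
```

If the definition line is left out, each file still compiles on its own and the error only shows up at link time, which is the behavior the three properties explain.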

Const Static Variables

As mentioned before, const removes the .globl label, so const static ‘global’ variables are exactly the same as const global variables, and const static ‘local’ variables differ only in how they are named.

For const static member variables, since the compiler is sure about the value, it allows programmers to initialize the variable inside the class, and it simply replaces every reference to the variable with the actual value. In this case, the variable is stored in the .rodata segment.
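
A small sketch of the in-class form, with invented names. Because the value is a compile-time constant, it can be used as an array bound, and no out-of-class definition is needed as long as the code only reads the value.

```cpp
#include <cassert>

struct Limits {
    static const int max_items = 64;  // initialized inside the class
};

int table[Limits::max_items];  // the constant is usable as an array bound

int table_length() {
    static_assert(Limits::max_items == 64, "value is known at compile time");
    const int length = static_cast<int>(sizeof(table) / sizeof(table[0]));
    assert(length == Limits::max_items);
    return length;
}
```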