Understanding Memory Alignment
#include <iostream>

class Practice
{
    int num;
    char c;
};

int main(int argc, char const *argv[])
{
    std::cout << sizeof(Practice) << std::endl;
    return 0;
}
On most platforms an int is 4 bytes and a char takes a single byte. So, would the output be 4 + 1 = 5?
./test.exe
> 8
What happened
On a 64-bit system, the unit (or granularity) of a memory access is 8 bytes, which means that when the CPU fetches data from memory, it reads a 64-bit block at once. That’s a fixed amount of work that we cannot reduce at the software level. But we can still try to get the best performance by minimizing the number of memory accesses. In order to be efficient, we need to understand what would be considered wasteful:
Say the task is to read an int (4 bytes) located at 0x06, so it occupies 0x06~0x09. Here’s what a CPU with 8-byte granularity needs to do:
- reads 0x00~0x07
- reads 0x08~0x0F
- shifts left to get rid of the unneeded 0x00~0x05
- shifts right to get rid of the unneeded 0x0A~0x0F
- merges the results into the register
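The steps above can be sketched in software. This is only a simulation, not how any particular CPU implements it: `load_aligned_word` and `load_u32` are hypothetical helpers, memory is modeled as a plain byte array, and the bytes are assembled in little-endian order (so the shift directions here are the little-endian ones and would flip for the opposite byte order). It assumes C++14 or later because of the loops in `constexpr` functions.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: fetch the aligned 8-byte word that starts at `address`
// (assumed to be a multiple of 8). Bytes are assembled in little-endian order
// so the result does not depend on the host CPU.
constexpr std::uint64_t load_aligned_word(const unsigned char* memory, std::size_t address)
{
    std::uint64_t word = 0;
    for (std::size_t i = 0; i < 8; ++i) {
        word |= static_cast<std::uint64_t>(memory[address + i]) << (8 * i);
    }
    return word;
}

// Hypothetical helper: read a 4-byte little-endian value from any address
// using only aligned 8-byte loads.
constexpr std::uint32_t load_u32(const unsigned char* memory, std::size_t address)
{
    const std::size_t offset = address % 8;     // position inside the word
    const std::size_t base = address - offset;  // start of the containing word
    const std::uint64_t first = load_aligned_word(memory, base);
    if (offset <= 4) {
        // The value fits inside one word: one load and one shift.
        return static_cast<std::uint32_t>(first >> (8 * offset));
    }
    // The value straddles two words: a second load, two shifts, and a merge.
    const std::uint64_t second = load_aligned_word(memory, base + 8);
    const std::uint64_t low_part = first >> (8 * offset);
    const std::uint64_t high_part = second << (8 * (8 - offset));
    return static_cast<std::uint32_t>(low_part | high_part);
}
```

With a 16-byte buffer in which the byte at address i holds the value i, `load_u32(memory, 6)` takes the two-load path and returns 0x09080706, while `load_u32(memory, 4)` gets by with a single load and returns 0x07060504.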
That’s a lot of work just to read an int. In fact, we’re lucky that modern CPUs support this at all. Back in the day, an operation like this would simply crash the program with an unaligned access exception.
Now that we know the bad, let’s look at how to prevent it. Well, lucky for us, modern compilers handle alignment for us by adding extra unused space (padding) wherever it’s needed.
So, back to the first example: padding was added after the char so that the size of the struct is a multiple of 4 bytes. For structs, the alignment is determined by the member with the strictest alignment requirement, which is usually the largest primitive member (the int, 4 bytes, in our case):
std::cout << alignof(Practice) << std::endl;
> 4
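To see where the padding actually sits, here is a minimal sketch using `offsetof`. `PracticeLayout` and `trailing_padding` are hypothetical names: the struct mirrors Practice but is declared as a struct so its members are public and can be inspected. The first assertion is what mainstream compilers are expected to do rather than a hard guarantee; the second one always holds.

```cpp
#include <cstddef>

// Hypothetical stand-in for Practice, written as a struct so that its
// members are public and offsetof can inspect them.
struct PracticeLayout
{
    int num;  // starts at offset 0
    char c;   // expected to start right after num
    // the compiler appends padding here so that sizeof is a multiple of alignof
};

// Hypothetical helper: number of padding bytes after the last member.
constexpr std::size_t trailing_padding()
{
    return sizeof(PracticeLayout) - (offsetof(PracticeLayout, c) + sizeof(char));
}

// Expected on mainstream compilers: a char needs no alignment, so no gap before it.
static_assert(offsetof(PracticeLayout, c) == sizeof(int), "c directly follows num");

// Always true: array elements must stay aligned, so size is a multiple of alignment.
static_assert(sizeof(PracticeLayout) % alignof(PracticeLayout) == 0,
              "size is a multiple of alignment");
```

With a 4-byte int, that works out to 4 bytes for num, 1 byte for c, and 3 bytes of trailing padding, which is where the 8 from the first example comes from.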
It’s also possible to override the compiler’s default behavior and manually specify an alignment value for special requirements:
#include <iostream>

#pragma pack(2) // supported by MSVC, GCC, and Clang; other compilers may use a different syntax

class Practice
{
    int num;
    char c;
};

int main(int argc, char const *argv[])
{
    std::cout << sizeof(Practice) << std::endl;
    std::cout << alignof(Practice) << std::endl;
    return 0;
}
> 6
> 2
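One thing to watch out for: a bare #pragma pack stays in effect for every declaration that follows it. Here is a hedged sketch of two variations, assuming a compiler that honors the push/pop form of the pragma (MSVC, GCC, and Clang do). `PackedPractice` and `WidePractice` are hypothetical names. The first struct limits the packing to a single declaration, and the second uses the standard `alignas` specifier, which can only raise the alignment, never lower it.

```cpp
// Assumption: the compiler honors #pragma pack(push/pop).
#pragma pack(push, 1)  // save the current packing, then pack to 1 byte
struct PackedPractice  // hypothetical name
{
    int num;
    char c;
};
#pragma pack(pop)      // restore the previous packing for later declarations

// Standard C++11 option for the opposite direction: request a stricter
// alignment than the members need.
struct alignas(16) WidePractice  // hypothetical name
{
    int num;
    char c;
};

static_assert(alignof(PackedPractice) == 1, "packing to 1 byte removes the padding");
static_assert(alignof(WidePractice) == 16, "alignas raises the alignment");
static_assert(sizeof(WidePractice) % 16 == 0, "size is padded to a multiple of 16");
```

Packed structs save space, but their members may end up unaligned, which brings back exactly the extra work described earlier, so it’s a trade-off worth measuring.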