Logical vs Memory Structure Arrangement
In a previous post I talked about how avoiding automatic structure padding can be beneficial for performance, because of the importance of cache locality in modern CPU architectures.
Today I’ll talk about why I wish C or C++ would allow us to specify a logical arrangement for struct members in order to increase code readability.
Data structure alignment
If you specify a structure or class in C/C++, the compiler will arrange its members in memory in the same order as you specified them in the struct/class definition (ignoring, for the sake of simplicity, virtual table pointers added in C++).
So, for example, the following struct will have the member X come before Y in memory.
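A minimal sketch of such a struct, assuming X is a one-byte field and Y a 32-bit integer (the member names come from the text; the types and the name `Example` are my assumptions):

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>

/* Hypothetical struct: member types are assumed, not taken from the post. */
struct Example {
    uint8_t  X;   /* 1 byte at offset 0 */
                  /* padding is typically inserted here by the compiler */
    uint32_t Y;   /* 4 bytes, usually aligned to a 4-byte boundary */
};

/* Number of padding bytes the compiler placed between X and Y. */
static size_t padding_after_x(void) {
    return offsetof(struct Example, Y)
         - (offsetof(struct Example, X) + sizeof(uint8_t));
}

/* C guarantees that the first member sits at the start of the struct. */
static_assert(offsetof(struct Example, X) == 0, "X starts the struct");
```

On common ABIs where `uint32_t` is 4-byte aligned, `padding_after_x()` should come out as 3, but the exact value is up to the platform.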
In this case, the compiler will also insert 3 bytes of padding after X in order to keep Y aligned to a word boundary.
When default arrangement sucks
Default structure arrangement can be problematic at times, especially when dealing with large structures.
For example, it is often much more readable to group fields together logically, but this can lead to wasted memory caused by padding, like in the following simple example:
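Here is a sketch of what such a logically grouped struct might look like. The `MeshLogical` name, its members and their types are all hypothetical, and the padding figures assume a typical 64-bit ABI with 8-byte pointers:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */
#include <stdint.h>

/* Hypothetical mesh description: each pointer is followed by its count. */
struct MeshLogical {
    float    *vertices;      /* pointer to the vertex array */
    uint32_t  vertex_count;  /* padding usually follows to align the next pointer */
    uint16_t *indices;       /* pointer to the index array */
    uint32_t  index_count;   /* tail padding usually follows */
};

/* Bytes the members need on their own, ignoring any padding. */
enum {
    MESH_PAYLOAD_BYTES = sizeof(float *) + sizeof(uint16_t *)
                       + 2 * sizeof(uint32_t)
};

/* Bytes lost to compiler-inserted padding in this arrangement. */
static size_t mesh_logical_wasted_bytes(void) {
    return sizeof(struct MeshLogical) - MESH_PAYLOAD_BYTES;
}

static_assert(sizeof(struct MeshLogical) >= MESH_PAYLOAD_BYTES,
              "a struct is never smaller than its members");
```

With 8-byte pointers I would expect `sizeof(struct MeshLogical)` to be 32 for 24 bytes of actual data, so a quarter of the struct is padding.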
This is a perfectly reasonable arrangement for a human: each pointer is followed by the count for the array.
Unfortunately, it also wastes memory, because of the bytes of padding introduced by the compiler. An optimal but less readable arrangement would be the following:
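A sketch of the reordered version, again with hypothetical names and types (`MeshPacked` and its members are my own placeholders), putting the widest members first:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */
#include <stdint.h>

/* Hypothetical mesh description: pointers first, then the counts. */
struct MeshPacked {
    float    *vertices;      /* pointer to the vertex array */
    uint16_t *indices;       /* pointer to the index array */
    uint32_t  vertex_count;  /* counts are now far from their arrays */
    uint32_t  index_count;
};

/* Bytes the members need on their own, ignoring any padding. */
enum {
    MESH_PACKED_PAYLOAD_BYTES = sizeof(float *) + sizeof(uint16_t *)
                              + 2 * sizeof(uint32_t)
};

/* Bytes lost to compiler-inserted padding in this arrangement. */
static size_t mesh_packed_wasted_bytes(void) {
    return sizeof(struct MeshPacked) - MESH_PACKED_PAYLOAD_BYTES;
}

static_assert(sizeof(struct MeshPacked) >= MESH_PACKED_PAYLOAD_BYTES,
              "a struct is never smaller than its members");
```

On the usual 32-bit and 64-bit ABIs the two 4-byte counts fill the space after the pointers exactly, so `mesh_packed_wasted_bytes()` should be 0.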
This arrangement wastes no bytes in padding, but is, in my opinion, less readable than the previous.
Logical arrangement as an option
A better approach for cases like this, where the order of members is not important, would be to let us write structure members in their logical order and leave the memory layout to the compiler, for example by decorating the structure like so:
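No such decoration exists in standard C or C++, so the sketch below fakes it with a hypothetical `LOGICAL_LAYOUT` macro that expands to nothing. It only shows what the syntax could look like. The struct and member names are also placeholders of mine:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical marker: a real compiler feature would go here.
 * As defined, it expands to nothing and changes no layout. */
#define LOGICAL_LAYOUT

/* Members are written in the readable, logically grouped order. */
struct LOGICAL_LAYOUT Mesh {
    float    *vertices;
    uint32_t  vertex_count;
    uint16_t *indices;
    uint32_t  index_count;
};

/* Today the declared order is still the memory order. */
static_assert(offsetof(struct Mesh, vertices) == 0,
              "the placeholder macro does not reorder members");
```

With a real implementation, the compiler would be free to move the two counts after the two pointers while the source keeps the grouped order.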
This would tell the compiler: “Hey, I don’t really care about the order in which the members of this struct are laid out, rearrange them at will”.
Obviously this shouldn’t be the default behavior as it would cause all sorts of bugs when dealing with structures that have to cross API boundaries between libraries, but it would be very useful for internal subsystems…
I think ;)