Note: This post is x86-centric. On other architectures your mileage may vary.
Compiling with the -Wpadded flag on GCC/Clang, or -we4820 -we4121 on MSVC, warns you when the compiler inserts padding in your structs. These warnings are off by default because in most circumstances automatic padding is quite convenient, but if you are writing high-performance code or targeting embedded systems, padding explicitly or avoiding padding altogether may be the better choice.
Why compiler padding is a thing
Compilers insert padding to keep data structures aligned, thus avoiding misaligned reads and making memory access faster.
The rules for when compilers insert padding depend on the target architecture’s word size and on the size of each member field in relation to the member that follows it, as explained in “Typical alignment of C structs on x86”.
If the structure contains members with explicit alignment (i.e. types declared with __attribute__((aligned(n))) on GCC/Clang or __declspec(align(n)) on MSVC), then the compiler will take that into consideration when padding.
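As a rough illustration of both points, the sketch below checks natural and forced alignment at compile time. It assumes x86_64 and a C11 compiler; SKETCH_ALIGN, PlainVec4 and AlignedVec4 are hypothetical names made up for this example.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdalign.h> /* alignof (C11) */
#include <stdint.h>

/* Hypothetical helper wrapping the two compiler-specific spellings. */
#if defined(_MSC_VER)
#define SKETCH_ALIGN(n) __declspec(align(n))
#else
#define SKETCH_ALIGN(n) __attribute__((aligned(n)))
#endif

/* Natural alignment on x86_64: fixed-width integers align to their size. */
static_assert(alignof(uint8_t)  == 1, "uint8_t can start at any address");
static_assert(alignof(uint16_t) == 2, "uint16_t starts on even addresses");
static_assert(alignof(uint32_t) == 4, "uint32_t starts on multiples of 4");
static_assert(alignof(uint64_t) == 8, "uint64_t starts on multiples of 8");

/* A struct takes the alignment of its most demanding member... */
typedef struct PlainVec4 { float V[4]; } PlainVec4;
static_assert(alignof(PlainVec4) == 4, "inherits the alignment of float");

/* ...unless an explicit attribute raises it. */
typedef struct SKETCH_ALIGN(16) AlignedVec4 { float V[4]; } AlignedVec4;
static_assert(alignof(AlignedVec4) == 16, "forced to 16-byte boundaries");
static_assert(sizeof(AlignedVec4) == 16, "size is unchanged here");
```

If any of the assumptions does not hold on your target, the build fails at the offending static_assert, which is a cheap way to document layout expectations.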
Examples of automatic padding
Take the following structure:
It is compiled on x86_64 into an equivalent of the following:
The member field _COMPILER_PADDING is not really there, but it is as if it were, because the compiler inserted 3 bytes of padding immediately before Y. If you compile the struct yourself and check sizeof(Paddee) you will see that it is 8 bytes, not 5.
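To check this yourself, here is a minimal sketch that assumes a one-byte member followed by a four-byte int. The member name X and the PaddeeExplicit type are placeholders chosen for this sketch, and the sizes assume x86_64.

```c
#include <assert.h> /* static_assert (C11) */
#include <stddef.h> /* offsetof */

typedef struct Paddee {
    char X; /* 1 byte (hypothetical member name) */
    int  Y; /* 4 bytes, must start on a 4-byte boundary */
} Paddee;

/* What the compiler effectively lays out for Paddee. */
typedef struct PaddeeExplicit {
    char X;
    char _COMPILER_PADDING[3]; /* written out by hand here */
    int  Y;
} PaddeeExplicit;

static_assert(offsetof(Paddee, Y) == 4, "Y is pushed to offset 4");
static_assert(sizeof(Paddee) == 8, "8 bytes, not 5");
static_assert(sizeof(PaddeeExplicit) == sizeof(Paddee), "same size");
static_assert(offsetof(PaddeeExplicit, Y) == offsetof(Paddee, Y),
              "same position for Y");
```

Compiling the first struct with -Wpadded should report the hole, while the explicit version should compile without the warning.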
Padding is also added at the tail end of the struct, so if you rearrange the members the compiler will pad like so:
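A minimal sketch of that layout, assuming the same two members in reverse order (X is again a placeholder name, and the sizes assume x86_64):

```c
#include <assert.h> /* static_assert (C11) */
#include <stddef.h> /* offsetof */

typedef struct PaddeeSwapped {
    int  Y; /* offset 0 */
    char X; /* offset 4 */
    /* the compiler appends 3 bytes here so sizeof is a multiple of 4 */
} PaddeeSwapped;

static_assert(offsetof(PaddeeSwapped, X) == 4, "no hole between members");
static_assert(sizeof(PaddeeSwapped) == 8, "tail padding still costs 3 bytes");

/* Tail padding exists so that every element of an array stays aligned. */
static_assert(sizeof(PaddeeSwapped[4]) == 32, "elements are 8 bytes apart");
```

Reordering alone does not help in this case: the total is still 8 bytes, because the struct size must be a multiple of the alignment of its most demanding member.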
For a more realistic example, in the game engine for Rival Fortress I have a 4x4 matrix type that looks something like this:
GCC_ALIGN and its MSVC counterpart expand to the compiler-specific alignment attributes. They tell the compiler that this type should always be aligned to 16-byte boundaries (in order to play nice with SSE instructions).
Because of the forced alignment constraint, every other type that includes MPEMatrix4 as a member will inherit its alignment requirement. For example, the following dummy type:
Requires 12 bytes of padding(!) in order to be properly aligned. This translates into a lot of wasted memory bandwidth when dealing with arrays with thousands of entries, such as the entities in a game.
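The inherited requirement can be checked at compile time. The sketch below uses stand-in types rather than the engine's actual definitions: MatrixSketch, ExampleSketch and SKETCH_ALIGN are hypothetical, and the layout assumes a 4-byte flags field placed ahead of a 16-byte-aligned 4x4 float matrix.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdalign.h> /* alignof (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical helper wrapping the two compiler-specific spellings. */
#if defined(_MSC_VER)
#define SKETCH_ALIGN(n) __declspec(align(n))
#else
#define SKETCH_ALIGN(n) __attribute__((aligned(n)))
#endif

/* Stand-in for a 16-byte-aligned 4x4 matrix type. */
typedef struct SKETCH_ALIGN(16) MatrixSketch {
    float M[4][4]; /* 64 bytes */
} MatrixSketch;

/* Stand-in for a type that embeds the matrix after a small field. */
typedef struct ExampleSketch {
    uint32_t     Flags;     /* offset 0, 4 bytes */
    MatrixSketch Transform; /* must start on a 16-byte boundary */
} ExampleSketch;

static_assert(alignof(ExampleSketch) == 16, "alignment is inherited");
static_assert(offsetof(ExampleSketch, Transform) == 16,
              "12 bytes of padding follow Flags");
static_assert(sizeof(ExampleSketch) == 80, "4 + 12 + 64 bytes");
```

With these stand-ins, 12 of every 80 bytes in an array of ExampleSketch carry no data.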
The solution, in an extreme case like this, is to either rethink MPEExample and move the Flags field somewhere else (maybe a parallel array that you loop through before or after), or fill the 12 empty bytes with useful data in order to eliminate the wasted memory.
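Both options can be sketched with stand-in types. Everything named here (MatrixSketch, FilledSketch, EntitiesSketch, the Extra field and ENTITY_COUNT) is hypothetical and only meant to show the shape of each approach, again assuming a 4-byte flags field and a 16-byte-aligned 4x4 float matrix.

```c
#include <assert.h> /* static_assert (C11) */
#include <stddef.h> /* offsetof */
#include <stdint.h>

#if defined(_MSC_VER)
#define SKETCH_ALIGN(n) __declspec(align(n))
#else
#define SKETCH_ALIGN(n) __attribute__((aligned(n)))
#endif

typedef struct SKETCH_ALIGN(16) MatrixSketch {
    float M[4][4]; /* 64 bytes */
} MatrixSketch;

/* Option 1: occupy the former hole with data you need anyway. */
typedef struct FilledSketch {
    uint32_t     Flags;
    uint32_t     Extra[3]; /* hypothetical payload, exactly 12 bytes */
    MatrixSketch Transform;
} FilledSketch;

static_assert(offsetof(FilledSketch, Transform) == 16, "no hole left");
static_assert(sizeof(FilledSketch) == 80, "same size, nothing wasted");

/* Option 2: parallel arrays, so the flags never sit next to a matrix. */
enum { ENTITY_COUNT = 1024 };

typedef struct EntitiesSketch {
    MatrixSketch Transforms[ENTITY_COUNT];
    uint32_t     Flags[ENTITY_COUNT];
} EntitiesSketch;

static_assert(sizeof(EntitiesSketch) == ENTITY_COUNT * (64 + 4),
              "68 bytes per entity instead of 80");
```

The parallel-array layout also lets a loop over Flags touch only the flag data, at the cost of indexing two arrays when both fields are needed together.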
Tips for eliminating compiler padding
You avoid automatic padding by making the compiler happy and aligning your structures optimally. The -Wpadded compiler flag is your guide in knowing when a structure needs better alignment.
The common ways to align structures manually are:
- Rearrange fields: reorder the fields to maximize packing. I don’t know about GCC and MSVC, but Clang warns you about misaligned fields, so you know where the problem is.
- Group small types: grouping ints after larger types, like pointers, can lead to better alignment.
- Use smaller/bigger types: when possible choose different primitive types, like a uint16_t instead of an int, or a size_t instead of a uint32_t. You can then tie this back to the previous tip about grouping small types.
- Insert dummy fields: the last resort is to insert fields in your structs between members that require alignment or at the end of the struct. This is what the compiler does, but by doing it yourself you have a reminder that you can act on when you modify the struct. I usually add a byte array named _PADDING of the size required to reach alignment.
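The tips above can be sketched on a small made-up struct. The type and member names are hypothetical, and the offsets in the comments assume x86_64.

```c
#include <assert.h> /* static_assert (C11) */
#include <stddef.h> /* offsetof */
#include <stdint.h>

/* Declaration order forces two holes. */
typedef struct Unordered {
    uint8_t  A; /* offset 0, then 7 bytes of padding */
    uint64_t B; /* offset 8 */
    uint8_t  C; /* offset 16, then 3 bytes of padding */
    uint32_t D; /* offset 20 */
} Unordered;

static_assert(sizeof(Unordered) == 24, "14 bytes of data, 10 of padding");

/* Largest members first, small ones grouped, remainder made explicit. */
typedef struct Reordered {
    uint64_t B;           /* offset 0 */
    uint32_t D;           /* offset 8 */
    uint8_t  A;           /* offset 12 */
    uint8_t  C;           /* offset 13 */
    uint8_t  _PADDING[2]; /* offset 14, a reminder that 2 bytes are free */
} Reordered;

static_assert(offsetof(Reordered, _PADDING) == 14, "explicit tail padding");
static_assert(sizeof(Reordered) == 16, "a third smaller than Unordered");
```

If a new two-byte field is needed later, it can replace _PADDING without changing the size of the struct.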
Keep in mind that for large structures, cache line size becomes relevant, so try not to break up groups of fields that you want pulled into the same cache line when restructuring your struct.
The excellent post The Lost Art of C Structure Packing goes into detail on how to optimally pack structures in C.
What to do when you can’t align, but don’t want padding
Padding is not always a good thing. For example when serializing data types to disk or over the network, it is often better to keep structures tightly packed even if it causes unaligned memory reads.
To selectively disable padding you can use the #pragma pack directive like so:
This will disable compiler padding and keep sizeof(MPEExample) equal to the sum of the sizes of its members (in this case 20 bytes, instead of 32).
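A minimal sketch of the directive, using a hypothetical wire header instead of MPEExample so that no forced alignment is involved (how #pragma pack interacts with explicitly aligned member types can differ between compilers). HeaderPadded and HeaderPacked are made-up names.

```c
#include <assert.h> /* static_assert (C11) */
#include <stddef.h> /* offsetof */
#include <stdint.h>

/* Default layout: the compiler pads for alignment. */
typedef struct HeaderPadded {
    uint8_t  Type;     /* offset 0, then 3 bytes of padding */
    uint32_t Length;   /* offset 4 */
    uint16_t Checksum; /* offset 8, then 2 bytes of tail padding */
} HeaderPadded;

/* Save the current packing, then pack members on 1-byte boundaries. */
#pragma pack(push, 1)
typedef struct HeaderPacked {
    uint8_t  Type;     /* offset 0 */
    uint32_t Length;   /* offset 1, unaligned */
    uint16_t Checksum; /* offset 5 */
} HeaderPacked;
#pragma pack(pop) /* restore the previous packing for everything below */

static_assert(sizeof(HeaderPadded) == 12, "7 bytes of data, 5 of padding");
static_assert(sizeof(HeaderPacked) == 7, "exactly the sum of the members");
static_assert(offsetof(HeaderPacked, Length) == 1, "no hole after Type");
```

The push/pop pair limits the effect to the structs declared in between, so the rest of the translation unit keeps its normal alignment.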