How I Use __COUNTER__ To Localize Text And Hash Strings At Compile Time
__COUNTER__ is a preprocessor macro available in most compilers that expands to sequential integers starting from zero. It resets at the beginning of each new translation unit (i.e. each object file), and is commonly used with the paste (##) operator to create unique identifiers within other macros.
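As a minimal sketch of that pattern (the macro names CONCAT, UNIQUE_NAME and STATIC_CHECK, and the three enum constants, are made up for illustration), note the extra level of macro indirection: it is what forces __COUNTER__ to expand to a number before it gets pasted.

```c
#include <assert.h>

/* Two levels of macros: the outer one expands __COUNTER__ to a number,
   the inner one pastes that number onto the prefix. */
#define CONCAT_INNER(a, b) a##b
#define CONCAT(a, b) CONCAT_INNER(a, b)
#define UNIQUE_NAME(prefix) CONCAT(prefix, __COUNTER__)

/* Pre-C11 style static check: each use declares a typedef with a fresh
   name, so several checks can share a scope. A false condition gives the
   array a negative size and the build fails. */
#define STATIC_CHECK(cond) \
    typedef char UNIQUE_NAME(static_check_)[(cond) ? 1 : -1]

STATIC_CHECK(sizeof(char) == 1);
STATIC_CHECK(sizeof(int) >= 2);

/* Every expansion of __COUNTER__ yields the next integer. */
enum { FirstId = __COUNTER__, SecondId = __COUNTER__, ThirdId = __COUNTER__ };

/* Runtime restatement of the same property, for illustration only. */
static void CheckIdsAreSequential(void)
{
    assert(SecondId == FirstId + 1);
    assert(ThirdId == SecondId + 1);
}
```

The absolute values depend on how many times __COUNTER__ was expanded earlier in the translation unit, which is why the sketch only relies on the differences between them.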
While working on the localization bits for Rival Fortress, I stumbled upon an interesting use for __COUNTER__ that has allowed me to generate fast localization lookups and string hashes that are resolved at compile time, with an API that looks like this:
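A rough sketch of what such a call site could look like, pieced together from the names used later in this post (T(X), Translate, GlobalTranslationTable). The table contents, TRANSLATION_COUNT, the T_BASE offset and the two label functions are assumptions made up for illustration, not the actual Rival Fortress code:

```c
#include <assert.h>

enum { TRANSLATION_COUNT = 2 };   /* hypothetical: number of table slots */

/* One slot per translated string, pre-filled with the default language. */
static const char *GlobalTranslationTable[TRANSLATION_COUNT] = {
    "New Game",
    "Quit",
};

static inline const char *Translate(int Counter)
{
    assert(Counter >= 0 && Counter < TRANSLATION_COUNT);
    return GlobalTranslationTable[Counter];
}

/* Assumption: T_BASE rebases __COUNTER__ so the first T() below maps to
   slot 0 even if the counter was already used earlier in the file. */
enum { T_BASE = __COUNTER__ + 1 };

/* The string argument is only read by the offline tool; the compiler
   sees a table lookup with a constant index. */
#define T(X) Translate(__COUNTER__ - T_BASE)

static const char *NewGameLabel(void) { return T("New Game"); } /* slot 0 */
static const char *QuitLabel(void)    { return T("Quit"); }     /* slot 1 */
```

At the call site there is no string comparison and no hashing at runtime: T("Quit") is just a read from a fixed slot of the table.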
In this post I’ll talk about what goes on behind the scenes of the previous snippet of code.
Guaranteed monotonic
I’m compiling the project as a single compilation unit (a.k.a. unity build), meaning all source files are included from one master source. The main advantage of this approach is fast compilation times: the codebase currently clocks in at about 95k LOC and compiles in less than a second in debug mode.
The other advantage of using a unity build is that __COUNTER__ never resets between source files, so it is guaranteed to be strictly monotonic across the whole codebase, meaning no two values it produces are equal.
Having a unique counter means that lookup tables can become a thing.
Lookup tables indexed on compile time constants are trivial to inline by any compiler worth its salt, so that’s exactly what I did.
Metareflect refresher
I previously talked about the custom reflection preprocessor that I implemented in order to automate generation of config files from structs.
As a refresher, Metareflect, as I’ve called it, is a standalone executable that runs before the compiler, as part of the build process.
It lexes and parses C code in the same manner as a compiler front end does in the preprocessing phase, and looks for special annotation tokens that look like this:
As you can see, the MREFLECT() macro expands to nothing at compile time, but is used as an annotation that Metareflect understands and uses as a directive to generate code. The previous snippet, for example, would cause Metareflect to generate the code needed for reading and writing the struct to and from an INI file. Config and Default are options that, in this case, tell Metareflect that the configuration setting should be placed in the [Engine] section with a default value of DEFAULT_FULLSCREEN_MODE.
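To make the idea concrete, here is a minimal sketch of an annotated struct. The option syntax inside MREFLECT(), the engine_config struct, the value of DEFAULT_FULLSCREEN_MODE and the EngineConfigSetDefaults function are all assumptions; the real generated code reads and writes INI files, which is not shown here.

```c
#include <assert.h>

/* Expands to nothing: the compiler never sees the annotation, only the
   offline tool does. Variadic so it can carry any number of options. */
#define MREFLECT(...)

#define DEFAULT_FULLSCREEN_MODE 1   /* hypothetical value */

typedef struct engine_config
{
    /* Hypothetical option syntax: [Engine] section plus a default value. */
    MREFLECT(Config = Engine, Default = DEFAULT_FULLSCREEN_MODE)
    int FullscreenMode;
} engine_config;

/* The kind of function a generator could emit from the annotation above
   (hypothetical name and shape). */
static void EngineConfigSetDefaults(engine_config *Config)
{
    assert(Config != 0);
    Config->FullscreenMode = DEFAULT_FULLSCREEN_MODE;
}
```

Because the macro disappears during preprocessing, the annotated struct has exactly the same layout as an unannotated one.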
The generated code is saved in the src/generated/ folder and included by the rest of the codebase. If you are familiar with Unreal Engine, I’ve based Metareflect on their “UPROPERTY” reflection system.
Generating translation lookup tables
I extended Metareflect to generate the code for a lookup table of translation entries using __COUNTER__, and this is what it looks like:
The switch statement maps each __COUNTER__ value to a string. As you can see, it handles duplicate strings by collapsing case statements.
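A sketch of what generated code of this shape could look like, assuming three T() call sites where the first and third share the same string. The sample strings, the label functions and the T_BASE offset are made up for illustration; only T, Translate and GlobalTranslationTable are names taken from this post.

```c
#include <assert.h>

enum { TRANSLATION_COUNT = 2 };   /* unique strings, not call sites */

/* Default-language entries as they could be collected from the source. */
static const char *GlobalTranslationTable[TRANSLATION_COUNT] = {
    "Play",
    "Quit",
};

/* One case label per T() call site. Call sites that share a string share
   a slot, so their case labels are collapsed onto one return. */
static inline const char *Translate(int Counter)
{
    switch (Counter)
    {
        case 0:
        case 2:  return GlobalTranslationTable[0];   /* "Play" */
        case 1:  return GlobalTranslationTable[1];   /* "Quit" */
        default: assert(!"unknown T() call site"); return "";
    }
}

/* Assumption: rebase the counter so the first T() below is call site 0. */
enum { T_BASE = __COUNTER__ + 1 };
#define T(X) Translate(__COUNTER__ - T_BASE)

static const char *MenuPlayLabel(void)  { return T("Play"); } /* site 0 */
static const char *MenuQuitLabel(void)  { return T("Quit"); } /* site 1 */
static const char *PausePlayLabel(void) { return T("Play"); } /* site 2 */
```

Both "Play" call sites resolve to the same slot, so a translator only has to provide that string once.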
The number of entries in the GlobalTranslationTable is calculated by Metareflect by counting the unique entries passed to the T(X) macro. These entries are stored in a simple hashtable-like data structure whose key is the argument of the T(X) macro hashed to a 32-bit unsigned integer. Any key collision can easily be resolved by adding a second parameter to the macro and using it as a seed for the hash function.
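As an illustration of the seed idea, here is a sketch that uses 32-bit FNV-1a as a stand-in hash; the hash function Metareflect actually uses is not specified here, and the HashString name and the way the seed is mixed in are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit FNV-1a with the seed folded into the starting value.
   A seed of 0 gives the standard FNV-1a hash. */
static uint32_t HashString(const char *Text, uint32_t Seed)
{
    uint32_t Hash = 2166136261u ^ Seed;   /* FNV offset basis */
    assert(Text != 0);
    for (; *Text; ++Text)
    {
        Hash ^= (uint8_t)*Text;
        Hash *= 16777619u;                /* FNV prime */
    }
    return Hash;
}
```

If two different strings ever hashed to the same key with seed 0, giving one of them a non-zero seed at its call site would move it to a different key without touching the other.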
GlobalTranslationTable is populated at runtime either from the default char* array, which contains the entries found in the source code, or from a binary localization file, also generated by Metareflect. Changing language is simply a matter of memcpy-ing the correct translation table over the global.
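The language switch can be sketched as follows, assuming every language is held as an array of string pointers with the same slot order. EnglishTable, ItalianTable, SetLanguage and the sample strings are made up for illustration.

```c
#include <assert.h>
#include <string.h>

enum { TRANSLATION_COUNT = 2 };

/* Hypothetical per-language tables; slot order is identical in each. */
static const char *EnglishTable[TRANSLATION_COUNT] = { "Play", "Quit" };
static const char *ItalianTable[TRANSLATION_COUNT] = { "Gioca", "Esci" };

static const char *GlobalTranslationTable[TRANSLATION_COUNT];

/* Copy a whole table of string pointers over the global one. Only the
   pointers are copied, never the string contents. */
static void SetLanguage(const char *const *Table)
{
    assert(Table != 0);
    memcpy(GlobalTranslationTable, Table, sizeof(GlobalTranslationTable));
}
```

Since only pointers are copied, the cost of switching language is proportional to the number of slots and independent of how long the strings are.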
This approach is very fast: because Counter is known at compile time, an optimizing compiler inlines the Translate function and folds the switch away. Each call to Translate is replaced with a mov instruction that loads from the corresponding offset in the GlobalTranslationTable, which in turn contains a pointer to the localized string.
Outputting Translator friendly CSV files
Using Metareflect I’m also generating CSV files for translators. The CSVs use the argument of the T(X) macro as key, so the binary translation file can be remapped to the correct __COUNTER__ value even if the lines of code are swapped. This is a one-time operation that happens on startup.
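One way the startup remap could work, as a sketch under assumptions: the loaded file is treated as a list of (key, text) records and the generator emits the key expected in each slot. The loc_entry struct, SlotKeys, RemapTranslations and the key values are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TRANSLATION_COUNT = 2 };

/* Hypothetical shape of one record loaded from the localization file. */
typedef struct loc_entry
{
    uint32_t    Key;    /* hash of the original T() argument */
    const char *Text;   /* translated string */
} loc_entry;

/* Hypothetical generated data: the key expected in each table slot. */
static const uint32_t SlotKeys[TRANSLATION_COUNT] = { 0x1111u, 0x2222u };

static const char *GlobalTranslationTable[TRANSLATION_COUNT];

/* One-time startup pass: put every loaded string in the slot whose key
   matches, whatever order the records arrive in. Returns the number of
   slots filled; unmatched slots keep their previous contents. */
static int RemapTranslations(const loc_entry *Entries, size_t EntryCount)
{
    int Filled = 0;
    assert(Entries != 0 || EntryCount == 0);
    for (int Slot = 0; Slot < TRANSLATION_COUNT; ++Slot)
    {
        for (size_t i = 0; i < EntryCount; ++i)
        {
            if (Entries[i].Key == SlotKeys[Slot])
            {
                GlobalTranslationTable[Slot] = Entries[i].Text;
                ++Filled;
                break;
            }
        }
    }
    return Filled;
}
```

The nested loop is quadratic, which keeps the sketch short; since the pass runs once at startup, a sorted array or hash lookup would only matter for very large tables.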
Hashing strings at compile time
Leveraging the same code that generates translation lookup tables, I was also able to make lookup tables for string hashes that look like this:
This function is also easily inlined by the compiler so it collapses down to just numbers that replace the macro invocations.
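A sketch of a hash lookup of this kind, assuming the tool precomputes 32-bit FNV-1a hashes (a stand-in, since the actual hash function is not named here). HASH, HashForCounter, HASH_BASE and the two sample strings are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical generated function: one case per hashed call site, each
   returning a value the tool computed offline (FNV-1a, 32-bit). */
static inline uint32_t HashForCounter(int Counter)
{
    switch (Counter)
    {
        case 0:  return 0xE40C292Cu;   /* "a" */
        case 1:  return 0xE70C2DE5u;   /* "b" */
        default: assert(!"unknown HASH() call site"); return 0;
    }
}

/* Assumption: rebase the counter so the first HASH() below is site 0. */
enum { HASH_BASE = __COUNTER__ + 1 };

/* The string is only read by the offline tool; after inlining, the
   compiler is left with a constant. */
#define HASH(X) HashForCounter(__COUNTER__ - HASH_BASE)

static uint32_t HashOfA(void) { return HASH("a"); }   /* site 0 */
static uint32_t HashOfB(void) { return HASH("b"); }   /* site 1 */
```

The string literal never reaches the compiled program through this macro, so no hashing work is left for runtime.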
Does it slow down compilation time?
No. Currently Metareflect is able to do its thing in less than 40 ms, spitting out about 15k LOC that include code generated for custom data structures, allocators, INI/JSON readers/writers, networking and more.