Finding Duplicate Static Symbols in Shared Libraries
Sometimes it is useful to split code into shared libraries that get loaded by the main executable depending on runtime requirements.
For example, the current development version of Rival Fortress is structured like so:
- The Launcher is the main executable that contains shared data types, interacts with the OS and abstracts away any platform-related functionality needed by the shared libraries.
- The Game is a shared library that contains all engine and game code. Keeping it in a library makes it easy to hot reload it without having to restart the game.
- The Networking library is loaded when the player initiates a multiplayer game or when the game is started as a headless dedicated server.
- The Editor is also a shared library with the code for the gameplay editor. It can be loaded and unloaded by pressing F11 while the game is running.
Duplicate static functions
Partitioning code across multiple modules can cause logic defined in static functions to be silently duplicated. The code will compile just fine and usually won't cause any problems at runtime, but it will unnecessarily bloat the size of the .dll/.dylib/.so. Duplicate functions can also cause subtle bugs when shared libraries are built with different versions of the code, but this won't happen if you rebuild all your modules whenever common code changes.
For example, imagine you have a static function defined in utility.c that gets included in both the Game and Editor shared libraries.
Each library will get its own copy of the function and, because it's defined as static, you will not get any compiler warning about the duplication.
To see for yourself you can use the nm tool on Linux and OSX, or dumpbin with the /SYMBOLS flag on Windows. Both tools will show you the symbol table of the library or executable. When run on OSX, nm prints one line per symbol with its memory location, a type letter and its name.
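In nm output, each line holds a memory location, a type letter and a name, and functions that are local to the library (which includes static ones) carry a lowercase t. As a rough sketch, assuming the binary has not been stripped and the function was not inlined away, a small filter like the one below lists them. The name static_functions and the library name libgame.so are placeholders made up for illustration.

```shell
# static_functions: read nm output on stdin and print the names of
# local text symbols (type letter "t"), which is where static functions land.
static_functions() {
    awk '$2 == "t" { print $3 }'
}

# Hypothetical usage (libgame.so is a placeholder name):
#   nm libgame.so | static_functions
```

Running this on two libraries and spotting the same name in both lists is the manual version of the search described in the rest of this post.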
Finding duplicate symbols
If you reverse diff the output of nm or dumpbin for two libraries, that is, keep only the lines they have in common, you will find the duplicate symbols.
Before you do that, though, you need to massage the output a bit. On *nix-based systems, or on Windows using bash.exe, you can pipe together the following commands:
- nm returns the list of symbols in the library. Omit this if you are using bash.exe on Windows 10, and cat the output of dumpbin instead.
- c++filt demangles any C++ symbols. You can read more about it on the man page. You can omit this if your code contains no C++.
- cut trims the first two tokens of each line, which contain the memory location and type of the symbol, as the location will more than likely differ for each library.
- sort sorts the symbols in alphabetical order, which comm needs for the comparison.
- uniq removes any duplicate symbols (this is useful when dealing with C++ code).
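Put together, the steps above could look roughly like the following sketch. It is a reconstruction under assumptions, not necessarily the exact command used for Rival Fortress: the function name clean_symbols and the file names in the usage comment are hypothetical, and the c++filt step is skipped when the tool is not installed.

```shell
# clean_symbols: read nm output on stdin and print a sorted,
# de-duplicated list of symbol names on stdout.
clean_symbols() {
    # 1. Demangle C++ names if c++filt is available, otherwise pass through.
    # 2. Drop the first two space-separated tokens (location and type);
    #    "-f 3-" keeps everything after them, so demangled names that
    #    contain spaces survive intact.
    # 3. Sort alphabetically and remove repeated lines.
    if command -v c++filt >/dev/null 2>&1; then
        c++filt
    else
        cat
    fi | cut -d ' ' -f 3- | sort | uniq
}

# Hypothetical usage (library and output names are placeholders):
#   nm libgame.so   | clean_symbols > game.txt
#   nm libeditor.so | clean_symbols > editor.txt
```

Note that nm leaves the location column blank for undefined symbols, so those lines will not be trimmed cleanly by cut. They refer to code that lives elsewhere, so they can be ignored when hunting for duplicates.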
You can optionally filter the results using grep if you use a common prefix for all your functions, as this will remove all the noise generated by compiler-defined symbols.
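As an illustration of that filter, here is a minimal sketch that keeps only the names starting with a given prefix. The helper name keep_prefixed, the prefix rf_ and the file name symbols.txt are all made up for the example. The prefix is used as a regular expression, so it should not contain special characters.

```shell
# keep_prefixed: keep only the lines on stdin that start with the prefix
# given as the first argument.
keep_prefixed() {
    grep -- "^$1"
}

# Hypothetical usage (rf_ and symbols.txt are placeholders):
#   keep_prefixed rf_ < symbols.txt
```

On OSX, C symbol names are listed with a leading underscore, so the prefix would need to be _rf_ there.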
Use the comm command to reverse diff the two outputs, keeping only the lines they share.
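A minimal sketch of that comparison, assuming the two symbol lists have already been sorted and saved to files. The wrapper name common_symbols and the file names game.txt and editor.txt are placeholders.

```shell
# common_symbols: print the lines that appear in both sorted files.
common_symbols() {
    # comm prints three columns: lines only in file 1, lines only in
    # file 2, and lines in both. -1 and -2 suppress the first two,
    # leaving only the shared lines.
    comm -12 "$1" "$2"
}

# Hypothetical usage (file names are placeholders):
#   common_symbols game.txt editor.txt
```

comm expects both inputs to be sorted with the same collation it uses itself, so it is safest to run the sort and the comparison under the same locale.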
Now that you know which symbols are duplicated, it's up to you to decide how best to clean things up.