Metric Panda Games

One pixel at a time.

Dealing with __chkstk/__chkstk_ms when Cross-Compiling For Windows

Rival Fortress Update #45

If you have a cross-compiling toolchain for building Windows executables, read on.

I use both Clang and Mingw-w64, and I’ve recently discovered a “fun” little gotcha that has to do with the __chkstk routine that is output automatically by the code generator of both compilers.

What is __chkstk

The __chkstk routine (also known as __alloca_probe) is a small procedure that compilers targeting Windows insert into the prologue of every function that uses more than 4K bytes of stack (8K on 64-bit).
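To make that concrete, here is a minimal C sketch of the kind of function that gets a probe. The name big_frame_checksum and the 16 KB buffer size are made up for illustration; the only point is that the locals exceed both thresholds, so a compiler targeting Windows would normally emit a call to __chkstk in the prologue. An optimizer is free to shrink or remove the buffer, so this is most reliable in unoptimized builds.

```c
#include <string.h>

/* Hypothetical example: 16 KB of locals, which is above both the 4K
   (32-bit) and 8K (64-bit) thresholds mentioned above. */
static unsigned big_frame_checksum(unsigned char seed)
{
    unsigned char scratch[16 * 1024];
    unsigned sum = 0;

    /* Fill the whole buffer so every page of the frame is really used. */
    memset(scratch, seed, sizeof scratch);

    for (size_t i = 0; i < sizeof scratch; i++) {
        sum += scratch[i];
    }
    return sum;
}
```

Disassembling a function like this in a Windows build is a quick way to see whether your toolchain inserted the probe.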

By default Windows allocates stack space in 4K pages, with a guard page at the end. Touching the guard page raises an exception that the operating system handles by allocating more stack space.

A problem arises when a function uses more than 2 pages for its stack variables. It could then access memory past the guard page, triggering an access violation that the OS does not treat as a simple request for more stack space, but as a generic exception that terminates the application.

__chkstk is the solution to this problem. Upon function entry, it touches memory addresses every 4K from the current stack pointer location up to the size needed by the function. This triggers the guard pages in the proper sequence and commits additional memory to the stack as required.
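The real routine is a few lines of assembly, but the idea can be sketched in C. This is only a rough model under my own assumptions: probe_frame and PROBE_PAGE_SIZE are hypothetical names, and it walks an ordinary buffer from its high end downwards (the direction the stack grows) instead of touching the real stack.

```c
#include <assert.h>
#include <stddef.h>

#define PROBE_PAGE_SIZE 4096u /* assumed page size, the 4K from the text */

/* Hypothetical model of a stack probe: touch one byte per 4K step below
   frame_top, then the lowest address of the frame if it is not on a step.
   Returns how many locations were touched. */
static size_t probe_frame(volatile unsigned char *frame_top, size_t frame_size)
{
    size_t touched = 0;

    assert(frame_top != NULL);

    for (size_t offset = PROBE_PAGE_SIZE; offset <= frame_size;
         offset += PROBE_PAGE_SIZE) {
        frame_top[-(ptrdiff_t)offset] = 0; /* would hit the guard page */
        touched++;
    }
    if (frame_size % PROBE_PAGE_SIZE != 0) {
        frame_top[-(ptrdiff_t)frame_size] = 0; /* final partial page */
        touched++;
    }
    return touched;
}
```

Given a pointer one past the end of a 10000-byte buffer, the sketch touches three locations: 4096 and 8192 bytes below the top, plus the bottom of the frame at 10000. On a real stack each of those touches would hit the guard page in order, which is what lets the OS grow the stack one page at a time.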

__chkstk can also speed up your application’s start-up time, since stack pages are committed on demand rather than up front, although for most indie games, and even some triple-A games, the speed-up will be negligible.

Read Compiler Security Checks In Depth for more details.

Why you may not want __chkstk

As you may imagine, having this little routine run on every call to a function with a large stack frame is wasteful. The cost of each page fault is paid only once, but __chkstk has to do its little dance and burn cycles on instructions that do nothing every time such a function is called.

Fortunately it can be disabled on MSVC with the following flags:

  • /GsXXX (a cl.exe flag) where XXX is the threshold in bytes that prompts the insertion of the __chkstk probe. If you set this to a high number, like 10000000, no stack probes will be inserted.
  • /STACK:reserve[,commit] (a linker option) reserves and, more importantly, commits the specified bytes for stack space used by the application. By default reserve is 1 MB and commit is 4 KB, so if you set both reserve and commit to the same number, the whole stack is committed up front and no faults are needed to expand it.

LLVM and mingw don’t support __chkstk disabling

Unfortunately LLVM’s code generator for Windows targets uses a hard-coded probe size of 4K as of LLVM 5.0, and this size can only be changed on a per-function basis with the stack-probe-size function attribute.

The same goes for mingw-w64, which automatically emits the __chkstk_ms probe for functions that use more than 4 KB. To my knowledge there is no way to change this, but I didn’t dig deep into the source, as I use mingw only for continuous integration and not for my main builds.

The solution I went with is to just redefine the function __chkstk as a no-op in assembly like so:

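.text
.global __chkstk
/* No-op stand-in for the stack probe. Only safe on x86-64, where the
   caller adjusts rsp itself and __chkstk merely touches the pages. */
__chkstk:
  ret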

Once I get further along and lock all toolchain versions, I’ll modify the source of the Windows LLVM code generator to remove the call to __chkstk, but for now this is a quick and painless solution.