Metric Panda Games

One pixel at a time.

Simple Crash Resurrection For C/C++ Game Engines

Rival Fortress Update #36

The video demonstrates the crash resurrection functionality in Metric Panda Engine.

Crash resurrection of Metric Panda Engine

Crash resurrection is a simple feature that uses the engine’s hot reloading functionality to reload a previous version of the game .dll/.so when a crash occurs.

This is a development-only feature that gives you a chance to quickly correct simple mistakes while debugging that would otherwise cause you to lose the state of the game.

I would strongly suggest against doing such shenanigans in shipping builds.

Implementing Crash Resurrection in Your Engine

The process of gracefully recovering from most crashes is simple and can be broken up into the following steps:

  • Step 0: Prerequisite: Hot reloading
  • Step 1: Backup the previous .dll
  • Step 2a: Add a crash handler on Linux/Mac
  • Step 2b: Add a crash handler on Windows
  • Step 3: Augment the Hot Reload functionality
  • Step 4: What about the debugger

I’ll say .dll for simplicity when talking about shared libraries; substitute the extension used by your OS.

Step 0: Prerequisite: Hot reloading

Hot reloading or Live Code Editing is the act of unloading and reloading a shared library without having to restart the main application. This is usually done whenever a newer version of the shared library is detected.

If you have used Unreal Engine or watched Casey Muratori’s excellent Handmade Hero series, you know how useful this feature can be, especially when rapidly iterating over code that needs to feel “just right”.

If your engine supports hot reloading, read on. Otherwise, take a look at Handmade Hero’s Day 22: Instantaneous Live Code Editing for a Windows implementation, and Interactive Programming in C by Chris Wellons for Unix-like systems.
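As a rough illustration of the "newer version detected" part, here is a minimal sketch for Unix-like systems that compares the library's modification time against the one seen at the last load. The names GetLastWriteTime and LibraryNeedsReload are made up for this example and are not taken from Metric Panda Engine.

```c
#include <sys/stat.h>
#include <time.h>

// Returns the last modification time of Path, or 0 if it cannot be read.
// (Hypothetical helper, named for this example only.)
time_t GetLastWriteTime(const char *Path)
{
  struct stat Info;
  if (stat(Path, &Info) != 0)
  {
    return 0;
  }
  return Info.st_mtime;
}

// Returns 1 when Path is newer than the timestamp stored in *LastLoaded
// and records the new timestamp; returns 0 otherwise.
int LibraryNeedsReload(const char *Path, time_t *LastLoaded)
{
  time_t Current = GetLastWriteTime(Path);
  if (Current != 0 && Current > *LastLoaded)
  {
    *LastLoaded = Current;
    return 1;
  }
  return 0;
}
```

You would call LibraryNeedsReload once per frame (or every few frames) with the path of the game library, and trigger the unload/reload when it returns 1.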

Step 1: Backup the previous .dll

In your build script, before your game library compilation step, make a backup of the previous version of the .dll, for example by copying it to game-backup.dll.

You also have to check if the backup is being used by the game because of a recent crash. If this is the case you shouldn’t clobber it as the new code you are about to compile may crash too. You can do this check using the lsof command on Unix-like systems and Handle on Windows.

For example you could use the following bash script on Unix-like systems:

#!/bin/bash

source_lib=game.so
backup_lib=game-backup.so

# Copy $source_lib to $backup_lib only if:
#  - $source_lib exists and
#  - $backup_lib is not open by a process
if [[ -f "$source_lib" && -z "$(lsof "$backup_lib" 2> /dev/null)" ]]; then
  cp "$source_lib" "$backup_lib"
fi

I’m not really a Windows batch or PowerShell wizard, so I’ll leave the Windows implementation as an exercise for the reader.

Step 2a: Add a crash handler on Linux/Mac

On most Unix-like systems you can use POSIX signals to recover from crashes. For example, the following code registers a signal handler for SIGSEGV and SIGILL, which I’ve found to be the most common and the easiest to recover from, but you can register for any signal you are interested in handling.

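#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>

sigjmp_buf RecoveryMarker;
int GameRunning;

int ReloadGameCode(int UseBackup);

// Jumps back to the recovery point set up in main()
void CrashHandler(int Signal)
{
  (void)Signal;
  siglongjmp(RecoveryMarker, -1);
}

int main(void)
{
  struct sigaction SignalAction = {0};
  SignalAction.sa_handler = &CrashHandler;
  sigemptyset(&SignalAction.sa_mask);
  sigaction(SIGSEGV, &SignalAction, 0);
  sigaction(SIGILL, &SignalAction, 0);

  // Passing 1 saves the signal mask, so the signal is unblocked again
  // after the jump and later crashes can be caught too
  if (sigsetjmp(RecoveryMarker, 1) != 0)
  {
    // We got here from CrashHandler: fall back to the backup library
    if (!ReloadGameCode(1))
    {
      exit(EXIT_FAILURE);
    }
  }

  while (GameRunning)
  {
    // Game loop
  }
  return 0;
}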
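If you want to try the jump-back mechanism in isolation, the following is a small sketch that fakes crashes with raise(SIGSEGV) and counts how many times execution resumes at the marker. CountRecoveries, DemoMarker and DemoCrashHandler are names invented for this example. The sketch passes 1 as the second argument of sigsetjmp so the signal mask is restored on every jump; without that, SIGSEGV would stay blocked after the first recovery.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>

static sigjmp_buf DemoMarker;

static void DemoCrashHandler(int Signal)
{
  (void)Signal;
  siglongjmp(DemoMarker, 1);
}

// Simulates `Crashes` crashes and returns how many times execution
// came back through the recovery marker. (Hypothetical demo function.)
int CountRecoveries(int Crashes)
{
  // volatile because both are modified between sigsetjmp and siglongjmp
  volatile int Recovered = 0;
  volatile int Remaining = Crashes;

  struct sigaction Action = {0};
  struct sigaction Previous;
  Action.sa_handler = DemoCrashHandler;
  sigemptyset(&Action.sa_mask);
  sigaction(SIGSEGV, &Action, &Previous);

  if (sigsetjmp(DemoMarker, 1) != 0)
  {
    // A real engine would reload the backup library here
    Recovered++;
  }

  if (Remaining > 0)
  {
    Remaining--;
    raise(SIGSEGV); // stands in for a real crash in game code
  }

  // Put back whatever handler was installed before the demo
  sigaction(SIGSEGV, &Previous, 0);
  return Recovered;
}
```

Calling CountRecoveries(3) goes through the handler three times before returning normally.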

Step 2b: Add a crash handler on Windows

Windows provides Structured Exception Handling (SEH), which you can use to recover from a crash.

The following example is more or less equivalent to the Linux/Mac implementation presented above. I hope you don’t get heart palpitations at the sight of a goto statement ;-).

#include <windows.h>

int GameRunning;

int ReloadGameCode(int UseBackup);

int main(void)
{
RecoveryMarker:
  __try
  {
    while (GameRunning)
    {
      // Game loop
    }
  }
  // __except runs only when an exception is raised inside the __try block.
  // The filter expression can use GetExceptionCode() and
  // GetExceptionInformation() to see if the exception can be recovered from
  __except (EXCEPTION_EXECUTE_HANDLER)
  {
    if (ReloadGameCode(1))
    {
      goto RecoveryMarker;
    }
  }
  return 0;
}
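The check hinted at in the comment above could be done with a small filter function. This is only a sketch: CrashFilter and the CODE_* names are invented for this example, and the choice of which codes count as recoverable is an assumption on my part. The numeric values are the ones Windows documents for EXCEPTION_ACCESS_VIOLATION, EXCEPTION_ILLEGAL_INSTRUCTION, EXCEPTION_INT_DIVIDE_BY_ZERO and EXCEPTION_STACK_OVERFLOW, written out so the function compiles without windows.h.

```c
#include <stdint.h>

// Exception codes, spelled out so this compiles on any platform
#define CODE_ACCESS_VIOLATION    0xC0000005u
#define CODE_ILLEGAL_INSTRUCTION 0xC000001Du
#define CODE_INT_DIVIDE_BY_ZERO  0xC0000094u
#define CODE_STACK_OVERFLOW      0xC00000FDu

// Returns 1 (the value of EXCEPTION_EXECUTE_HANDLER) for crashes this
// sketch treats as recoverable, and 0 (EXCEPTION_CONTINUE_SEARCH) for
// everything else, so other exceptions propagate as usual.
int CrashFilter(uint32_t Code)
{
  switch (Code)
  {
    case CODE_ACCESS_VIOLATION:
    case CODE_ILLEGAL_INSTRUCTION:
    case CODE_INT_DIVIDE_BY_ZERO:
      return 1;

    case CODE_STACK_OVERFLOW: // the stack needs repairing first; give up
    default:
      return 0;
  }
}
```

On Windows you would use it as the filter expression, as in __except (CrashFilter(GetExceptionCode())).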

Step 3: Augment the Hot Reload functionality

The ReloadGameCode function stubbed in the previous examples is in charge of hot reloading the game .dll. In its simplest form it can receive an argument that tells it whether to reload the main .dll or the backup, like so:

int ReloadGameCode(int UseBackup)
{
  int Success = 0;
  const char* Filename;
  if (UseBackup)
  {
    Filename = "game-backup.dll";
  }
  else
  {
    Filename = "game.dll";
  }
  // Unload and reload shared library here. See references in
  // Step 0: Prerequisite: Hot reloading
  // for sample implementations
  Success = ...;

  return Success;
}
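For Unix-like systems, the elided unload/reload step might look roughly like the sketch below, which uses dlopen, dlsym and dlclose. The function name ReloadGameCodePosix, the game_update_fn type and the exported symbol "GameUpdate" are all assumptions made for this example; your engine will have its own entry points. The file names match the ones used in the backup script from Step 1.

```c
#include <dlfcn.h>
#include <stddef.h>

// Hypothetical entry point exported by the game library
typedef void game_update_fn(void);

static void *GameLibrary;
static game_update_fn *GameUpdate;

// Unloads the current library (if any) and loads either the main
// library or the backup. Returns 1 on success, 0 on failure.
int ReloadGameCodePosix(int UseBackup)
{
  // The "./" prefix makes dlopen use this exact path instead of
  // searching the system library directories
  const char *Filename = UseBackup ? "./game-backup.so" : "./game.so";

  if (GameLibrary)
  {
    dlclose(GameLibrary);
    GameLibrary = NULL;
    GameUpdate = NULL;
  }

  GameLibrary = dlopen(Filename, RTLD_NOW);
  if (!GameLibrary)
  {
    return 0;
  }

  // Look up the entry point again: its address changes on every load
  GameUpdate = (game_update_fn *)dlsym(GameLibrary, "GameUpdate");
  if (!GameUpdate)
  {
    dlclose(GameLibrary);
    GameLibrary = NULL;
    return 0;
  }
  return 1;
}
```

On older glibc versions you may need to link with -ldl. The game loop would then call GameUpdate() through the pointer, so that a reload transparently swaps the code being run.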

Step 4: What about the debugger

With GDB and GDB frontends, you can stop the debugger from breaking on specific signals using the handle command (like I do in the demo video for SIGSEGV), for example: handle SIGSEGV nostop noprint pass

I’m not sure how to do the same in Visual Studio, as I don’t tend to use it much, but I guess there’s a similar feature… hopefully…maybe… I don’t know, sorry! :o(