Git Subtrees
I share Metric Panda’s Engine code across multiple projects, and since I’m actively adding features to it, I often make changes or fix bugs in it while working on a parent project.
I use Git for source control and it is important for me to be able to make commits to either the engine or the parent repository separately and cleanly.
The most common use case is: I’m working on some feature in the parent project, I spot a bug in the nested project that I fix, and without having to commit anything to the parent project, I want to be able to commit the bugfix to the child project.
In this post I’ll talk about common ways to solve this problem using Git.
The ugly
The naive solution is to copy or symlink the code and manually propagate changes to the nested code across all projects. This quickly becomes a nightmare, as things can get out of sync through human error. It also becomes harder to track down bugs with tools like git bisect when the parent repository’s history contains commits relating to the child repository.
The bad
The second solution is to use Git submodules. Unfortunately, submodules are less than ideal for simple use cases like mine: they bring along a lot of baggage and overhead that often gets in the way and complicates the workflow.
The problem with Git submodules is summarized perfectly in Oren Eini’s blog post:
- You can’t just git clone the repository, you need to clone the repository, then call git submodule init & git submodule update.
- You can’t just download the entire source code from Github.
- You can’t branch easily with submodules, well, you can, but you have to branch in the related projects as well. And that assumes that you have access to them.
- You can’t fork easily with submodules, well, you can, if you really feel like updating the associations all the time. Which is really nasty.
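To make the first point concrete, here is a minimal sketch of the extra steps a fresh clone needs. It builds two throwaway repositories in a temporary directory; the names `lib`, `app` and `app-clone` are made up for illustration, and the `protocol.file.allow=always` setting is only there because the demo uses local paths as remotes, which recent Git versions refuse for submodules by default.

```shell
#!/bin/sh
set -e

# Throwaway sandbox; all names below are hypothetical.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A stand-in for the shared library repository.
git init -q lib
(cd lib && echo 'library code' > lib.txt && git add lib.txt && git commit -q -m 'Lib: initial commit')

# A parent repository that references the library as a submodule.
git init -q app
(
    cd app
    echo 'app code' > app.txt
    git add app.txt
    git commit -q -m 'App: initial commit'
    # protocol.file.allow is needed only because the "remote" is a local path.
    git -c protocol.file.allow=always submodule --quiet add "$work/lib" lib
    git commit -q -m 'App: add lib submodule'
)

# A plain clone leaves the submodule directory empty...
git clone -q "$work/app" app-clone
cd app-clone
echo "files in lib/ after clone: $(ls lib | wc -l)"

# ...until the two extra commands are run.
git submodule --quiet init
git -c protocol.file.allow=always submodule --quiet update
echo "files in lib/ after update: $(ls lib | wc -l)"
```

With a hosted remote the `-c protocol.file.allow=always` part would not be needed; the two-step `init` and `update` would be.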
The good
Since I started working on Rival Fortress, I’ve been using Git subtrees as a way to embed the engine in various projects.
The thing I like about subtrees is that the workflow is much cleaner:
- the code for the engine is “embedded” directly in the source tree of the parent project,
- both repositories have a separate commit history,
- it is easy to freeze the engine repository to a particular commit,
- when a bugfix is made to the engine’s code, it can easily be pushed to the engine’s repository from within the parent project.
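As a rough illustration of that workflow, the sketch below embeds a stand-in engine repository into a parent project and then pushes a fix back from within the parent. Everything here is an assumption made for the demo: the repository names (`engine.git`, `seed`, `parent`), the `engine/` prefix, the `main` branch and the file names are hypothetical, and it presumes the `git subtree` command is available (it ships in Git’s contrib area and some distributions package it separately).

```shell
#!/bin/sh
set -e

# Throwaway sandbox; all names below are hypothetical.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in engine remote: a bare repository seeded with one commit.
git init -q --bare engine.git
git init -q seed
(
    cd seed
    git symbolic-ref HEAD refs/heads/main
    echo 'v1' > engine.txt
    git add engine.txt
    git commit -q -m 'Engine: initial commit'
    git push -q "$work/engine.git" main
)

# Parent project with its own history.
git init -q parent
cd parent
git symbolic-ref HEAD refs/heads/main
echo 'game' > game.txt
git add game.txt
git commit -q -m 'Parent: initial commit'

# Embed the engine under engine/. --squash records the engine's
# history as a single commit, pinned to the revision that was fetched.
git subtree add --prefix=engine --squash "$work/engine.git" main

# Fix a "bug" in the engine while working inside the parent project.
echo 'v2' > engine/engine.txt
git commit -q -am 'Engine: fix bug'

# Push only the engine/ changes back to the engine repository.
git subtree push --prefix=engine "$work/engine.git" main
```

The engine stays at the revision that was added until an explicit `git subtree pull`, which is what makes freezing it to a particular commit straightforward.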
Subtrees aren’t perfect, though. I’ve had to manually resolve merge conflicts when updating the subtree, and the commands to manage a subtree are impossible to remember. If you execute the wrong command you can clobber code on the remote repository, so tread carefully.
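One way to take some of the sting out of remembering the commands is to wrap them in small shell functions. The sketch below is only a suggestion, not something from my actual setup: the function names and the `ENGINE_*` values are hypothetical and would need adjusting to your own prefix, remote and branch.

```shell
#!/bin/sh
# Hypothetical settings; adjust to your own layout.
ENGINE_PREFIX=engine          # directory the subtree lives in
ENGINE_REMOTE=engine-origin   # a git remote name or URL
ENGINE_BRANCH=main            # branch of the engine repository

# Update the embedded engine with upstream changes.
# Run from the top level of the parent repository with a clean working tree.
engine_pull() {
    git subtree pull --prefix="$ENGINE_PREFIX" --squash "$ENGINE_REMOTE" "$ENGINE_BRANCH"
}

# Send engine commits made inside the parent project back upstream.
engine_push() {
    git subtree push --prefix="$ENGINE_PREFIX" "$ENGINE_REMOTE" "$ENGINE_BRANCH"
}
```

Keeping the prefix, remote and branch in one place also lowers the odds of pushing to the wrong repository by mistyping an argument.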