Simplifying the git forking workflow

bendemboski

Ben Demboski

Posted on April 13, 2021

The standard way to contribute to an open source project that you do not maintain is to fork it, create a branch in your fork where you put your code changes, and then open a pull request into the original repository. I've been doing this for years, and just discovered a tweak to the workflow that I really like and want to share with you.

The git forking workflow

This is a well-established pattern that has been written up a number of times, so I'll just briefly outline the process of opening a pull request, and then opening a second pull request, as I originally learned it.

First pull request

To open my first pull request, I first need to create a fork. I'll use the Ember test helpers repo as the example project I'm contributing to. Note that none of this is specific to GitHub -- it would work the same with Bitbucket or any other git host. My steps are:

  1. Fork the repo, so the original is at git+ssh://git@github.com/emberjs/ember-test-helpers and my fork is at git+ssh://git@github.com/bendemboski/ember-test-helpers.
  2. git clone git+ssh://git@github.com/bendemboski/ember-test-helpers
  3. git checkout -b my-branch-1
  4. Write code
  5. git push -u origin my-branch-1

And now I'm ready to open a pull request! This is nice and simple; where I think it gets a little complicated is when I want to open my second pull request.

Second pull request

If some time has passed, Ember test helpers' master branch will have changed since I created my fork as work on the project continues -- at the very least, I hope my pull request was merged into it! So when creating my new branch, I need to make sure to branch off of the latest master in the original repo, not the out-of-date one in my fork.

The steps are:

  1. git remote add upstream git+ssh://git@github.com/emberjs/ember-test-helpers
  2. git fetch upstream master
  3. git checkout master
  4. git merge upstream/master
  5. git checkout -b my-branch-2
  6. Write code
  7. git push -u origin my-branch-2

Steps 1-4 are just syncing my local mirror of my fork's master branch with the master branch in Ember test helpers' repo. Since I don't actually do anything with my fork's master branch, I could simplify this slightly:

  1. git remote add upstream git+ssh://git@github.com/emberjs/ember-test-helpers
  2. git fetch upstream master
  3. git checkout -b my-branch-2 upstream/master
  4. Write code
  5. git push -u origin my-branch-2
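
To try the shorter sequence without touching GitHub, here is a minimal sketch that uses local bare repositories as stand-ins for the original repo and my fork. The directory names (seed, original.git, fork.git, clone) and the demo identity are made up for illustration, and the empty commits stand in for real work.

```shell
set -eu

# Throwaway identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the original repo: one commit on master.
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/master
git -C "$work/seed" commit -q --allow-empty -m "initial commit"
git clone -q --bare "$work/seed" "$work/original.git"

# Stand-in for my fork, then the classic setup: clone the fork (origin = fork).
git clone -q --bare "$work/original.git" "$work/fork.git"
git clone -q "$work/fork.git" "$work/clone"

# The original moves ahead after I forked.
git -C "$work/seed" commit -q --allow-empty -m "upstream work"
git -C "$work/seed" push -q "$work/original.git" master

# The shortened second-pull-request steps.
cd "$work/clone"
git remote add upstream "$work/original.git"
git fetch -q upstream master
git checkout -q -b my-branch-2 upstream/master
git commit -q --allow-empty -m "my change"   # stands in for "Write code"
git push -q -u origin my-branch-2

# my-branch-2 sits on top of the latest upstream commit,
# while my local master is still where the fork left it.
git log --oneline my-branch-2
git log --oneline master
```

Note that the local master branch is never updated here, which is exactly the part of the fork I end up ignoring.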

Simplified git forking workflow

The simplification is a tweak to the workflow that is pretty minor from a mechanical, what-commands-do-I-type standpoint, but that I think simplifies the mental model significantly.

First pull request

Instead of cloning my fork of the repo, I will clone the original repo and then add my fork as another remote:

  1. Fork the repo, so the original is at git+ssh://git@github.com/emberjs/ember-test-helpers and my fork is at git+ssh://git@github.com/bendemboski/ember-test-helpers.
  2. git clone git+ssh://git@github.com/emberjs/ember-test-helpers <-- this is the key difference
  3. git checkout -b my-branch-1
  4. Write code
  5. git remote add bendemboski git+ssh://git@github.com/bendemboski/ember-test-helpers
  6. git push -u bendemboski my-branch-1
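
Here is a minimal sketch of these steps that runs entirely on local disk, with bare repositories standing in for the original repo and my fork. The directory names (seed, original.git, fork.git, clone) and the demo identity are assumptions made for the example; only the remote name bendemboski and the branch name come from the steps above.

```shell
set -eu

# Throwaway identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-ins for the original repo and my fork of it.
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/master
git -C "$work/seed" commit -q --allow-empty -m "initial commit"
git clone -q --bare "$work/seed" "$work/original.git"
git clone -q --bare "$work/original.git" "$work/fork.git"

# Key difference: clone the ORIGINAL, so origin points at it.
git clone -q "$work/original.git" "$work/clone"
cd "$work/clone"

git checkout -q -b my-branch-1
git commit -q --allow-empty -m "my change"   # stands in for "Write code"

# Add the fork as a second remote and push the branch there.
git remote add bendemboski "$work/fork.git"
git push -q -u bendemboski my-branch-1

# origin is the original repo; my-branch-1 tracks the fork.
git remote -v
git branch -vv
```

After this, master tracks origin/master and my-branch-1 tracks bendemboski/my-branch-1, so a plain git push on the branch goes to the fork.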

The server-side result is the same: my fork has a my-branch-1 branch ready to use for a pull request into the original repo. The local setup, however, is different in a way that makes opening subsequent pull requests somewhat simpler.

Second pull request

Since I have cloned the original repo, I sync my local master branch just like I would with any other branch, simplifying the beginning of this workflow:

  1. git checkout master
  2. git pull
  3. git checkout -b my-branch-2
  4. Write code
  5. git push -u bendemboski my-branch-2
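
The sketch below replays this sequence against local bare repositories, so it can be run without network access. As before, the directory names (seed, original.git, fork.git, clone) and the demo identity are hypothetical, and empty commits stand in for real changes.

```shell
set -eu

# Throwaway identity so commits work on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-ins for the original repo and my fork of it.
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/master
git -C "$work/seed" commit -q --allow-empty -m "initial commit"
git clone -q --bare "$work/seed" "$work/original.git"
git clone -q --bare "$work/original.git" "$work/fork.git"

# Simplified setup: clone the original, add the fork as a second remote.
git clone -q "$work/original.git" "$work/clone"
git -C "$work/clone" remote add bendemboski "$work/fork.git"

# Time passes and the original moves ahead.
git -C "$work/seed" commit -q --allow-empty -m "upstream work"
git -C "$work/seed" push -q "$work/original.git" master

# The second-pull-request steps.
cd "$work/clone"
git checkout -q master
git pull -q                                  # plain pull: master tracks origin/master
git checkout -q -b my-branch-2
git commit -q --allow-empty -m "my change"   # stands in for "Write code"
git push -q -u bendemboski my-branch-2

# The fork's master was never synced, and nothing depended on it.
git -C "$work/fork.git" log --oneline master
git log --oneline my-branch-2
```

The fork's master still shows only the initial commit, while my-branch-2 carries the upstream work plus my change.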

Practical differences

The only practical difference between these two versions of the forking workflow is that in the simplified form I'm not trying to keep my fork's master up-to-date with the original repo's. In fact, I completely ignore my fork's master branch and just treat my fork as a repository for pushing temporary branches to support opening pull requests.

Ergonomic benefits

Even though the sheer number of commands I need to type isn't significantly reduced, in my experience this simplifies the mental model in a way that reduces friction in the whole process of opening pull requests. The benefits I've experienced are:

  • I don't have to worry about whether my fork's master branch is up-to-date with the original repo's master branch
  • I don't have to worry about accidentally merging code into my fork's master branch in a way that would require something like rebase to get back in sync with the original repo's master branch
  • I don't have to think about the fact that I'm working with a fork of a repository aside from the one time (per branch) that I have to push my branch to the remote pointing to my fork (git push -u bendemboski ... instead of git push -u origin ...). All of my pulling and branching operations are done just as if I owned the repository, and it's only the first time I push a branch that I have to do something different.

These may not seem like a huge deal, but for me they are, because I no longer have to switch between two different "modes": the "working on an original repo" mode and the "working on a fork of a repo" mode. When I'm working in my local clone, it makes no difference and I do the same thing either way. It's only when I need to push a new branch somewhere remote that I have to think about the difference, and that's exactly when I should be thinking about it!

This peels off one extra layer of mental load and reduces the friction involved in the whole process.

An extra thought

There can be good reasons to use the forking workflow even for repositories that you can push to, e.g. to keep from polluting the original repository with experimental branches. As the ever-insightful @katiegengler pointed out in a Discord discussion, in such cases you can follow the simplified forking workflow but clone the original repo using its HTTPS URL instead of its SSH URL (git clone https://github.com/emberjs/ember-test-helpers instead of git clone git+ssh://git@github.com/emberjs/ember-test-helpers). Provided you only have push credentials set up for SSH, this adds an extra layer of protection against accidentally pushing to the original repo instead of your fork.
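
If a clone already exists with the SSH URL, re-cloning shouldn't be necessary: git remote set-url can point origin at the HTTPS URL instead. The sketch below does this in a scratch repository (created with mktemp purely for the demo) so that it only touches local configuration and never contacts GitHub.

```shell
set -eu

# Scratch repository standing in for an existing clone.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Remotes as they would look after the simplified workflow over SSH.
git remote add origin git+ssh://git@github.com/emberjs/ember-test-helpers
git remote add bendemboski git+ssh://git@github.com/bendemboski/ember-test-helpers

# Point origin at the HTTPS URL; the fork keeps its SSH URL for pushing.
# This only helps if no HTTPS credentials with push access are configured.
git remote set-url origin https://github.com/emberjs/ember-test-helpers

# Confirm which URL each remote uses for fetch and push.
git remote -v
```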

Conclusion

I've found this tweak to the workflow to be a non-trivial simplification that noticeably improves my developer experience of periodically contributing to projects that I don't own. I love the open source model, and I love contributing back to projects that I have benefited from, so I'm always excited to find ways of reducing friction in the process to make me more likely to do it, and free up energy for the actual development work that's fun rather than the git mechanics that are...less fun. I'd love to hear what you think about this simplified workflow, or other ways you've found to reduce friction in contributing to open source projects.
