Keep a readable Git history


Readability of code is important. As the saying goes:

Code is read more than it is written

This is so true. Even when you are writing code, you need to read the existing code base again and again. Back in the ancient programming era, people were still using forbidden black magic - goto statements. Powered by that black magic, there was a demon called spaghetti code, and it looked like this

(From http://en.wikipedia.org/wiki/File:Spaghetti.jpg under Creative Commons 2.0 license)

Spaghetti code is code that is hard to read and maintain, and it was killing countless developers. Then a brave developer invented a new form of magic - structured programming. Eventually, the demon was defeated, and the black magic has been forbidden ever since.

This story tells us that readability is important, but what about the readability of Git commit history? There are times when we need to look into the development history: Which ticket do these commits belong to? Who is the author? When were the changes introduced? Although there are tools to help, sometimes you still need to read the history yourself, and it is just unreadable and hard to understand. I bet you have seen much worse history than this one

It makes reading painful. Although we read development history less often than we read code, it is still very helpful to have a clean, readable, linear history. Today, I am going to share some experience about keeping a readable history.

Use SourceTree

It is never pleasant to use a command line tool when there is a nice GUI tool. I hate ASCII Git history graphs, they are just ugly. Luckily, we have an awesome free GUI tool to use - SourceTree.

Always create a new branch for ticket

When you are working on a ticket or issue, you should always create a branch for it.

You should try to keep the commits in the branch only for resolving that ticket. It is okay to have some typo corrections or minor changes in it. However, if you put unrelated commits with major changes into the branch, other developers cannot easily know that the branch contains off-topic changes. Keeping a branch for only one purpose makes it

  • Easier to understand what this branch is for
  • Easier to reverse changes introduced by this branch

Here you are, working on a new branch, and you can commit

After a while, you have several commits and they are ready to merge

We want to keep the branch in the history, so remember to use a non-fast-forward merge: check the Do not fast-forward when merging, always create commit option

It’s time to merge. First, right click on the master branch and click Checkout. You should be on the master branch now. Then, right click on the new-feature branch and click Merge.

Remember to check Commit merged changes immediately to make a new commit directly.

Whoa, here we are, a beautiful linear history still with branch information.
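If you prefer the command line, the same flow can be sketched roughly like this. It is only a minimal sketch: the temporary repository, the placeholder identity and the file name feature.txt are assumptions made up for the example, while the branch name new-feature comes from the steps above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity (an assumption) so the sketch runs without Git being configured.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo="$(mktemp -d)"   # throwaway repository for the demo
cd "$repo"
git -c init.defaultBranch=master init -q
git commit -q --allow-empty -m "Initial commit"

# One branch per ticket.
git checkout -q -b new-feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add new feature"
echo "more" >> feature.txt
git commit -q -a -m "Polish new feature"

# Merge without fast-forward so the branch stays visible in the history.
git checkout -q master
git merge -q --no-ff -m "Merge branch 'new-feature'" new-feature

# Show the resulting graph.
git log --graph --oneline
```

The --no-ff flag plays the role of the Do not fast-forward when merging option: Git records a merge commit even though master could simply have been moved forward.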

Always rebase before merge

Now, you are writing the next awesome feature - foobar-2000!

Things go well. However, in the meantime, a new branch from another developer’s repo is merged. Oh my god, foobar 3000! Awesome!

Okay, let’s see what it looks like to merge it directly

Ugly. Let’s try something better - rebase. First, right click on foobar-2000 and click Checkout. Then right click on master and click Rebase

This is better! And we can merge it like before
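For reference, here is a rough command line sketch of rebasing before merging. The branch names foobar-2000 and foobar-3000 follow the story above; the temporary repository, the placeholder identity and the file names are assumptions for the example only.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity (an assumption) so the sketch runs without Git being configured.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo="$(mktemp -d)"   # throwaway repository for the demo
cd "$repo"
git -c init.defaultBranch=master init -q
git commit -q --allow-empty -m "Initial commit"

# You start foobar-2000 from master.
git checkout -q -b foobar-2000
echo "2000" > foobar-2000.txt
git add foobar-2000.txt
git commit -q -m "Add foobar-2000"

# Meanwhile, foobar-3000 gets merged into master.
git checkout -q master
git checkout -q -b foobar-3000
echo "3000" > foobar-3000.txt
git add foobar-3000.txt
git commit -q -m "Add foobar-3000"
git checkout -q master
git merge -q --no-ff -m "Merge branch 'foobar-3000'" foobar-3000

# Rebase foobar-2000 onto the new master first, then merge as before.
git checkout -q foobar-2000
git rebase -q master
git checkout -q master
git merge -q --no-ff -m "Merge branch 'foobar-2000'" foobar-2000

git log --graph --oneline
```

Because foobar-2000 is replayed on top of the latest master before merging, the two branches sit one after another in the graph instead of crossing each other.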

Rebase and force push

As usual, you keep working on this nice and beautiful linear history. However, you won’t feel safe leaving your commits only on your local machine, will you? We always push our working branch to GitHub to keep it safe and to get reviews and feedback from others

Yes, again, you may hate this: another branch has been merged into master.

Okay, you say, this is not a big deal, I can always rebase and merge as usual. Here you rebase

Well, it is still under development, so you want to push it to your fork but not merge it yet. Then you push, and oops!

So what just happened?

As you can see, there is a branch origin/foobar-bugfix. That is the head of the branch in your origin remote, which is your GitHub fork repo. The rebase rewrote your local commits, so your local foobar-bugfix no longer contains the remote one, and pushing it would overwrite the remote branch. It is a little bit dangerous to overwrite a branch head in a Git repo, so Git doesn’t allow you to do this by default.

Again, it has risks, so you need to be sure what you are doing (the overwritten commits will still be stored in the repo, but without a branch head pointing at them you cannot find them easily, and you will need some low-level operations to get them back). In this case, we just want to rebase our commits on master and push them to our own repo, which won’t be a big problem in most cases. It appears SourceTree doesn’t support --force push, so you need to click the Terminal button. Then type

git push origin foobar-bugfix -f

This will force Git to push your local branch and overwrite the remote one. Let’s see

$ git push origin foobar-bugfix -f
Counting objects: 11, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (6/6), done.
Writing objects: 100% (9/9), 809 bytes, done.
Total 9 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (9/9), done.
To /Users/fangpenlin/foobar_fork
 + 178c9a4...cc5d760 foobar-bugfix -> foobar-bugfix (forced update)

Here we are

(Tip: you can click Repository and Refresh Remote Status to update your Git history status in the UI)

Note: when you are the only one working on the branch, it is fine to do a force push. Otherwise, be careful. For more details, please refer to http://git-scm.com/book/ch3-6.html#The-Perils-of-Rebasing
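To see the whole rejected-then-forced push in one place, here is a small sketch you can run locally. It assumes a local bare repository standing in for the GitHub fork, plus a placeholder identity and made-up file names; only the branch name foobar-bugfix and the push command come from the text above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity (an assumption) so the sketch runs without Git being configured.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work="$(mktemp -d)"
# A local bare repository plays the role of the GitHub fork (an assumption for the demo).
git -c init.defaultBranch=master init -q --bare "$work/fork.git"
git -c init.defaultBranch=master init -q "$work/local"
cd "$work/local"
git remote add origin "$work/fork.git"
git commit -q --allow-empty -m "Initial commit"
git push -q origin master

# Work on the bugfix branch and push it to the fork.
git checkout -q -b foobar-bugfix
echo "fix" > bugfix.txt
git add bugfix.txt
git commit -q -m "Fix foobar bug"
git push -q origin foobar-bugfix

# Meanwhile, master gets a new commit, so you rebase the bugfix branch on it.
git checkout -q master
echo "other" > other.txt
git add other.txt
git commit -q -m "Other work lands on master"
git checkout -q foobar-bugfix
git rebase -q master

# A plain push is rejected, because the rebased branch no longer contains
# the commits that are on the remote branch.
if git push -q origin foobar-bugfix 2>/dev/null; then
  echo "unexpected: plain push succeeded"
else
  echo "plain push rejected (non-fast-forward)"
fi

# The force push overwrites the remote branch with the rebased one.
git push -q origin foobar-bugfix -f
```

If other people might push to the same branch, git push --force-with-lease is a more careful variant: it refuses to overwrite the remote branch when it has moved since you last fetched it.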

Always rebase current developing branch when there are new commits

As you know, there can be conflicts when you merge or rebase. The more new commits there are in the master branch, the more likely you are to have tons of conflicts. So, it is a good practice to always rebase your working branch on master when there are new commits on it.

However, sometimes you have some work on the branch that is not committed yet, and you don’t want to commit something half-done like this. But when you rebase, Git won’t allow you to have uncommitted changes to files in the working tree. In this case, you can use Stash. Click Repository and Stash Changes.

Then you can see your stash appear in the sidebar

After you finish rebasing, you can right click on the stash and click Apply Stash, and there you are. Your saved changes are back in the working tree.

Again, happy ending :D
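The stash-then-rebase routine looks roughly like this on the command line. Again, this is just a sketch: the temporary repository, the placeholder identity and the file names notes.txt and other.txt are made up for the example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity (an assumption) so the sketch runs without Git being configured.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo="$(mktemp -d)"   # throwaway repository for the demo
cd "$repo"
git -c init.defaultBranch=master init -q
git commit -q --allow-empty -m "Initial commit"

# Your working branch with one commit.
git checkout -q -b foobar-2000
echo "draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# New commits show up on master.
git checkout -q master
echo "other" > other.txt
git add other.txt
git commit -q -m "Other work lands on master"

# Back on your branch, with half-done work you don't want to commit yet.
git checkout -q foobar-2000
echo "work in progress" >> notes.txt

# Put the uncommitted changes away, rebase, then bring them back.
git stash
git rebase -q master
git stash pop   # applies the stash and drops it; "git stash apply" would keep it in the list

git status --short
```

After the last step, notes.txt shows up as modified again, on top of the rebased branch.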

Use interactive rebase to clean dirty commits

People make mistakes. Sometimes there are small commits that only fix formatting or typos. And these commits are all based on your own newly submitted commits.

In this case, you might wonder: wouldn’t it be nice to use some black magic to make your stupid mistakes disappear? Well, yes, there is magic. You can use interactive rebase to squash some commits into the previous one. Now, you are on the awesome branch. Right click on the master branch, then click Rebase children of xxx interactively. Then you will see an interface like this

Select those mistake-fixing commits and click Squash with previous. You will see multiple commits put together. You can also click Edit message to modify the commit message of the squashed commit.

Then press OK and look at this!

Just like what he said in Mad Men, nothing happened!

This is actually even more powerful: you can rearrange the order of commits, edit commit messages, and delete specific commits. But be careful, like what Spider-Man told you

With great power comes great responsibility

It is a kind of history rewriting. It is fine to use it on a branch that only you are working on, but you should not use it on a shared branch unless you know exactly what you are doing.
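On the command line, the interactive rebase opens an editor with the list of commits, which is hard to show in a script. So here is a non-interactive sketch of the same idea: it marks the typo fix with git commit --fixup and lets git rebase -i --autosquash fold it into the commit it fixes. Setting GIT_SEQUENCE_EDITOR=true only accepts the proposed list as it is, without opening the editor. Note that a fixup discards the fixing commit's message, while a squash in the editor lets you combine the messages. The branch name awesome comes from the text; the temporary repository, the placeholder identity and the file hello.txt are assumptions for the example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder identity (an assumption) so the sketch runs without Git being configured.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo="$(mktemp -d)"   # throwaway repository for the demo
cd "$repo"
git -c init.defaultBranch=master init -q
git commit -q --allow-empty -m "Initial commit"

# A commit with a typo on the awesome branch.
git checkout -q -b awesome
echo "helo world" > hello.txt
git add hello.txt
git commit -q -m "Add hello"

# The typo fix, recorded as "fixup! Add hello".
echo "hello world" > hello.txt
git commit -q -a --fixup=HEAD

# Fold the fixup into the commit it fixes, accepting the todo list unchanged.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash master

# Only "Add hello" is left on top of master.
git log --oneline master..awesome
```

To rearrange, reword or delete commits by hand instead, run git rebase -i master without GIT_SEQUENCE_EDITOR and edit the list in your editor.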

The benefits of readable history

That’s it, the history is linear, another wonderful day.

Readable history doesn’t only look beautiful, it also provides an easy-to-understand development history. All team members in the project can follow the development progress easily. When something goes wrong, it is also easier to track down the problem, especially when you need to fix it as soon as possible.
