Beyond the Basics. Guide to Deeply Understand Merge, Rebase, Squash, and Cherry Pick - When to Use Them With Real-World Examples

jodaut

José David Ureña Torres

Posted on July 28, 2023

Introduction

Git is a powerful tool that helps developers work together on software projects. Created by Linus Torvalds in 2005, it has become one of the most widely used version control systems in the software development industry.

If you're new to Git, you might find it challenging to find information about when and how to use certain commands with real-world examples. Many tutorials only give you the commands, which might be enough for experienced developers but not for beginners.

Imagine you work in a small company with a few developers and no established version control practices. It's up to you to improve the development process, but there are no senior developers to guide you. Knowing Git deeply can make a big difference for you and your colleagues.

In this article, we'll explore the fundamental Git commands for combining branches and how to respond to different situations. We will cover the main differences between merging and rebasing, and when you should use each of them. We will also see how to use cherry-pick to bring specific commits into one or more branches, and why you should combine related commits into a single one to keep your branch clean. Finally, we'll discuss how to prevent some common issues when working with branches.

If you're completely new to Git, you can still read this article, but some concepts might be unclear. Feel free to come back later once you have the basics down. That being said, let's get started!

Merge or rebase, which should you use?

Choosing between merge and rebase can be tough and usually depends on your company's practices. If you have the choice, knowing the pros and cons of each method is vital for making the right decision based on your company's needs.

Git merge combines changes from one branch into another, creating a new commit with the merged changes. Git rebase, on the other hand, replays the commits of your branch on top of another branch, rewriting your branch's history in the process. To put it simply, it temporarily removes your commits, updates the branch from the specified one, and then adds your commits back at the top of the history.

Now, let's explore each method in more detail.

Consider a main branch with the commit history below:

> git checkout main
> git log --oneline
0a9c4e8 (HEAD -> main) feature3
26c4d90 feature2
8bb8ce9 feature1

The codes at the beginning of every line are called commit hashes or commit IDs, and they serve as unique identifiers for every commit. A commit ID is a 40-character hexadecimal SHA-1 hash, which can be abbreviated to a shorter prefix, typically 7 characters, as long as that prefix is unique in the repository.
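To see the full and abbreviated forms side by side, here is a minimal sketch. It builds a throwaway repository in a temporary directory; the file name and the identity settings are placeholders chosen for illustration, not part of the example above.

```shell
set -e
repo="$(mktemp -d)"   # throwaway repository, only for this demonstration
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "one" > file.txt
git add file.txt
git commit -q -m "feature1"

full="$(git rev-parse HEAD)"            # complete commit hash
short="$(git rev-parse --short HEAD)"   # abbreviated, still unique in this repository
echo "full:  $full"
echo "short: $short"

# Both spellings name the same commit, so either can be passed to Git commands.
git log -1 --oneline "$short"
```

The hashes you get will differ from run to run, because the commit timestamp is part of what gets hashed.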

Continuing with the example, you have a ticket assigned and need to create a feature4 branch with git branch feature4 and git checkout feature4. You develop your feature, add tests, and now it’s ready to be merged to main. It has the following commit history:

> git checkout feature4
> git log --oneline
dd01e58 (HEAD -> feature4) (fix) feature4
55b4a8b feature4
0a9c4e8 feature3
26c4d90 feature2
8bb8ce9 feature1

You go to GitHub and create a new Pull Request, but you find that main was updated while you were working on your branch. In other words, main is no longer the same as when you created your branch from it. It has a new commit.

> git checkout main
> git pull origin main
> git log --oneline
cf0e10a (HEAD -> main) feature5
0a9c4e8 feature3
26c4d90 feature2
8bb8ce9 feature1

This is not a problem by itself, but conflicts can arise when the destination branch has new commits that modify the same sections you were working on in your branch. In such cases, you have an alternative: instead of merging feature4 into main (main ← feature4) right away, you can first update feature4 with the changes from main (main → feature4). To achieve this, you have two options: merge or rebase.
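Before choosing either option, it can help to check how far the two branches have drifted apart. The sketch below recreates a similar situation in a scratch repository; the commit helper and the file names are hypothetical, made up only to produce commits.

```shell
set -e
repo="$(mktemp -d)"; cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main" regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"
# Hypothetical helper: append a line to a file and commit it.
commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt "feature3"
git checkout -q -b feature4
commit feature4.txt "feature4"
commit feature4.txt "(fix) feature4"
git checkout -q main
commit feature5.txt "feature5"          # main moves on while feature4 is in progress

# Commits on main that feature4 does not have yet
git log --oneline feature4..main
# Commits on feature4 that main does not have yet
git log --oneline main..feature4
# Both counts at once: "<only on main> <only on feature4>"
git rev-list --left-right --count main...feature4
```

If the first list is empty, main has not moved and there is nothing to bring into your branch.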

Merge approach

If you merge main into feature4, Git will create a new commit that combines the changes from main with yours and add it at the top of the commit history. Let’s see it in action.

git checkout feature4
git merge main

Git will open the editor you set up, which is typically Vim by default. You will see something like this:

Merge branch 'main' into feature4
# Please enter a commit message to explain why this merge is necessary,
# especially if it merges an updated upstream into a topic branch.
#
# Lines starting with '#' will be ignored, and an empty message aborts
# the commit.

For simplicity, I'll keep this default commit message. To save it, press Esc, type :wq, and press Enter. If you're unfamiliar with Vim commands, consider taking a brief tutorial on YouTube or your preferred platform.

Now check the commit history again:

> git log --oneline
96f49c0 (HEAD -> feature4) Merge branch 'main' into feature4
cf0e10a (main) feature5
dd01e58 (fix) feature4
55b4a8b feature4
0a9c4e8 feature3
26c4d90 feature2
8bb8ce9 feature1

Notice that there are two new commits, but cf0e10a is the only one that was on main. Where did the other commit come from?

When you merge a branch, Git adds the commits from the source branch and creates a new commit that combines the changes from both branches. If there are no conflicts, this commit will be automatically created. However, if conflicts occur, you must resolve them and then create the commit manually.

Remember to update your local main branch before the merge. If you don’t, there will be nothing new to merge into feature4.
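One way to convince yourself that the extra commit really is a merge commit is to look at its parents. The following is a sketch in a scratch repository, with a made-up commit helper and file names; --no-edit is used only so that no editor opens.

```shell
set -e
repo="$(mktemp -d)"; cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main" regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"
# Hypothetical helper: append a line to a file and commit it.
commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt "feature3"
git checkout -q -b feature4
commit feature4.txt "feature4"
git checkout -q main
commit feature5.txt "feature5"

git checkout -q feature4
git merge -q --no-edit main      # accept the default merge message without opening an editor

git log --oneline --graph
# A merge commit records two parents: the previous tip of feature4 and the tip of main.
git log -1 --format='%h parents: %p'
```

Regular commits list a single parent, so the two-parent line is what distinguishes the merge commit in the history.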

Now you can push your changes with git push origin feature4 and then merge it into main. For simplicity let’s do a local git merge to main:

> git checkout main
> git merge feature4
> git log --oneline
96f49c0 (HEAD -> main, feature4) Merge branch 'main' into feature4
cf0e10a feature5
dd01e58 (fix) feature4
55b4a8b feature4
0a9c4e8 feature3
26c4d90 feature2
8bb8ce9 feature1

Notice this (HEAD -> main, feature4). It means that main and feature4 now have the same commit at the HEAD. In other words, both branches point to the same commit.

Rebase approach

As mentioned before, a rebase is slightly different. It rewrites the history of a branch, adding your commits on top of the updated branch. But what does “rewrite” mean in this context? It means modifying the original commit history by creating new commits with different hashes and adding them to the commit history, though not necessarily at the top. After that, the old commit history is replaced by the new one. Let’s see an example:

> git checkout main
> git pull origin main
> git checkout feature4
> git rebase main
> git log --oneline
0c847bd (HEAD -> feature4) (fix) feature4
b86b56d feature4
cf0e10a (main) feature5
0a9c4e8 feature3
26c4d90 feature2
8bb8ce9 feature1

As you can see, cf0e10a was placed before your commits, even though that commit was created after the ones from feature4. With Git rebase, your changes are added to the top of the history without an extra merge commit. However, you may notice that the hashes of your two commits from feature4 are not the same as before, while the commits from main remain unchanged. Weird, right?

dd01e58 changed to 0c847bd

55b4a8b changed to b86b56d

This is a particularity of git rebase: it rewrites the Git history, creating new commits with your changes and discarding the originals. As a result, the new commits are not the same as the original ones. The good news is that the original author and author date are preserved; only the committer date is updated. This is the reason why some companies prefer not to rebase branches and opt for merge instead, as rebasing alters past commits.
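The effect can be observed directly by recording the hash and author date before and after a rebase. This is a minimal sketch in a scratch repository; the helper, file names, and the fixed author date are assumptions made for the demonstration.

```shell
set -e
repo="$(mktemp -d)"; cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main" regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"
# Hypothetical helper: append a line to a file and commit it.
commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt "feature3"
git checkout -q -b feature4
echo "feature4" > feature4.txt
git add feature4.txt
git commit -q -m "feature4" --date="2023-07-15T09:11:55"   # fixed author date, easy to recognize
git checkout -q main
commit feature5.txt "feature5"

git checkout -q feature4
old_hash="$(git rev-parse HEAD)"
old_date="$(git log -1 --format=%ad)"
git rebase -q main
new_hash="$(git rev-parse HEAD)"
new_date="$(git log -1 --format=%ad)"

echo "hash before:        $old_hash"
echo "hash after:         $new_hash"
echo "author date before: $old_date"
echo "author date after:  $new_date"
```

The two hashes differ because the rebased commit has a new parent, while the author date printed by %ad stays the same.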

Now that feature4 is updated, you can perform a normal merge into main. Remember that in real life, you would need to push your changes and create a Pull Request.

> git checkout main
> git merge feature4
> git log --oneline
0c847bd (HEAD -> main, feature4) (fix) feature4
b86b56d feature4
cf0e10a feature5
0a9c4e8 feature3
26c4d90 feature2
8bb8ce9 feature1

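This time, the commit hashes in both branches are identical, and there is no extra commit: because feature4 was directly ahead of main, Git only had to move the main pointer forward (a fast-forward merge). However, it's worth noting that platforms like GitLab and GitHub may behave differently and add a merge commit anyway, depending on the merge method configured for the Pull Request.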
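The difference between the two outcomes can be reproduced locally with the --ff-only and --no-ff options of git merge. The sketch below uses a scratch repository with a hypothetical commit helper; the --no-ff case is only an approximation of what a hosting platform's merge button may do.

```shell
set -e
repo="$(mktemp -d)"; cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main" regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"
# Hypothetical helper: append a line to a file and commit it.
commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt "feature5"
git checkout -q -b feature4
commit feature4.txt "feature4"
git checkout -q main

# feature4 is directly ahead of main, so this merge only moves the main pointer.
git merge -q --ff-only feature4
ff_tip="$(git rev-parse HEAD)"
git log --oneline --graph

# For comparison: undo the fast-forward, then force a merge commit.
git reset -q --hard HEAD~1
git merge -q --no-ff --no-edit feature4
git log --oneline --graph
```

With --ff-only, Git refuses to merge at all when a fast-forward is not possible, which makes it a safe default for local merges into main.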

My personal recommendation is to avoid using rebase on main or other critical branches, to preserve the original hashes for better traceability. Instead, use it for temporary branches derived from another branch, like feature branches. If your branch is going to be deleted after a merge, then you can rebase it. That is not the case for main: once your commits have been added to main, you should not overwrite them.

Also, rebase prevents a long series of commits from being scattered across the commit history and causing a lot of noise in your branch.

Cannot push my changes after a rebase

In real environments, you don't merge into main directly. In fact, you probably won't be able to merge directly into the main branch on GitHub either. Instead, you'll need to create a Pull Request (or Merge Request if you are using GitLab) to merge your changes. Assuming you had already created your branch and pushed it to the remote repository before the rebase, git push origin <branch> can now be rejected, and git status will report something similar to this:

Your branch and 'origin/<branch>' have diverged,
    ...
  (use "git pull" to merge the remote branch into yours)
nothing to commit, working tree clean

If you run git pull as suggested, Git will merge the old remote commits back into your rebased branch, which undoes the point of the rebase and can leave you with duplicated commits. You can solve it by executing this command: git push -f origin <branch>. Keep in mind that git push -f will overwrite the upstream branch, so be careful with it. Later in this article, we will explore ways to prevent loss of information.
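A somewhat safer variant of the forced push is git push --force-with-lease, which overwrites the remote branch only if it still points where your local remote-tracking branch says it does. The sketch below simulates the whole situation with a local bare repository standing in for the hosted remote; all paths, file names, and the commit helper are assumptions made for the demonstration.

```shell
set -e
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"   # stand-in for the hosted repository
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/main   # name the first branch "main" regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"
git remote add origin "$work/remote.git"
# Hypothetical helper: append a line to a file and commit it.
commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt "feature3"
git push -q origin main
git checkout -q -b feature4
commit feature4.txt "feature4"
git push -q origin feature4             # the branch now exists on the remote
git checkout -q main
commit feature5.txt "feature5"
git push -q origin main

git checkout -q feature4
git rebase -q main                      # rewrites feature4 locally

# A plain push is rejected: the remote feature4 is not an ancestor of the rewritten branch.
if ! git push -q origin feature4 2>/dev/null; then
  echo "plain push rejected (non-fast-forward)"
fi

# Overwrite the remote branch, but only if nobody else has pushed to it in the meantime.
git push -q --force-with-lease origin feature4
```

If a teammate had pushed to feature4 after your last fetch, the lease check would fail and the push would be refused instead of silently discarding their work.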

Cherry pick

Imagine that you work for a company with three different branches under development: main, legacy-v1, and legacy-v2. The naming convention for branches is just for educational purposes.

The QA team has discovered an issue with an old feature present in all three branches under development. After fixing the issue, you need to bring the changes into all three branches. But how should you do it? Copy the code to all the other branches? That's probably not a good idea.

The next logical solution that comes to mind is to create a new Pull Request to merge these changes into the other branches. However, there is a problem: the other two branches might differ significantly from the main branch, resulting in numerous conflicts. In such cases, it could be impossible to merge them cleanly. This is where cherry-pick comes in.

This command allows you to append arbitrary commits from a branch to the HEAD of another branch, without the need for a full merge.

The syntax is the following:

git cherry-pick <commit-sha>

First, you need to find the commit that you want to backport to the other branches. You can use git log for that.

> git log --oneline                 
8d84e7b (HEAD -> main) (fix) feature 1
222159b feature 4
32d540d feature 3
718f740 feature 2
78662d0 feature 1

ℹ️ The --oneline flag shows only minimal information. I usually recommend using git log --stat to get more complete information about your commits.

Suppose that 8d84e7b (short version) or 8d84e7b6bf5ed8a4fee96dfa820e544567542785 is the commit hash that you want to backport to the other two branches.

You only have to switch to the other branch and execute the command.

> git checkout legacy-v1
> git cherry-pick 8d84e7b6bf5ed8a4fee96dfa820e544567542785

Now the commit has been appended successfully.

> git log --oneline
0c41007 (HEAD -> legacy-v1) (fix) feature 1
9689a1c feature 5
222159b feature 4
32d540d feature 3
718f740 feature 2
78662d0 feature 1

Repeat the process for legacy-v2.

> git checkout legacy-v2
> git cherry-pick 8d84e7b6bf5ed8a4fee96dfa820e544567542785
> git log --oneline
73deb0b (HEAD -> legacy-v2) (fix) feature 1
0b67041 feature 6
222159b feature 4
32d540d feature 3
718f740 feature 2
78662d0 feature 1

Notice the commit hashes of legacy-v1 and legacy-v2 are different from main. This is because during the cherry-pick, Git created a new commit from 8d84e7b and appended it to the destination branch.
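Because the hashes no longer match, it can be useful to record where a cherry-picked commit came from. The -x option of git cherry-pick appends a "(cherry picked from commit ...)" line to the new commit message. Here is a minimal sketch in a scratch repository; the branch contents, file names, and commit helper are invented for the example.

```shell
set -e
repo="$(mktemp -d)"; cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main" regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"
# Hypothetical helper: append a line to a file and commit it.
commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

commit app.txt "feature 1"
git branch legacy-v1                    # legacy-v1 forks from main here
commit other.txt "feature 4"
commit app.txt "(fix) feature 1"        # the fix we want to backport
fix_hash="$(git rev-parse HEAD)"

git checkout -q legacy-v1
commit legacy.txt "feature 5"
git cherry-pick -x "$fix_hash" >/dev/null   # -x records the original hash in the message

git log --oneline
git log -1 --format=%B                  # message ends with "(cherry picked from commit <hash>)"
```

Only the fix is brought over: other.txt, which was added by "feature 4" on main, does not appear on legacy-v1.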

Is Cherry Pick a bad practice?

Some developers, especially the most purist Git users, will not recommend using this command and will prefer a merge over a cherry-pick. Most likely, they do not consider the cherry-pick command bad per se; rather, the way you use it can be bad for your repository.

I like this comment that was posted on a Stack Overflow question:

The SHA1 identifier of a commit identifies it not just in and of itself but also in relation to all other commits that precede it. This offers you a guarantee that the state of the repository at a given SHA1 is identical across all clones. There is (in theory) no chance that someone has done what looks like the same change but is actually corrupting or hijacking your repository. You can cherry-pick in individual changes and they are likely the same, but you have no guarantee. (As a minor secondary issue the new cherry-picked commits will take up extra space if someone else cherry-picks in the same commit again, as they will both be present in the history even if your working copies end up being identical.) (Git Cherry-pick vs Merge Workflow, 2017)

In my opinion, this command is very useful when you have to maintain different branches under development that differ a lot between them, and merging is not an option anymore. If you can merge without problems, then do it, but if not, don't worry and just use cherry pick.

However, you have to be very careful if your commit is complex, modifies different files, and causes merge conflicts. It's recommended to have good testing coverage that helps you detect quickly if something breaks.

Git squash

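The last technique that we are going to cover in this article is squashing. Git has no standalone squash command; squashing is done with git merge --squash or with an interactive rebase, and both are shown below. It can help you keep your main branch clean if you created a lot of small commits.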

Imagine that you just finished working on a feature and notice that your branch has a lot of commits with small changes made during the days you worked on it. For you they make sense, but for your colleagues they are just random commits. Let’s see an example of a group of commits created during the development of a payment feature.

b86b56d Add initial HTML structure for payment checkout page.
a5e392f Implement basic CSS styles for payment form.
cf872e2 Fix alignment issues in the payment summary section.
0c31e2d Add credit card input validation for card number and expiration date.
d6c59a7 Implement UI for selecting payment method (credit card, PayPal, etc.).
e8af221 Add CSS transitions for a smoother payment form experience.
3e987b8 Fix bug with PayPal payment option not showing up in the UI.
95c67f2 Update button styles for the payment confirmation step.
713bd64 Integrate backend API for processing payment requests.
2d344ef Update success message for completed payments.

You will probably agree with me that, in the long run, those small related commits are not important when seen individually. In this case you should squash them. Squashing combines a group of commits into a single commit with a more meaningful name for your team. Those commits can be summarized into a single descriptive commit as follows:

b86b56d Implement new payment checkout UI with improved user experience, payment method selection, validation, and backend integration.

It sounds better, right? It’s very common for companies to aim for a single commit per feature in their main branch, because it makes it easier to roll back if something goes wrong, or to add tags to new releases.

Let’s see another example.

You have two branches: main and feature4. The following is the log history of both branches:

> git checkout main
> git log --oneline
d303a7a (HEAD -> main) feature5
0a9c4e8 feature3
26c4d90 feature2
8bb8ce9 feature1
> git checkout feature4
> git log --oneline
0c847bd (HEAD -> feature4) (fix) feature4
b86b56d feature4
d303a7a (main) feature5
0a9c4e8 feature3
26c4d90 feature2
8bb8ce9 feature1

In this example the feature4 branch only has two commits, but there could be many more. Also, note that the last commit seems to refer to some kind of fix or improvement made to the code, or a suggestion you received from a coworker.

These commit messages may not be relevant in the big picture of the product. The only thing that matters is that feature4 is ready to be merged into main. To squash these commits you can use rebase or merge.

Suppose main has received a new commit (feature6) in the meantime. The two branches now have the following commit history:

> git checkout main
> git log --oneline
159678e (HEAD -> main) feature6
d303a7a feature5
0a9c4e8 feature3
26c4d90 feature2
8bb8ce9 feature1
> git checkout feature4
> git log --oneline
0c847bd (HEAD -> feature4) (fix) feature4
b86b56d feature4
d303a7a feature5
0a9c4e8 feature3
26c4d90 feature2
8bb8ce9 feature1

Squash with merge

This command is quite similar to the traditional merge process. The only difference is that you add the --squash flag. This will add the changes from feature4 to the index (as if you had just written all the changes yourself and executed git add). Then, you have to commit the changes with a meaningful message to finalize the merge process. See the example below:

> git checkout main
> git merge --squash feature4
> git commit -m "feature4"

If you have merge conflicts, you have to resolve them before committing.

After the merge, a new commit containing the changes from the other two commits will have been created as follows:

> git log --oneline
cda90a7 (HEAD -> main) feature4
159678e feature6
d303a7a feature5
0a9c4e8 feature3
26c4d90 feature2
8bb8ce9 feature1

Notice that the new commit was added on top of feature6, even though feature6 was added after the last commit of feature4.
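The staging step and the shape of the resulting commit can be checked in a scratch repository. The sketch below assumes a made-up commit helper and file names; it shows that the squash commit has a single parent, unlike a regular merge commit.

```shell
set -e
repo="$(mktemp -d)"; cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main" regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"
# Hypothetical helper: append a line to a file and commit it.
commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt "feature5"
git checkout -q -b feature4
commit feature4.txt "feature4"
commit feature4.txt "(fix) feature4"
git checkout -q main
commit feature6.txt "feature6"

git merge -q --squash feature4   # stages the combined changes but does not commit
git status --short               # feature4.txt shows up as staged
git commit -q -m "feature4"

git log --oneline
# Single parent: the individual feature4 commits are not part of main's history.
git log -1 --format='%h parents: %p'
```

One consequence of the single parent is that Git does not consider feature4 merged, so deleting the branch afterwards requires git branch -D rather than -d.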

Squash with rebase

To do this you need to add the -i flag to your rebase as follows:

> git checkout feature4
> git rebase -i main

Git will open Vim. You should see something similar to this:

pick b86b56d feature4
pick 0c847bd (fix) feature4

# Rebase 159678e..0c847bd onto 159678e (2 commands)
#
# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
# e, edit <commit> = use commit, but stop for amending
# s, squash <commit> = use commit, but meld into previous commit
# f, fixup [-C | -c] <commit> = like "squash" but keep only the previous
#                    commit's log message, unless -C is used, in which case
#                    keep only this commit's message; -c is same as -C but
#                    opens the editor
# x, exec <command> = run command (the rest of the line) using shell
# b, break = stop here (continue rebase later with 'git rebase --continue')
# d, drop <commit> = remove commit
# l, label <label> = label current HEAD with a name
# t, reset <label> = reset HEAD to a label
# m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
#         create a merge commit using the original merge commit's
#         message (or the oneline, if no original merge commit was
#         specified); use -c <commit> to reword the commit message
# u, update-ref <ref> = track a placeholder for the <ref> to be updated
#                       to this position in the new commits. The <ref> is
#                       updated at the end of the rebase
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#

Don’t panic. All the lines starting with # are just instructions on what you can do with this utility. The only one that matters for us right now is the squash option, or its abbreviation s. You have to replace the word “pick” with “squash” on every line, starting with the second one from the top, as follows:

pick b86b56d feature4
squash 0c847bd (fix) feature4

All the lines marked with squash will be combined into the previous line that has the word pick.

To save these changes, press Esc, type :wq, and press Enter. Git will open Vim again, but this time asking you for a commit message.

# This is a combination of 2 commits.
# This is the 1st commit message:

feature4

# This is the commit message #2:

(fix) feature4

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# Date:      Sat Jul 15 09:11:55 2023 -0600
#
# interactive rebase in progress; onto 159678e
# Last commands done (2 commands done):
#    pick b86b56d feature4
#    squash 0c847bd (fix) feature4
# No commands remaining.
# You are currently rebasing branch 'feature4' on '159678e'.
#
# Changes to be committed:
#       new file:   file2.txt
#

Add a meaningful message. In this case, for the purpose of this tutorial, feature4 is enough. You can delete lines by pressing dd, or d<number-of-lines-to-delete>d to delete several at once.

After that, type :wq again to save the changes.

> git log --oneline
62b8724 (HEAD -> feature4) feature4
159678e (main) feature6
d303a7a feature5
0a9c4e8 feature3
26c4d90 feature2
8bb8ce9 feature1

As you can see, Git created a new commit and added it at the top of the log history. Notice that its hash is not the same as that of the commit you marked with pick, even though that commit is the one the squashed commits were melded into:

b86b56d (before the rebase)

62b8724 (after the rebase)

That is because a commit hash is computed from the commit's content and metadata, so Git cannot reuse a hash for a commit that now contains different changes. In this case, Git is forced to save all the changes as a new commit with a different hash. By requiring an initial "pick" commit, Git establishes a clear reference point for the rebase operation. This ensures that the rebase proceeds in an organized and predictable manner, without any ambiguity about the order of commits.
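The same squash can be scripted, which is handy for trying it out without an editor. The sketch below uses the GIT_SEQUENCE_EDITOR environment variable to edit the todo list with sed, and GIT_EDITOR=true to accept the combined commit message unchanged. The scratch repository, commit helper, and file names are assumptions for illustration only.

```shell
set -e
repo="$(mktemp -d)"; cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main" regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"
# Hypothetical helper: append a line to a file and commit it.
commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt "feature5"
git checkout -q -b feature4
commit feature4.txt "feature4"
commit feature4.txt "(fix) feature4"
git checkout -q main
commit feature6.txt "feature6"

git checkout -q feature4
before="$(git rev-parse HEAD~1)"   # the commit that will be marked "pick"

# sed turns the second "pick" of the todo list into "squash";
# "true" leaves the proposed commit message as it is.
GIT_SEQUENCE_EDITOR='sed -i.bak "2s/^pick/squash/"' GIT_EDITOR=true git rebase -q -i main

git log --oneline
echo "picked commit before: $before"
echo "squashed commit now:  $(git rev-parse HEAD)"
```

In day-to-day work the interactive editor is usually more convenient; the scripted form is mainly useful for experiments and automation.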

Now you can switch to main and perform a regular merge as follows:

> git checkout main
> git merge feature4
> git log --oneline
62b8724 (HEAD -> main, feature4) feature4
159678e feature6
d303a7a feature5
0a9c4e8 feature3
26c4d90 feature2
8bb8ce9 feature1

As it is a regular merge, the 62b8724 commit was kept. This also means that main and feature4 now point to the same commit.

What happens if I messed up a branch when rebasing or merging?

The best way to handle these situations is prevention. If you are going to make important changes to your code base, it is recommended to do them on your local computer if possible. That way, if you damage your branch, you can always download it again from the remote repository as follows:

> git checkout main
> git branch -D feature4 # delete the damaged local feature4 branch
> git fetch origin # refresh the remote-tracking branches
> git checkout -b feature4 origin/feature4 # recreate it from the remote copy

Another alternative is to back up the branches that will be affected before executing risky commands. Continuing with the previous example, you can back up your main and feature4 branches before any operation that could affect them irreversibly.

This option is preferred if you are planning to push your changes to your remote repository, because if you execute git push -f origin <branch> you will overwrite the remote branch and could lose information if you don’t know what you are doing. Backing up a branch is as simple as the following command, and it can save you a lot of headaches:

> git branch feature4-backup feature4

By doing that, if you accidentally delete or damage your local and remote branches, you can execute these commands to restore the branch:

> git checkout -B feature4 feature4-backup # create feature4 from the backup, or reset it if it still exists
> git push -f --set-upstream origin feature4 # -f only matters if the remote branch has diverged
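The full backup-and-restore cycle can be rehearsed safely in a scratch repository. In this sketch the damage is simulated with git reset --hard, and the branch is restored with git checkout -B, which resets the branch even when it still exists; the commit helper and file names are hypothetical.

```shell
set -e
repo="$(mktemp -d)"; cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main" regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"
# Hypothetical helper: append a line to a file and commit it.
commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

commit base.txt "feature5"
git checkout -q -b feature4
commit feature4.txt "feature4"
commit feature4.txt "(fix) feature4"

git branch feature4-backup feature4      # cheap: a branch is just a pointer to a commit
git reset -q --hard HEAD~2               # simulate damaging feature4
git log --oneline                        # only "feature5" is left

# Point feature4 back at the backup; -B resets the branch if it already exists.
git checkout -q -B feature4 feature4-backup
git log --oneline                        # all three commits are back
```

Once you are sure the risky operation went well, the backup can be removed with git branch -D feature4-backup.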

A third alternative is to save the commit hash instead of backing up the entire branch. Because of the way Git stores history, you can still access commits by hash even after deleting the branch they belonged to, at least until Git garbage-collects them. Consider a branch called feature8 with the following commits:

> git checkout feature8
> git log --oneline
407f3e2 (HEAD -> feature8) feature8
309129e (fix) feature7
9ec402e feature7
62b8724 feature4
159678e feature6
d303a7a feature5
0a9c4e8 feature3
26c4d90 feature2
8bb8ce9 feature1

You can save these commit hashes in a file that you can access later. That way, if you delete your branch, you can restore it from the commit. Imagine that you delete some commits by accident using these commands:

> git reset --hard HEAD~3
> git push -f origin feature8

Now the log history looks like this:

62b8724 (HEAD -> feature8) feature4
159678e feature6
d303a7a feature5
0a9c4e8 feature3
26c4d90 feature2
8bb8ce9 feature1

Although it seems that you lost all your work, you still have a chance to restore it with the commit hash that you previously saved. Try these commands:

git checkout feature8
git merge 407f3e2

Now your commit history is restored.

> git log --oneline
407f3e2 (HEAD -> feature8) feature8
309129e (fix) feature7
9ec402e feature7
62b8724 feature4
159678e feature6
d303a7a feature5
0a9c4e8 feature3
26c4d90 feature2
8bb8ce9 feature1

Don’t forget to push your changes to the remote repository to repair your remote branch as well.
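The recovery by saved hash can also be tried end to end in a scratch repository. In the sketch below the hash is kept in a shell variable rather than a file, and --ff-only is added to the merge as a safety net, since the damaged branch is an ancestor of the saved commit; the commit helper and file names are invented for the demonstration.

```shell
set -e
repo="$(mktemp -d)"; cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main" regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"
# Hypothetical helper: append a line to a file and commit it.
commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

commit a.txt "feature4"
git checkout -q -b feature8
commit b.txt "feature7"
commit b.txt "(fix) feature7"
commit c.txt "feature8"

saved="$(git rev-parse HEAD)"     # in practice, write this hash somewhere safe
git reset -q --hard HEAD~3        # the accident: three commits disappear from the branch
git log --oneline                 # only "feature4" is left

# The old commits still exist in the object database, so the branch can move forward to them.
git merge -q --ff-only "$saved"
git log --oneline                 # all four commits are back
```

This works as long as the unreferenced commits have not been garbage-collected yet, so it is best not to wait too long before restoring.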

ℹ️ There are more techniques like git reflog. However, they are not covered in this tutorial.

Conclusions

Throughout this article, we have explored essential Git commands and their practical applications. We covered the fundamental differences between merging, rebasing, cherry-picking and squashing.

It's important to choose the right approach depending on your team's practices and the specific project needs. By combining deep knowledge of Git with careful consideration of the project's requirements, you can significantly improve the quality of the development workflow. Each approach has its use cases, and knowing when to use them can save you from future problems.

This tutorial is a collection of lessons and real-world examples that I would have loved to read when I was learning Git. Readings like this one would have made my learning process more robust, providing more context not only on how but also on when I should use all the commands I learned.

Bibliography

Git - Book | 2023 | Git Docs

Stack Overflow | 2013 | Git cherry pick vs rebase

Stack Overflow | 2017 | Git Cherry-pick vs Merge Workflow

Atlassian Bitbucket | 2023 | Merging vs. Rebasing

Atlassian Bitbucket | 2023 | Git Cherry Pick
