Blog

Git Branching and Merging

by -

tags:git

If you’re new to Git or want to know a little more about its basics and its distributed model, do not hesitate to read the Introduction to Git article.

Branching and merging are great features, but to be really effective their implementations must be optimal, and Git excels here. The way it implements and exposes it for us is very simple, powerful, and fast to use. But the branching concept, and the way it works on Git, may look a little complex in the beginning. In this article I’ll try to syntethize the way it works and try to give you a good explanation about it, so you can get advantage of its power.

Ok. Let’s start by explaining what is branching.

What is Branching?

“Branching means you diverge from the main line of development and continue to do work without messing with that main line.” (Chacon 47)

So it means you create a new branch to diverge from the main branch.

Why Branching?

Branching has several purposes and it’s a very important feature, but you should know how to use it in your development cycle. That’s out of the scope of this article, but to give you an example you can think that your project needs a change that can break things up, so you decide to create a new branch to work on that changes. That way you can go back to the main branch (usually master) at any time, and do any modifications, even releasing a new version, without worrying about breaking changes isolated in the other branch. Alternatively, you can create a new branch to experiment new things, etc. Sounds good? It really is.

The Master Branch and HEAD

When you run $ git init or $ git clone commands to create a new repository, Git automatically creates the master branch and points the development to it. Git has a special pointer to know what is the current branch, and it’s called HEAD. It’s different than the concept of HEAD in other VCSs. HEAD in Git means the last snapshot of the current branch.

The Current Branch

Now, it’s important to say that nearly every concept we’ve learned so far is related to the current branch. So, modified and staged files, working directory, staging area, etc, are all related to the current branch (one exception is Git directory, which is the same for all branches of the project). Each branch has those concepts in an isolated way. Another detail is that when you switch between branches, your working directory changes to reflect the files (and their versions) in the target branch.

Starting a New Repository from Scratch

(If you don’t know the basic commands please read Git Basic Commands Explained article.)

To illustrate, let’s imagine a scenario: you just started a new repository, have made some changes and three commits in master branch:

Git Branching - Image 01
Figure 1. Git repository with only one branch (master) pointing to it (HEAD), and with only three commits ("f30ab" is the last commit). This image was taken from Pro Git book and edited (respecting its Creative Commons License).

So, to Git, what really is a branch? Just a lightweight movable pointer to one of its commits (Chacon 48). Every time you commit, the current branch moves forward automatically to that commit. And every commit has one parent commit (except the first one), represented by the arrows on Figure 1.

When you create a new branch, it starts pointing to the last commit of the current branch, i.e., the branch you are in at the moment you create the new branch. So, immediately after creating a new branch, it’s pointing to the same commit of the branch you were in (which means the working directory of the new branch will be the same as the branch you were in).

Creating a New Branch

Let’s create a new branch (testing), change the pointer to it, make some changes and commit.

To be able to switch between branches you cannot have uncommited changes in your working directory or stagging area.

To create a new branch you run:

$ git branch testing

But that doesn’t point to it (HEAD pointer). So you need to run:

$ git checkout testing

After that, you are in the testing branch. So, let’s imagine you made some changes and commited. This is the repository state:

Git Branching - Image 02
Figure 2. Git repository with two branchs (master and testing) pointing to testing (HEAD), and with three commits in master and one in testing. This image was taken from Pro Git book (respecting its Creative Commons License).

Merging

Usually, at some point, you’ll want to merge two branches if you was successful in your changes / improvements. So you can get back to the original source of your project (e.g. master branch). The way Git handles merging is very nice too, but it depends on the state of your files. Next you’ll see those possibilities.

Fast Forward Merge

In the current state of our repository, if you want to merge both branches, it would be the simplest kind of merge that can exist. Because you didn’t make any changes in master branch, running merge command just moves forward master to point to the last testing commit, which in this case is the next commit, but you could have made n commits. To merge branches, you have to go to the branch that you want to merge into (master in this case, which will receive changes from testing) and run the merge command:


$ git checkout master
$ git merge testing

You’ll notice a “Fast forward” feedback, indicating a simple forward. After that, those two branches point to a common commit again (as right after you created the testing branch). At this point you could delete the testing branch with no problem, by running:

$ git branch -d testing

That’s the simplest merge and will never arise a conflict. But you want to be prepared for the real world, right? So you can be sure that conflicts will arise at some point! ;-)

Three-Way Merge

To simulate another possible merge, a more complex one, let’s forget the merge above. Let’s pretend we didn’t merge, ok? Good…

Let’s go back to master branch ($ git checkout master) (remember: when you switch between branches, your working directory changes too). Let’s then make a change and commit. So this is the repository state now:

Git Branching - Image 03
Figure 3. The last two commits (one on each branch) are isolated in separate branches. This image was taken from Pro Git book (respecting its Creative Commons License).

Now we have isolated commits in both branches. Trying to merge now will give some work to Git. If no conflicts arise (i.e. if there are no changes in the same part of the same files) Git can do everything automatically.

So, in this scenario, what will Git do after we run merge command?

Try to merge!

Git does a simple three-way merge, using the last snapshot (commit) of each branch plus the common ancestor of both (“f30ab” in Figure 3). To do that, Git creates a new snapshot (commit) that is refered to as a merge commit, and it’s special because it has more than one parent (Chacon 57):

Git Branching - Image 04
Figure 4. Git automatically creates a new commit containing merged changes. Note that the new commit ("09b8e") has two parents. This image was taken from the Pro Git book and edited (respecting its Creative Commons License).

You’ll notice a “Merge made by recursive.” feedback, indicating a more complex merging. After that, you can delete the testing branch if you don’t plan working on it in the future.

Maybe you’re thinking now: “What about conflicts?”.

Resolving Conflicts

Everything is good when things go right. But don’t worry, the way Git handles conflicts is also very elegant.

Let’s suppose you’ve changed the same lines of a text file in both branches and tried to merge. Git gives you a feedback like this:


Auto-merging file-name
CONFLICT (content): Merge conflict in file-name
Automatic merge failed; fix conflicts and then commit the result.

If you run $ git status now, Git will show you an Unmerged paths section listing files that are in conflict state. And at this time Git doesn’t created a new merge commit.

But this kind of conflict should be simple to fix. If you open the file, you should see something like this:


<<<<<<< HEAD
This section was changed by master branch.
=======
This section was changed by testing branch.
>>>>>>> testing

Everything that is below << HEAD and above == is the HEAD version of the file (the last snapshot of the current branch when you run merge), in this case, master version. And everything above >> testing and below == is the version of the file on the other branch (testing, in this case).

To fix the conflict you need to remove those Git markers, manually. Also, if you prefer you can run $ git mergetool to open a visual merge tool.

After removing markers, you need to tell Git that the conflict was fixed. You do that running add command:

$ git add file-name

Now you can commit your changes, and the merge is done.

Conclusion

As we could see, the way Git handles branching and merging is very elegant and fast, you do everything on your local repository.

In the same way you’ve created the testing branch, you can create several other branches, in any direction. The master branch can have any number of child branches, and a child branch (like testing in this example) can also have any number of child branches, and so on… To merge them all, the common approach is to merge a branch into its parent one.

So, that’s it. Now you better understand about local branches. In the next article I’ll talk about remote branches and tracking branches. Don’t worry, now that you understand the basics about branching, it should be simple to understand the rest of the concept.

Git Tagging
Git Basic Commands Explained
Git Workflow
Introduction to Git

Git
Pro Git Book

Bibliography

Chacon, Scott. Pro Git. Apress, 2009.

comments powered by Disqus