Introduction to Git

tags:git

If you are already familiar with Git and its distributed model, you can skip this brief introduction and move on to the Git Workflow article.

What is Git?

Git is popularly known as a Version Control System (VCS), but it’s more than that. Git is a Software Configuration Management (SCM) system: besides versioning files, it supports the software development process with features such as team integration, merging of changes, tagging, branching, and more.

A brief history

Linus Torvalds started developing Git in 2005 with several goals in mind: speed, simple design, strong support for non-linear development, a fully distributed model, and the ability to handle large projects (Chacon 5). Just 75 days after development began, Git managed the release of Linux kernel 2.6.12 (“Git Software”).

Git = Distributed Version Control System (DVCS)

Unlike many VCSs (such as SVN), which are Centralized Version Control Systems (CVCSs), Git is a Distributed Version Control System (DVCS). This crucial difference sets Git apart from most other VCSs.

Git’s model implies that there is no central repository: each client is a full repository rather than just a copy of the versioned files. Typically, each client has its own local repository, and in fact all the work can be done this way (in a single-person project). This is not recommended, however, because any failure of the local hard disk could become a huge problem. You should have at least one more repository on another machine, typically a remote repository on a web server.

For teamwork, that remote repository becomes a shared repository, acting as a facilitator for all team members. To keep everyone up to date, each member simply sends (pushes) their changes to that repository and gets (pulls) the others’ changes from it. The shared repository thus behaves much like a central repository, but the model remains very different from the centralized one; for instance, multiple shared repositories can coexist.
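The push and pull exchange described above can be sketched with a toy model. This is only an illustration in Python, not how Git is implemented: the `Repo` class, its method names, and the commit IDs are all hypothetical, and the model ignores branches, merges, and conflicts.

```python
class Repo:
    """Toy repository (hypothetical): holds its own complete commit history."""

    def __init__(self, name):
        self.name = name
        self.commits = {}  # commit id -> message, in insertion order

    def commit(self, message):
        # Committing touches only this repository; no network is involved.
        commit_id = f"{self.name}-{len(self.commits) + 1}"
        self.commits[commit_id] = message
        return commit_id

    def clone(self, name):
        # A clone receives the full history, not just the latest files.
        copy = Repo(name)
        copy.commits = dict(self.commits)
        return copy

    def push(self, remote):
        _copy_missing(self, remote)

    def pull(self, remote):
        _copy_missing(remote, self)


def _copy_missing(source, target):
    """Copy every commit that target does not have yet."""
    for commit_id, message in source.commits.items():
        target.commits.setdefault(commit_id, message)


shared = Repo("shared")
shared.commit("initial import")

alice = shared.clone("alice")
bob = shared.clone("bob")

alice.commit("add README")  # exists only in Alice's repository so far
alice.push(shared)          # now the shared repository has it
bob.pull(shared)            # and Bob catches up from there

backup = Repo("backup")     # a second shared repository can coexist
alice.push(backup)
```

Note that `shared` is an ordinary `Repo` like any other; it is "central" only by team convention, which is why a second one such as `backup` can be added at any time.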

Advantages of a Distributed Model

Because each client has a full local repository, nearly every operation is performed locally, which makes Git fast and convenient. In a centralized model, by contrast, each client has only a copy of the files, so every commit, and many other operations such as consulting the change history, must connect to the central repository, often on a remote server, which introduces connection, latency, and speed issues.

Git’s Guts

To really understand how Git works and to use it effectively, it’s essential to understand its basic concepts of storage and file versioning.

Some other VCSs (e.g. SVN) store the base files and then save the changes made to them (Chacon 6), so you have the original file plus a delta for each subsequent change. Git works differently: for each commit, Git stores a snapshot of all files, but it is smart enough not to duplicate unmodified files (Chacon 6). In this sense, Git works much more like a mini filesystem.
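A minimal sketch of the snapshot idea, assuming a simplified in-memory store: the `SnapshotStore` class and its methods are hypothetical names, and real Git additionally compresses objects and records directories and commits as objects of their own. The `blob <size>\0` header mirrors the one Git prepends before hashing file content with SHA-1, so the IDs should match what `git hash-object` reports for the same bytes.

```python
import hashlib


class SnapshotStore:
    """Toy content-addressed store (hypothetical, far simpler than Git)."""

    def __init__(self):
        self.objects = {}    # object id -> file content (bytes)
        self.snapshots = []  # one {path: object id} mapping per commit

    @staticmethod
    def blob_id(data):
        # Hash a "blob <size>\0" header followed by the content.
        header = b"blob %d\x00" % len(data)
        return hashlib.sha1(header + data).hexdigest()

    def commit(self, files):
        """Record a snapshot of all files; return its index."""
        snapshot = {}
        for path, data in files.items():
            object_id = self.blob_id(data)
            # Unchanged content hashes to the same id, so it is stored once.
            self.objects.setdefault(object_id, data)
            snapshot[path] = object_id
        self.snapshots.append(snapshot)
        return len(self.snapshots) - 1

    def checkout(self, index):
        """Rebuild the complete file set of a snapshot."""
        return {path: self.objects[object_id]
                for path, object_id in self.snapshots[index].items()}


store = SnapshotStore()
source = b"int main(void) { return 0; }\n"
first = store.commit({"README": b"hello\n", "main.c": source})
second = store.commit({"README": b"hello, world\n", "main.c": source})
```

Both snapshots list every file, yet only three objects are stored: two versions of `README` and a single copy of `main.c`, which both snapshots point to.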

That is the key to how Git works, and it’s what makes it richer than many other VCSs.

Now, head to Git Workflow to learn about the three file states in Git (modified, staged, and committed) and the three corresponding sections of a Git project (working directory, staging area, and Git directory).

Git Workflow
Git Basic Commands Explained
Git Branching and Merging
Git Remote and Tracking Branches
Git Tagging

Git
Pro Git Book
Version Control System (VCS) (Wikipedia)
Distributed Version Control System (DVCS) (Wikipedia)
Software Configuration Management (SCM) (Wikipedia)

Bibliography

Chacon, Scott. Pro Git. Apress, 2009.
“Git Software.” Wikipedia, n.d. Web. 19 January 2011. <http://en.wikipedia.org/wiki/Git_(software)>

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.