A Git Tutorial

Git has taken the open source world by storm. Both Catalyst-Runtime and DBIx::Class have switched to git for the main repositories. There are still a few repositories in subversion, but they will be converted in due time.

If you still are unfamiliar with git, this tutorial will show you the common idioms that we use for Catalyst and DBIx::Class development, so that you can jump right in.

What is Git?

Git is a decentralized revision control software suite. Decentralized means that, unlike in subversion for example, you can operate on a git repository without the central server, by using your local filesystem or another server.

Git is very powerful, much more so than tools like subversion, it's a whole new way of working with revision control software.

Where can I host a repository?

You need no server to start working with git. You can even clone a repository you started on your workstation to another machine via ssh, although you may want to set up gitosis or gitolite on a server to manage your git repositories for work or personal needs.

Starting a new git project is simply a matter of:

    mkdir project
    cd project
    git init
    touch README
    git add README
    git commit -a -m'first commit'

For open source projects, github.com is a popular choice. Once you register and create a repository, the website will guide you through setting it up and pushing it to the github servers. You may also find yourself needing to fork a repository on github on occasion; pull requests have become a very popular way to submit patches to projects.

For Catalyst or DBIx::Class related CPAN modules, we will be happy to host your repository on the project servers, provided by Shadowcat. Join #catalyst-dev or #dbix-class on irc.perl.org and ask to be set up, and someone will help get you started.

Cloning a Repository

Making a local copy of a repository to work with is called cloning in git, it is different from checking out in subversion in that you get the whole repository with all of its history, not just the latest revision, so you can do all possible operations on your local copy. Cloning is very fast, unlike tools such as SVK where the initial import could take many hours, cloning usually only takes a few seconds or a minute.

To clone Catalyst-Runtime you would do this:

    git clone git://git.shadowcat.co.uk/catagits/Catalyst-Runtime.git cat-runtime

Usually you will access your projects via ssh, if you have commit access to them, e.g.:

    git clone catagits@git.shadowcat.co.uk:Catalyst-Runtime.git cat-runtime

If you started a git project on another computer and want to clone it on this one, the command would be something like this:

    git clone rkitover@hlagh:src/catalyst/runtime cat-runtime

if the project is in src/catalyst/runtime under the home directory on the remote host. You can then push via ssh to that machine to sync your changes.

Editing and Committing

Now that you've created or cloned a repository, how do you make changes to it? To make changes to master, which is known as the trunk in subversion, the central branch that all projects have, it is simply a matter of committing your changes and pushing your changes up to the server (if you're using one.) I will discuss branches in a later section.

Git gives you a view onto a commit as the files in your project directory, the files represent that commit. All git metadata files are in the subdirectory .git under the project directory. There is only one .git directory, not one in every subdirectory like in tools such as subversion. All other files are part of your project.

The current checked out commit is called HEAD. You do not generally make a new directory to view another commit or branch, you simply use commands such as git checkout to switch between them.

Files that you want to be part of your project need to be added with git add <file> and they will be added on the next commit. You can see which files you haven't added yet with git status.

Most people also like adding a .gitignore file to their project, the format of the file is one entry per line, comments starting with #, in shell glob format, for example:

    *.o
    Thumbs.db
    Icon?

This file should be committed to the project. If you don't want to commit a .gitignore to your project, you can use .git/info/exclude.

To see a diff of your edits so far, use the very handy command git diff HEAD.

Once you are ready to make a commit, the command:

    git commit -a -m'commit message'

will commit all of your changes. git push will push your changes up to the server, if you're using one.

Surgical Commits

By default, git does not commit all of your changes, only the changes that are staged. The -a flag to git commit will instead commit all changes to HEAD, which is what most new users use.

Git pros use git add <file> to stage changes for commits to have more control over what content will go into which commit, often with the -p option. git add -p <file> will ask you whether you want to stage each diff hunk in the changes for <file> for the next commit.

Once you've staged all the changes you want to go into the commit, the command:

    git commit -m'commit message'

will commit them.

On Commit Messages

See this excellent article on the proper etiquette regarding the writing of git commit messages: http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html.

When you use the command git commit, it will take you into your editor, usually vim, to write your commit message.

The first line is the summary line, and should be no more than 50 characters. Then there is a blank line, and the following is the long description of the commit (with more blank lines as appropriate.) The long description should be wrapped at 72 characters. You may omit the long description for simple commits where the summary line is sufficient.

If you are using vim, I highly recommend making the file ~/.vim/ftplugin/gitcommit.vim with the contents:

    setlocal tw=72

so that you get the wrapping automatically. On the summary line, vim will highlight the first 50 characters, so you will see if you go over.

The language of the commit message should be in the imperative mood, this means "change", not "changes" or "changed." For example:

    run coffeemaker at 5am not 6am

    Change the cronjob that runs the coffeemaker to brew coffee to run at
    5am instead of 6am because many of the ops people come in earlier these
    days.

Working with a Remote Repository

To pull in changes from a remote, use the command git fetch --prune. Then fast forward your master to include the new changes with git rebase origin/master.

To push your changes to the remote, use git push or git push origin HEAD.

If the server rejects your push, it means someone has pushed new changes to the remote since your last fetch, and you need to rebase your commits on top of theirs. To do this run:

    git fetch --prune
    git rebase origin/master

then your push should be successful.

The Git Log

The git log command will show you the log of commits to your current branch and their SHA hashes. To see the diff for a specific commit use git show <SHA>, to see the diff between a certain commit and HEAD do git diff <SHA>, between two commits git diff <SHA1> <SHA2>. To see just your last commit, do git diff HEAD~1 and so on.

Tagging

To make a tag, use the command git tag -a -m'release 0.01' v0.01 most people use the v on tags, I don't personally like it so I omit it. To push your new tag to the remote, use git push --tags.

To see all the tags, use the command git tag, to see the tag messages use git tag -n1. To fetch the most up to date list of tags from the remote, use git fetch --tags, this is done automatically for you if you use git fetch --prune.

Branching

With the exception of trivial changes, exploratory work for new features or bug fixes should be done in a branch. Unlike other revision control software, branches in git are very fast, inexpensive and powerful.

To make a new branch, use the command git checkout -b <branch-name>. Branch names can have slashes in them. Many projects like to name branches topic/foo.

To see a list of your local branches, use the command git branch, to see all branches including remote branches, use git branch -a.

To switch branches, including to the master branch, use git checkout <branch-name>. You cannot switch branches if you have uncommitted changes.

To move uncommitted changes between branches, use the git stash command to save them, switch branches, then the git stash pop command load them back into your working copy.

Once you have some commits in a branch, your first push should be with the command git push origin HEAD -u, this is so that git pull will work. Subsequent pushes can be with git push.

git fetch --prune will show you what is happening with remote branches, which ones have been deleted and which ones have been rebased (these will show forced update.)

To update your branch if it hasn't been rebased upstream, do a git fetch --prune and a git rebase origin/branch-name.

If the branch HAS been rebased upstream (as indicated by a forced update in git fetch --prune) then you need to do a git pull --rebase. If you have no commits and just want to update your local branch with the upstream rebased branch, just do a git reset --hard origin/<branch-name>.

Merging Changes from master

To update your branch to the current master, do git rebase origin/master. You may need to resolve conflicts (see below.)

After your branch has been updated thus, you need to push it to the server with git push -f origin HEAD (but check git fetch --prune to make sure no one has pushed to it.)

Resolving Conflicts

With git rebase you will often have to resolve conflicts. This is similar to how you resolve conflicts in subversion and other version control systems; git will tell you that there is a conflict and which files you have to edit. After you have edited the files, git add them and then issue the command git rebase --continue.

Sometimes you may wish to entirely skip one of your commits during conflict resolution, to do this use the command git rebase --skip, BE VERY VERY CAREFUL with this command. If you accidentally delete a commit this way that you later decide you needed, you can use git reflog and git checkout to recover it.

If in the middle of a rebase you decide the whole thing was a bad idea, you can back out with git rebase --abort.

Rewriting History, the Interactive Rebase

Often you will want to clean up your branch history into a set of logical commits, or just one commit, before merging it to master. To do this, use the command git rebase -i origin/master. Your editor will come up with the list of commits, instructions are at the bottom of the file. Place the commands you want to invoke on a specific commit by changing the word pick to the command. Here you can squash commits into the previous ones with f, reword commit messages with r and move commits around. When you're done, save the file and exit. You may have to resolve conflicts.

NEVER rewrite history for master. This is why working in branches is so important, so that once you merge a branch to master you really mean it.

Merging your Branch

Once you are happy with the work you (and possibly others) have done on a branch, you will want to merge it into master. To do this, issue the command:

    git merge --ff-only <branch-name>

then do a git push to push the new master to the remote.

Delete your local branch with git branch -D <branch-name>. To delete the remote branch, do git push origin :refs/heads/<branch-name>.

There is Much More

I have merely scratched the surface of the capabilities of git, and I am by no means an expert. Hopefully however, I have provided all the information you need to get started with git, and to contribute to our projects!

AUTHOR

Caelum: Rafael Kitover <rkitover@cpan.org>