Distributed Revision Control Systems: Git vs. Mercurial vs. SVN
Update, June 2011: This is a pretty old post that I wrote when I first learned about distributed revision control. This page keeps getting traffic though, so I should say that I've since decided to use git, because of the utility of github.com and because the git guys cleaned up a lot of the 'wonky' bits I wrote about below, making it a no-brainer. There are more detailed pro/con comparisons of each system out there on the web, but this post isn't one of them, sorry!
...
I switched over to Subversion when I started developing again, working on the code for Mowser. I had been on a two-year break from coding, so it took me a while to get up to speed, but I was excited to start using SVN as my revision system, as it had matured in that time into a stable replacement for CVS. I like to move things around and rename them as I'm getting started with a project, and with CVS that used to be pretty painful. So much so that I would regularly "wait" a little while before committing my first draft into a repository so that I could settle on a structure that suited me - and even then I'd only really commit if I needed to work with someone else. SVN has made that process so easy that I've never been happier developing code and checking changes in, even just for myself.
But a couple of weeks ago I saw a presentation online given by Linus Torvalds on his distributed version control system, 'git', and was quite shocked. He spends almost all of the talk railing against CVS and, by proxy, its successor SVN. Beyond the fact that centralized storage just doesn't work for Linus' needs, because the Linux kernel developers are so distributed, he had many other philosophical and technical issues with the CVS/SVN way of doing revision control. One of them I immediately recognized and agreed with - the presence of .svn directories strewn throughout your code base. But more than that, the centralized aspect of those systems really does push developers to work in a counter-productive way. For example, many people don't want to "break the build" by checking in changes before they are ready, so they keep vast numbers of files outside the control system until they do one big check-in. Using a peer-based system seems to hold a lot of advantages.
So, now being so enlightened, I decided to check out git and Mercurial - another big distributed system that's out there (both were created as a result of the BitKeeper debacle). Reading the documentation, there are definitely some things that I need to wrap my head around in terms of working with other people, but right off the bat I can see a lot of advantages. First, these systems are completely ad hoc. You go into your project directory, initialize the revision system, and one new directory is created in the root of your project to manage the entire tree of code. Awesomely clean. The process of starting a new project in SVN is still sort of a pain in the ass - importing the project, then checking it back out, etc. Right there, I'm sold on this stuff. Now I can start keeping track of any experimental code I play with right away, without any barriers to creating a big new project. Plus, my project directory structure *is* my control system, another bonus.
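To make the "one directory in the root" point concrete, here's a rough Python sketch (mine, not from either tool) that walks a project tree and lists the revision control metadata directories it finds. The helper names `find_metadata_dirs` and `make_tree` are made up for illustration, and the toy trees only mimic the layouts described above: a .svn folder in every directory versus a single .hg (or .git) folder at the top.

```python
import os
import tempfile

# Metadata directory names used by each tool (their usual defaults).
METADATA_DIRS = {".svn", ".git", ".hg"}


def find_metadata_dirs(root):
    """Return sorted paths, relative to root, of VCS metadata directories."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in list(dirnames):
            if name in METADATA_DIRS:
                found.append(os.path.relpath(os.path.join(dirpath, name), root))
                dirnames.remove(name)  # don't descend into the metadata itself
    return sorted(found)


def make_tree(root, metadata_name, everywhere):
    """Build a toy project; put metadata in every directory or only the root.

    Hypothetical helper: it fakes the layouts instead of running svn or hg.
    """
    for sub in ("", "src", os.path.join("src", "lib")):
        directory = os.path.join(root, sub)
        os.makedirs(directory, exist_ok=True)
        if everywhere or sub == "":
            os.makedirs(os.path.join(directory, metadata_name), exist_ok=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as svn_style, \
            tempfile.TemporaryDirectory() as hg_style:
        make_tree(svn_style, ".svn", everywhere=True)
        make_tree(hg_style, ".hg", everywhere=False)
        print("SVN-style working copy:", find_metadata_dirs(svn_style))
        print("Mercurial-style repo:  ", find_metadata_dirs(hg_style))
```

Pointing `find_metadata_dirs` at a real checkout instead of the toy trees should show the same pattern: one entry per directory for the SVN layout described here, and a single entry for a git or Mercurial project.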
A little exploration turned up that Linus' git stuff is a bit wonky in a lot of ways. It's a few C programs surrounded by a slew of scripts of varying quality, and there aren't many tools for it. Mercurial, on the other hand, tackles the same problems, but is written in Python, well supported, well documented and seems to have a lot of momentum in the OSS community. Here's a video presentation of Mercurial which is good (despite not having any slides), and a free online book, "Distributed revision control with Mercurial". I'm learning quite a bit - Mercurial has a little integrated web server for when you want to work in a small group (say on a code sprint or a hackathon) and, like SVN, can also be accessed over SSH. And the technology itself is really great for laptops, as the repository always goes with you - no network needed. There are other distributed systems out there that look good too, like Canonical's Bazaar, which is also written in Python and which they use to manage Ubuntu.
Very cool stuff, I definitely need to explore some more.
-Russ