cvcs – Eric F. Savage

You might notice from my past few posts that I’m basically going through my entire stack of tools and re-evaluating everything. This time it’s version control.

A little history

I’ve mostly used SVN, and before that, CVS. I’ve tinkered with some of the more heavyweight ones like ClearCase, TrueChange, and Visual SourceSafe as part of consulting gigs, but only enough to know that they were skills unto themselves, and ones I didn’t especially want to let into my brain. My personal repo up till now has been SVN after finally switching over from CVS a few years ago.

Why SVN?

The short answer is, because it’s easy. The longer answer is that it’s easy to set up, it’s fairly hard to break, and it has a decent Eclipse plugin. You might notice that I didn’t mention anything about branches, or rollbacks, or speed, or centralized vs. distributed. Those things don’t really matter to me if the first three requirements aren’t satisfied.

Branches are the devil

I don’t hate branches because they were a legendary nightmare in CVS. I don’t hate branches because svn merge rarely works. I hate branches because of the mental cost they inflict on a team.

Having a team work in multiple branches is, as far as I’ve ever seen it, a sign that your team is too big or your project is too monolithic or your effective management and oversight capabilities are lacking.

There are cases where branches don’t impose such a cost, however. If there are no plans to ever merge a branch back to trunk, and it is simply an experimental offshoot from which a few snippets might be pulled into trunk/master, that’s fine.

If everyone is switching to a new branch, that’s also fine. At one point we had to roll back to code from over a month prior, and weren’t sure if we were going to get back to trunk or not. Everyone switched to the new branch, and luckily everyone was able to switch back to trunk a few months later. It took days to merge back, and that wasn’t SVN’s fault; that was just a lot of human time required to pull two very different versions together safely and not leave landmines all over the place.

Not on my resume

I don’t put any VCS software on my resume, because, while it’s absolutely important to use one, I don’t think it’s that important which one I know or use (it is a good interview question, though). They aren’t really that hard to learn, and since everyone uses them slightly differently there’s no avoiding some ramp-up time. If an employer ever has me and another candidate so close that our VCS experience is the deciding factor, please, take the other person.

Dirty little secret

I don’t actually have the command line version of svn or cvs installed on any of my workstations. Nor do I have standalone GUIs or shell integration like Tortoise. I know the command line, and use it on servers, but I do all my actual development with Eclipse’s integrated client. I’ve actually even used Eclipse to manage svn projects that were Flash or C. I just find the command line so restricting and linear for what really is a very non-linear task. The Eclipse Subversion plugin took years to go from bad to passable, and it’s still not as good as the CVS version, which is one reason I never grew too attached to SVN.

Why? I need to see everything that’s changed, that’s dirty. I diff every file individually before I commit. I often find changes I didn’t really need to make, and nuke them. Sometimes I find that I actually didn’t account for a situation the old code did. Many times just looking at my code in this way makes me think just different enough that I come up with a better way of doing it. I simply don’t have this visibility amidst the >>> and <<< of a command-line diff, because it’s not my native development environment.

Enter git

So, even though I’d be happy to continue using SVN, I need to see what all this git hubbub is about. It’s been around for over 5 years now, so clearly it’s not a fad. It’s also been vouched for by enough voices I respect that it deserved a shot. There are also Mercurial and Bazaar, but I haven’t seen nearly the same level of buy-in from trusted people for those.

My sideline view is that git is favored by the Python people and the “ruby taliban”. Mercurial seems to be favored by the Microsoft and enterprise crowd, and Bazaar is somewhere out back playing with package maintainers. Java is still mostly in SVN land, probably because it’s more mature, more corporate, and slower moving. 5 years isn’t a long time in Java years these days, so I’d bet that a high percentage of projects people are still working on are from when git was just Linus stomping his feet. The Spring/JBoss people seem to have gone the Mercurial route, while Eclipse is going git.

Git also has GitHub, which is used by some people I know, while I don’t know anyone personally who is using Mercurial’s version, Bitbucket (or even using Mercurial for that matter). So I ultimately went with what Eclipse and my friends were using over the other interests, and started with git. From what I understand the differences are slight in the early stages anyway; this was really more a matter of trying DVCS vs. VCS.

First steps

I started off with GitHub’s helpful handholding, which included installing msysgit. I imported my projects and used it in earnest for a few days. Once I was confident in my ability to actually get stuff up there, I dug a little deeper.

I read the Pro Git book, which I need to call special attention to because it’s really, really, good. It’s short, concise, has diagrams where you need diagrams, and ranks high in terms of how computer books should be written. If you don’t know anything about git, spend a night reading through this, and you will know plenty.

What about the secret?

So yeah, I was using the command line. Lean in closer so I can tell you why…BECAUSE THE GUIS ARE TERRIBLE. They’re not just ugly, they’re interface poisoning at a master level; think of what an even more complicated Bugzilla would look like. And yes I say “they”, because there is more than one, and they’re all basically someone jamming the command line output into various GUI toolkits. Part of this is git’s design, but I know that someone will figure this out.

Git’s design allows for a huge number of permutations of workflow, which means there are a number of extra steps compared to something like Subversion. On the command line, this doesn’t seem to hurt that much (in comparison). But GUIs don’t deal with situations like this very well. They can either be helpful and guide you down a path, or play dumb and wait for you to hold their hand. All of the git GUIs I’ve seen so far do the latter.

Am I being a stick in the mud and saying that something as marvelous as git should be constrained to the simplicity of dumb old svn? Actually, yes. I should be able to edit some files, see a list of those files, diff them against the “real” version of the file (as in the one everyone else sees) and commit, with message. Then go home. I don’t care about SHA-1 hashes because I don’t remember them, I only need them when you need to tell me that two things are different. I don’t care about branches other than knowing which I’m in (we’ll get to this next). I don’t want to be bothered with any of this fancy information if nothing is broken (or going to break if I continue).

This isn’t actually a problem of git. This is a problem of people being indecisive when it comes to UIs. If you do everything, you fail. If you do nothing, you fail. If you do any subset of everything, you fail for some people. That’s OK, don’t worry, you can optimize as you go. Your first priority should not be exposing the power of git, it should be letting me put my code on the server so I can go home. Let me drop into all this fancy stuff with local branches and pushing tags and rebasing and such on a case-by-case basis, when I need it, and when I’m good at it.

What about the branches?

Git does branches right. I could go into more detail, but Pro Git does it so well I won’t bother, so just go read that. What git does not do very well, and I’m not sure if anything can do as long as humans are involved, is mitigate the mental cost of a distributed branch. It does make merging much more feasible, and allows for a larger number of cases where branches are not going to cost much, but they still need to be used with discipline and restraint.

The new idea that git adds is the local branch. I haven’t had a chance to use this much yet, but this is the feature that may ultimately win me over. I can look back and ask “when have I ever needed a local branch?” and the answer would be “a few times, but not often”. But if I look back and ask “when would I have benefited from a local branch?” the answer would be “hmm, I don’t know, but probably more often than I needed one”.

The example of the hotfix scenario (where you need to fix/test something from last week’s release and trunk/master isn’t ready) isn’t very compelling to me as an SVN user. It’s easy to make an SVN branch for something like this. I svn copy and switch it if my local copy is clean. If not, I can check out the project again, or if it’s a big one, I copy it over and svn switch it. Not as easy as git, but then again, I don’t generally have to make a lot of hotfixes either.
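To make the comparison concrete, here is a rough sketch of the hotfix setup in both tools, written as Python functions that only build the command lines. The repository URL, the tags/branches layout, and the tag and branch names are all made up for illustration; your repo may be laid out differently.

```python
import shlex


def svn_hotfix_commands(repo_url, release_tag, branch):
    """SVN: copy the release tag to a branch on the server, then switch to it.

    Assumes the conventional trunk/tags/branches layout (hypothetical here).
    """
    src = f"{repo_url}/tags/{release_tag}"
    dst = f"{repo_url}/branches/{branch}"
    return [
        ["svn", "copy", src, dst, "-m", f"Branch {branch} from {release_tag}"],
        ["svn", "switch", dst],  # needs a clean working copy to be painless
    ]


def git_hotfix_commands(release_tag, branch):
    """git: create a local branch starting at the release tag and check it out."""
    return [["git", "checkout", "-b", branch, release_tag]]


if __name__ == "__main__":
    # Hypothetical URL and names, printed rather than executed.
    for argv in svn_hotfix_commands(
        "https://svn.example.com/proj", "release-1.0", "hotfix-1.0"
    ):
        print(shlex.join(argv))
    for argv in git_hotfix_commands("release-1.0", "hotfix-1.0"):
        print(shlex.join(argv))
```

Two commands and a server round trip versus one local command. Easier in git, but not by a mile.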

The issue scenario (work on one issue per branch, merge when/if complete) is more compelling. I’d like to say that it isn’t and that I try to start and finish one issue at a time, but obviously that doesn’t happen enough. I like the fact that it’s so cheap to make a branch that I might as well just do it all the time. If I didn’t end up needing it, no harm done. If I did end up needing it, because some other issue suddenly got more important and the one I’m working on needs to chill, then I’m glad it’s there.
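A minimal sketch of that branch-per-issue routine, again as Python functions that only assemble git command lines. The `issue-<id>` naming scheme and the `master` base branch are my own assumptions, not anything git requires.

```python
def start_issue(issue_id, base="master"):
    """Create and check out a local branch for one issue, starting from base."""
    branch = f"issue-{issue_id}"  # hypothetical naming convention
    return [["git", "checkout", "-b", branch, base]]


def switch_to(branch):
    """Jump to another branch when something more urgent comes up."""
    return [["git", "checkout", branch]]


def finish_issue(issue_id, base="master"):
    """Merge the finished issue into base, then delete the local branch."""
    branch = f"issue-{issue_id}"
    return [
        ["git", "checkout", base],
        ["git", "merge", branch],
        ["git", "branch", "-d", branch],  # -d refuses to delete unmerged work
    ]


if __name__ == "__main__":
    # Start issue 42, get interrupted by issue 57, come back and finish 42.
    plan = (
        start_issue(42)
        + start_issue(57)
        + finish_issue(57)
        + switch_to("issue-42")
        + finish_issue(42)
    )
    for argv in plan:
        print(" ".join(argv))
```

In practice you would commit (or stash) the half-done work before switching away, so the parked branch holds it until you get back.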

The real win here is that nobody has to know about my branch, which means they never have to wonder what’s in it, or if it’s up to date. This means there is no cost to my team because I have a branch that is 2 weeks out of date. There is cost to me, but no more than having multiple versions of the project checked out, or a set of patch files sitting there waiting for me to get back to it.

One more thing

The fact that every developer has a full copy of the repository on their computer, basically for free? That’s really nice. Sure, you do backups and the chances of your computer being the last one on Earth with a copy of the repo are slim, but the peace of mind is undeniable.

The Verdict

The only reason this is really a decision at all is that git is harder to use for normal stuff. The command line can be scripted, so if someone got the GUI to the point where you start with an SVN-style workflow and deviate as needed, there really would be no argument for using SVN from what I can see.
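As a rough idea of what scripting that SVN-style workflow could look like, here is a small Python sketch: one step to see what is dirty, one step to commit and get the code on the server. The function names, the `origin` remote, and the `master` branch are assumptions for illustration, and it defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess


def review_commands():
    """Show what is dirty before committing."""
    return [
        ["git", "status", "--short"],  # list of changed files
        ["git", "diff", "HEAD"],       # all local changes vs. the last commit
    ]


def commit_commands(message, remote="origin", branch="master"):
    """The 'svn commit' equivalent: stage, commit, catch up, publish."""
    return [
        ["git", "add", "-A"],
        ["git", "commit", "-m", message],
        ["git", "pull", "--rebase", remote, branch],  # replay my commit on top
        ["git", "push", remote, branch],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # stop at the first failure


if __name__ == "__main__":
    run(review_commands())
    run(commit_commands("Fix login redirect"))
```

Everything beyond these two steps (local branches, tags, rebasing by hand) stays available on the command line for the cases where I actually need it.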

My decision is that git is the winner here, despite that massive failing, because I have faith that a combination of two things will happen: I will learn the GUIs better because it’s worth my time to do so, and someone will eventually figure out how to make it smarter and simpler.

I could not fault someone for sticking with SVN, even for a new project, because you can always import it later, but I will be starting new projects in git.


To Git or not to Git