The End of CVS

Or, Why CVS Has Outlived Its Usefulness
Oh, And By The Way, So Has C


OK, I know there are an awful lot of people out there who consider the very title of this article to be sacrilegious, but, well, how do I say this diplomatically -- they′re WRONG! You see, CVS is, like C and the VW microbus, a remnant of a previous age that some people just don′t seem to be able to let go of ... even when they′re long past the point where a rational person would have recognized just how poorly it′s serving their purposes. Believe it or not, the tools available to software developers have advanced in the last couple of decades, and there are tools some of us are using right now that actually make us far more productive than we could have been back when CVS was a big improvement over SCCS.

Now, if you′re one of those people still using CVS and/or C (sorry, I can′t help you if it′s the VW bus), you′ve got a lot of company. And if you′re doing certain things with them, you might not even be wrong to be doing so. But if you′re in the majority of users, well, you′re just living in the past for no good reason (and plenty of bad ones). And the worst part is, you′re probably making bad engineering decisions just because your tools don′t support doing things the way you know you should.

So, what′s so terrible about CVS? Two words: directory versioning. CVS doesn′t have it, and it′s forcing the quality of a generation of software to be far lower than it should be. Now, CVS has plenty of other shortcomings as well, but that one′s been causing the most damage for the longest time. Why is directory versioning so important? Because the evolution of quality modular software depends on it, that′s why.
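
For anyone who has not run into the term, here is a minimal Python sketch of what directory versioning amounts to. It is a toy model, not how any real VCS is implemented, and every name in it (Element, Repository, widget.c, gadget.c) is made up for illustration. The idea it tries to show: a file is an element with its own identity, a directory is just a versioned listing that maps names to elements, and so a rename creates a new directory version without touching the history of the file.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A versioned file with a stable identity, independent of its name."""
    element_id: int
    revisions: list = field(default_factory=list)  # contents, oldest first


@dataclass
class Repository:
    """Toy repository (hypothetical) in which the directory is versioned too."""
    elements: dict = field(default_factory=dict)  # element_id -> Element
    # Each directory version is a listing: name -> element_id.
    dir_versions: list = field(default_factory=lambda: [{}])

    def add(self, name, content):
        """Create a new element and record its name in a new directory version."""
        element = Element(element_id=len(self.elements) + 1, revisions=[content])
        self.elements[element.element_id] = element
        self.dir_versions.append({**self.dir_versions[-1], name: element.element_id})
        return element

    def commit(self, name, content):
        """Add a revision to whichever element currently has this name."""
        self.elements[self.dir_versions[-1][name]].revisions.append(content)

    def rename(self, old, new):
        """Renaming only produces a new directory version; the element is untouched."""
        listing = dict(self.dir_versions[-1])
        listing[new] = listing.pop(old)
        self.dir_versions.append(listing)

    def history(self, name, dir_version=-1):
        """Look the name up in a given directory version and return the element history."""
        return self.elements[self.dir_versions[dir_version][name]].revisions


repo = Repository()
repo.add("widget.c", "v1")
repo.commit("widget.c", "v2")
repo.rename("widget.c", "gadget.c")
repo.commit("gadget.c", "v3")

print(repo.history("gadget.c"))       # ['v1', 'v2', 'v3']
print(sorted(repo.dir_versions[1]))   # ['widget.c']
print(sorted(repo.dir_versions[-1]))  # ['gadget.c']
```

The history follows the element across the rename, and the older directory version still remembers what the file used to be called.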

A long time ago, when C was a really cool language, most software, whether operating system, driver or application, was monolithic. Perhaps the most compelling reason for this was that unlike today, it was reasonable to assume that if you weren′t careful, your program might actually exhaust all available memory or be too slow on a state-of-the-art CPU. What a quaint idea, huh?

Fast forward to today, and I′m sitting here in my recliner typing on a five pound laptop computer with a multi-core 64-bit processor, four gigabytes of physical memory and (at this instant) just shy of fifty gigabytes of virtual memory space (sparsely) allocated. And what′s this astonishing piece of hardware doing? Not much, as it happens. It′s actually got 68 processes with a total of 299 threads active, and it′s using a whopping 2% of total CPU capacity. Now, I am down to only about forty gigabytes of free space on the local disk, so I suppose that if I tried really hard I might be able to run out of memory, but realistically, can I justify monolithic software architectures because I′m resource constrained? I think not.

Without the resource constraints that have traditionally been used to justify monolithic systems (I′ll steer clear of the validity of that argument), modular, object oriented (OO) systems come out as the obvious choice for most (though certainly not all) projects. While you can develop OO systems with lower-level languages like C, and yes, even Assembler, it′s usually easier, faster and more predictable to do so in a real OO language. One thing OO languages have in common is, of course, objects, and it′s generally considered good form to name source files after the object types they declare and/or define (you should have been doing this with C, too, but that′s another tangent). This tying of content to container has versioning implications, as well.

So, if my execution environment isn′t holding me back, why not just jump on the modular, object oriented bandwagon and build some really cool software? Because, if I′m still using CVS, I can′t evolve software modules the way they naturally need to. I need a version control system (VCS, also regularly referred to as SCM) that can handle that kind of development process, and CVS ain′t it.

You see, there′s a fairly well defined pattern you′ll see occurring over and over again in successful modular software development environments, and it goes something like this:

Hopefully you′ve spotted where the problem lies. If we′re using a VCS like CVS, we have a couple of options ranging from bad to worse, and no good ones.

As you can imagine, the latter approaches bring with them the problem that the operations can only be performed by a CVS administrator, and one who′s particularly well versed in the intimate details of how CVS stores files. Needless to say, the opportunity for grave errors is significant.
In practice, most people just suffer through the shortcomings of the first approach or, as alluded to back at the beginning, just avoid the mess entirely by copying some or all of the original widget′s code to the new location where they need the same functionality, thereby losing all history continuity and increasing redundancy in the source tree.
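
To make the cost of that copy-the-code workaround concrete, here is a toy Python model of a store that keeps one independent history per path and knows nothing about directories. It is only a sketch under that assumption (real CVS keeps an RCS file per path and has a lot more machinery than this), and the class name and the app/ and lib/ paths are hypothetical.

```python
class PathKeyedStore:
    """Toy model: one independent history per path, nothing else tracked."""

    def __init__(self):
        self.histories = {}  # path -> list of contents, oldest first

    def commit(self, path, content):
        """Append a revision to the history that belongs to this exact path."""
        self.histories.setdefault(path, []).append(content)

    def copy_to(self, src, dst):
        """The workaround: check in the latest content of src as a brand-new file."""
        self.commit(dst, self.histories[src][-1])

    def history(self, path):
        """Return a copy of the revisions recorded under this path."""
        return list(self.histories.get(path, []))


store = PathKeyedStore()
store.commit("app/widget.c", "v1")
store.commit("app/widget.c", "v2")

# Someone needs the widget elsewhere and copies the code over.
store.copy_to("app/widget.c", "lib/widget.c")

# A later fix lands only in the original.
store.commit("app/widget.c", "v3: bug fix")

print(store.history("lib/widget.c"))  # ['v2']
print(store.history("app/widget.c"))  # ['v1', 'v2', 'v3: bug fix']
```

The copy starts life with a one-entry history, nothing links it back to the original, and the later fix never reaches it.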

Like I said above, there are still a few good reasons to be using antiquated tools, and of course the most important is that you′re maintaining a legacy system. But really, even if you have to use outdated language tools to support, say, some specialized embedded device, migration away from CVS to something more capable is likely to be a pretty painless experience.
That′s because CVS has enjoyed enough popularity that you′d be hard pressed to find any VCS worth considering that doesn′t support migration from CVS. Either the authors of the software provide it, or the user community does very early on. So you really can′t use that excuse.

You can use the “but it′s free” rationalization, but since the introduction of Subversion there have been “free” alternatives that can move and rename elements, so that doesn′t hold water, either. And free to obtain by no means translates into free to live with. Consider that comment above about cutting engineering corners to accommodate your tools - you don′t need to be an economist to see what kind of damage that could do, and probably is doing, to your projects.

So come on, lose that clunker and leap ahead into the 20th century! If you′re in the vast majority for whom a centralized system makes sense, I′ll give you my opinion of where to look:

ClearCase (CC)
Yup, it′s huge in every dimension - cost to license, cost to maintain, cost in resources ... name your measure, and CC is probably the most expensive, by a wide margin. But oh, what a VCS! Sure, it has a couple of shortcomings, but overall I have no problem calling it the best version control system available, bar none! Yes, it really is that good, and if you′re working on a large project, or even a bunch of little ones, it can really be worthwhile - have fun trying to sell it to the bean counters, though.
Perforce (P4)
An order of magnitude less expensive than ClearCase, but because elements aren′t real objects, you lose a lot of the real power that CC can give you. Still, it′s a pretty capable VCS, runs on just about everything, and is much easier on the budget! It′s even free for approved open source projects. If you′ve never used CC, you might not even notice what you′re missing.
Subversion (SVN)
Infinitely less expensive than Perforce, but not quite as polished yet. The SVN team has recognized the benefits of a true OO approach to their repository, but it′s unclear when, if ever, they′ll get there. If they do, though, SVN is going to be the one to beat at almost any price, and would certainly cause a major disruption to P4′s, and maybe even CC′s, business strategy.

Lately, you′ll also hear a lot of people touting the benefits of distributed VCS. I′m not prepared to comment on any of them because I′ve yet to have a need for one and hence haven′t used them. I can′t say as I′m really sold on the concept, though - I don′t necessarily see anything overtly bad about them, they just strike me as a solution designed around a technology (P2P), rather than a technology designed around a solution, and that rarely turns out to be the best way to do things.

So there you have it, my pitch for getting the world to evolve in just one area of the software development process. I hope it serves as food for thought, or even a catalyst for action.