Monday 30 June 2014

Revision Control and Project Culture

Version control, also called revision control, is one of the most important parts of a software project. After all, in many cases it is used daily. No wonder, then, that version control is not only part of the project structure but also part of its culture.

This blog entry is partly based on a report, Jämförelse: Subversion och Git (Comparison: Subversion and Git), written for Init Ab, a consulting company headquartered in Stockholm.

Centralized and Distributed Version Control

A repository is the place where the source code of a program is kept. Access to a repository is managed by revision control software, which mediates all read and write operations on the repository.

Two recently popular programs in this area are Subversion and Git. They represent very different views on version control.

Subversion is the leading program among centralized version control software. A centrally controlled repository is the "classic" way to arrange control over source code. In this system every user first checks out the needed parts of the software to his or her local disk and, when done making changes, commits the changed files to the central repository. Most operations, such as committing, updating or browsing history, require access to the repository.

In decentralized (i.e. distributed) revision control software there is no absolute central repository. Instead, a new user copies the whole repository from any existing user. The history of changes is copied together with the current code. Every user maintains a complete copy of the repository, so centralized backups are far less critical. In practice, it is customary for a project to keep a "dummy user" account which is used for release testing and nightly builds, or is linked to a continuous integration system such as Hudson.
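The difference between the two models can be illustrated with a toy sketch. This is not how Subversion or Git work internally; the names Repository and CentralServer are invented for this illustration, and a "commit" is reduced to a message in a list. It only shows that a centralized commit needs the server, while a distributed clone carries the full history and accepts commits locally.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Repository:
    """Toy repository (hypothetical): an ordered list of commit messages."""
    history: List[str] = field(default_factory=list)

    def commit(self, message: str) -> None:
        self.history.append(message)

    def clone(self) -> "Repository":
        # A distributed clone copies the whole history, not just the latest state.
        return Repository(history=list(self.history))


class CentralServer:
    """Toy centralized model: one shared repository that must be reachable."""

    def __init__(self) -> None:
        self.repo = Repository()
        self.online = True

    def commit(self, message: str) -> None:
        if not self.online:
            raise ConnectionError("central repository unreachable")
        self.repo.commit(message)


# Centralized: without the server there is no commit.
server = CentralServer()
server.commit("initial import")
server.online = False
try:
    server.commit("fix typo")
    central_commit_ok = True
except ConnectionError:
    central_commit_ok = False

# Distributed: clone once, then commit locally with no server involved.
alice = Repository(history=["initial import"])
bob = alice.clone()
bob.commit("fix typo")

print(central_commit_ok)  # False
print(bob.history)        # ['initial import', 'fix typo']
print(alice.history)      # ['initial import']
```

Note that Bob's commit stays in his own copy until somebody fetches it, which is exactly the property the cultural discussion later in this entry turns on.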

Growing Popularity

According to recent figures from the Eclipse Community Survey [1] and ITJobsWatch [2], Git has in the last few years become about as popular as Subversion in the business world as well. Among open source hobbyist developers Git has already been popular for some time. However, the job-market statistics suggest that Subversion has not actually lost much ground in absolute terms: the number of advertised Subversion positions has stayed roughly level while Git has grown. Subversion is the direct descendant of the once hugely popular CVS (Concurrent Versions System), and a great number of enterprises still run CVS and will only consider changing to Subversion.

Year    Git      Subversion
2009     2.4%    57.5%
2010     6.8%    58.3%
2011    12.8%    51.3%
2012    27.6%    46.0%
2013    36.3%    37.8%

Results of the Eclipse Community Survey regarding SVN and Git usage.

        Permanent positions     Rank
Year    Git     Subversion      Git     Subversion
2012    1167    3354            263      91
2013    2049    2836            157     107
2014    3605    3265             90      99

ITJobsWatch: Git & Subversion.
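As a quick check on how to read these numbers, the changes between the first and last year of each table can be computed directly. The figures below are copied from the two tables above; the variable and function names are only for this sketch.

```python
# Figures copied from the two tables above.
eclipse = {  # year: (Git share %, Subversion share %)
    2009: (2.4, 57.5),
    2010: (6.8, 58.3),
    2011: (12.8, 51.3),
    2012: (27.6, 46.0),
    2013: (36.3, 37.8),
}
jobs = {  # year: (Git positions, Subversion positions)
    2012: (1167, 3354),
    2013: (2049, 2836),
    2014: (3605, 3265),
}


def change(table, first, last, column):
    """Difference between the last and first year for one column."""
    return table[last][column] - table[first][column]


git_share_change = round(change(eclipse, 2009, 2013, 0), 1)
svn_share_change = round(change(eclipse, 2009, 2013, 1), 1)
git_jobs_change = change(jobs, 2012, 2014, 0)
svn_jobs_change = change(jobs, 2012, 2014, 1)

print(git_share_change, svn_share_change)  # 33.9 -19.7
print(git_jobs_change, svn_jobs_change)    # 2438 -89
```

Read this way, the survey shows Subversion's share among respondents falling by about 20 percentage points, while the number of advertised Subversion positions is nearly unchanged (89 fewer) over the period in which Git positions roughly tripled.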

(De)centralized Culture

I will not concentrate on the technical side of revision control but rather on the cultural aspects that these two very different solutions foster.

Version control is one of the most important parts of running a project or participating in one. After all, we use it daily. The program we use to access the repository is one of our most frequently used tools. Therefore, when it feels like it refuses to co-operate with us, it immediately becomes a major irritation. So it must be simple, reliable and fast.

But more than a tool for programmers, version control is also a link between project leadership (maybe even middle-level management, depending on company structure) and developers and architects. It provides us with (inflexible?) boundaries on how we shape our work.

Ben Collins-Sussman, one of Subversion's designers, claims that decentralized version control works badly for teams which don't consist of equally competent people. He quotes some requests [3] he got when developing Subversion:

Can you guys please give Subversion on Google Code the ability to hide specific branches?
Can you guys make it possible to create open source projects that start out hidden to the world, then get revealed when they're ready?
Hi, I want to rewrite all my code from scratch, can you please wipe all the history?
Developers are human, and they have a tendency to want to work privately, in a cave, and then spring "perfect" code on their community, as if no mistakes had ever been made. In a decentralized version control environment it can be too easy to "slip" into isolation, thinking that committing into your own repository serves the same purpose as committing to the central repository. But this is not the case. The local copy of the repository is for the developer's hourly or daily use as a local backup, whereas the central repository is "public", so the project manager and others can see where the developer is going. The project policy could be to commit every day before finishing work, and if the central repository is connected to a continuous integration system with unit tests, errors and bad solutions will be discovered earlier. Collins-Sussman quotes Google's culture and mantra: don't run from failure - fail often, fail quickly, and learn.

On the other hand, if the team is small and every developer is at about the same level, decentralized version control can foster meritocracy and a friendly competitive spirit. In a truly decentralized version control environment (without a "centralized dummy user") changes are copied directly from one user to another, so trusting the other's code becomes a necessity.

A decentralized environment is not the only way to foster meritocracy, however. The Apache Software Foundation is also known for the meritocratic structure of its open source projects. They use Subversion exclusively. Project participants are divided into three groups: users, who can make suggestions and file bug reports; developers, who submit their code but cannot commit; and committers, who have write access to the repository. Anyone can become a user, and becoming a developer only requires checking out the freely available source code from the Subversion repository. The committers' group replenishes itself from the developers' group by selecting, by common decision, those whose submitted source code is of the best quality. The GNOME Foundation, Apache Software Foundation, Mozilla Foundation, and The Document Foundation officially claim to be meritocracies.
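The three-tier structure described above can be sketched as a small permission table. This is only an illustration of the roles as summarized in this entry, not Apache's actual tooling; the action names ("suggest", "report_bug", "submit_patch", "commit") are invented for the example.

```python
from enum import Enum


class Role(Enum):
    USER = "user"
    DEVELOPER = "developer"
    COMMITTER = "committer"


# Hypothetical action names; each tier adds to the rights of the one below it.
PERMISSIONS = {
    Role.USER: {"suggest", "report_bug"},
    Role.DEVELOPER: {"suggest", "report_bug", "submit_patch"},
    Role.COMMITTER: {"suggest", "report_bug", "submit_patch", "commit"},
}


def can(role: Role, action: str) -> bool:
    """Return True if the given role is allowed to perform the action."""
    return action in PERMISSIONS[role]


print(can(Role.DEVELOPER, "submit_patch"))  # True
print(can(Role.DEVELOPER, "commit"))        # False
print(can(Role.COMMITTER, "commit"))        # True
```

The point of the sketch is that write access is the only right that separates committers from developers, which is why a single central repository is enough to enforce the hierarchy.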

Centralized version control favours a more structured organization, whereas decentralized version control can suit a self-forming or self-governing team, or a hobbyist group. On the other hand, decentralized tools demand somewhat more technical know-how, especially Git. Git is powerful but somewhat complicated to use, and in daily usage it is more error-prone (or gives that appearance) than its main decentralized competitors Bazaar and Mercurial, not to mention centralized Subversion.

Naturally, decentralized version control can suit a well-structured organization or company as well, but it requires stricter guidelines and processes to guide its usage, which may in part nullify its benefits.

Conclusion

The question of the team's and organization's culture is the most important one. As mentioned above, version control is a daily tool, and its users' culture will influence the way it is used. The opposite also holds: the version control tool will influence its users by favouring certain workflows and usage patterns over others.

References

1. Eclipse Community Survey Report 2013. Retrieved 2014-06-13.
2. ITJobsWatch. Retrieved 2014-06-13.
3. Brian W. Fitzpatrick and Ben Collins-Sussman, Team Geek, A Software Developer's Guide to Working Well with Others, 2012, First Edition, O'Reilly Media.