
Apache and Politics Over Code?

Mikeal Rogers just wrote a fascinating blog post, Apache considered harmful.

I have a lot of respect for the Apache community but I’m glad that someone is finally calling them out. The Apache community likes to pride itself on community over code, but what has been happening recently regarding the move to a distributed version control system is, in my opinion, either pure politicking or negligence.

You would have to be living under a rock not to have noticed the change that distributed version control, and Github in particular, has brought to the open source world. Can you name any other major open source project (besides Apache) that is not on some form of distributed version control and has no concrete plan to move? I can’t, at least off the top of my head. This is because the times have changed: open source projects are more mainstream now and they especially favor distributed forges like Github.

Let’s try to have some fun with statistics. From a recent presentation by Stephen O’Grady from Redmonk, Github’s growth is almost unbelievable…

I’m confident if he updated the excellent presentation again, it would further show the distance between Github and the other forges. Heck, even throw in Bitbucket (Hg and Git now) and Launchpad (Bzr) to see how fast they are growing compared to the others. Another statistic we can look at to further spot this trend is package statistics from Debian…

That’s impressive growth for Git but still shows that SVN is doing OK (poor darcs). It would be great to see more download statistics but I can’t think of other easy sources at the moment. We can also analyze search volume via Google Trends to see what people are searching for over time…

Clearly git (including github) and mercurial are trending upwards. One could argue that this is because git and mercurial are harder to learn, so people search for them more, but I doubt that’s the complete story. I didn’t include cvs (a famous U.S. pharmacy) or bazaar (ambiguous) because they are searched for in other contexts and I don’t know how to tweak Google Trends to filter those out. While doing these searches I wanted to test another hypothesis of mine. From personal experience, I believe that distributed version control adoption is lagging in the corporate world. The main reason for this line of thinking is that corporations are obviously slower than open source communities in adopting new technologies. To test this theory, I used Indeed to perform a search and see how things are going…

From the looks of it, CVS/SVN are still the dominant players, with ClearCase hilariously staying somewhat constant over time. However, I’m sure this graph is going to look quite different in a couple of years as the tools around distributed version control systems mature. I also believe developers will start asking for a form of distributed version control as they experience it in the wild (see git-svn). I was curious to see if LinkedIn had anything to help shed more light on what is going on in the software industry and found their LinkedIn Skills application. I couldn’t find a good way to group and compare relative skills but I found some interesting information. In terms of relative growth, git seems to be trending well…

In terms of skill size, svn is still doing well.

I was curious to see how CVS was doing also…

CVS is experiencing negative skill growth. Then I noticed CMVC in the trends, which reminded me of bad times, and I knew it was time to stop digging for statistics.

Why do I care? Two main reasons. The first is simple and deals with my day job of facilitating open source efforts at Twitter. If you’re going to open source a new project, the fact that you simply have to use SVN at Apache is a huge deterrent to even going that route. It would be easier to simply host the code at Github or a similar forge and take what lessons you need from The Apache Way. There are a lot of tools available to help you with the infrastructure of your project (e.g., you can use CloudBees or Travis CI to help you with continuous integration). The point here is that continuing to use SVN is not going to help Apache grow. When is the last time you heard a developer all excited about using SVN?

Another reason is that I have personal experience with this particular issue, as I spent the last couple of years helping the Eclipse Foundation transition towards git. It’s a large transition because there are roughly 1000 committers and over 200 projects using a mix of CVS and SVN. On top of that, it took convincing the EGit/JGit projects to move to eclipse.org and a couple of board meetings and votes to make that happen. Furthermore, the git tooling had to get up to snuff before the majority of eclipse.org projects started to adopt git, since the previous generation of SCM tooling (e.g., CVS) spoiled Eclipse developers. All I’m saying is that it took a lot of work to start the transition and the Eclipse community hasn’t even fully completed it yet. Just ask the PostgreSQL community how quick it was moving to Git. The key point here is that you have to start the transition soon, as it’s going to take a while to implement the move (especially since Apache hosts a lot of projects).

In the end, I’m a huge fan of the Apache Foundation and The Apache Way, as a lot of us have benefited and learned from Apache in some fashion. I just hope the Apache community learns to evolve or they will become less relevant in the new open source world order of distributed version control systems and the forges behind them. I take this problem to heart because I believe The Eclipse Foundation faces some of the same issues and we’re doing our best to mitigate them.

  • Ivan Zhakov

    It seems “git” means “to go” in Turkish. I recommend changing the Google Trends link to include only searches from the USA (as the largest possible region).

  • Ivan Zhakov

    The correct Google Trends link:
    http://tinyurl.com/cmrzq7g

  • Joakim Erdfelt

    The same level of politics exists at Eclipse when it comes to Maven the build tool, any artifact repository system that isn’t P2, and projects that don’t use OSGi.

  • http://aniszczyk.org Chris Aniszczyk

    Thanks, I updated the link.

  • http://aniszczyk.org Chris Aniszczyk

    I agree that Eclipse has its bias towards p2. There was good reason for this at the time because Eclipse was an early adopter of OSGi and needed a build and provisioning system that worked well with OSGi. There was nothing really available then: OBR was hardly used and Maven was still ignoring OSGi. The litmus test for adopting a build system was trying to find something that could build the Eclipse SDK from scratch, and that’s a very hard problem.

    It’s a bit unfortunate that Eclipse didn’t push certain artifacts to Maven central in the past; I think that would have helped with adoption. However, with Sonatype joining the Eclipse Foundation as a member, things have changed a bit. They have pushed the Tycho project forward, which integrates Maven with OSGi… this is what a lot of Eclipse projects are moving to as the de facto build system. As for having a Maven repository at Eclipse, there’s a bug still open (https://bugs.eclipse.org/bugs/show_bug.cgi?id=337068) that captures some of the public discussion and what is currently blocking the effort. We also have a test Nexus instance (http://maven.eclipse.org/nexus/) up at Eclipse that a few projects are currently trying out, which should help with mirroring to Maven central automatically.

    In the end, Eclipse has some way to go with Maven but things have greatly improved since Sonatype joined the Eclipse Foundation and helped the Tycho project mature.

  • http://twitter.com/ingorenner Ingo Renner

    I would not worry too much about it, I know the ASF _is_ transitioning, but as you acknowledged already, it’s a ton of projects. So there’s on one hand the need to move the people, and on the other hand there’s the tooling and technology. Apache CouchDB is used as a test bed and I think Phonegap will be on git, too. Just give it some time… Other than that there’s at least http://git.apache.org already :)

  • http://thomas.koch.ro/ Thomas Koch

    The Debian popcon graph is not accurate because the package name for git was git-core until April 2010. Before that, the name git was used by another package.

    There is, however, another impressive Debian statistic, which is the VCS used to maintain Debian packages: http://upsilon.cc/~zack/stuff/vcs-usage/
    So among DVCSs, Git has won over Debian developers, and I can tell that there is a steady migration from SVN to Git in Debian.

    There’s another interesting fact, which is the list of large free software communities using Git:
    Debian, Drupal, Fedora, Gnome, KDE, Linux Kernel, Perl, PHP (decided to migrate to Git), PostgreSQL, Qt, Ruby on Rails, freedesktop.org

  • http://newstechnica.com David Gerard

    They did this to OpenOffice.org already. It was already using Mercurial, and LibreOffice converted that to git (as you can do almost-losslessly) and has been going great guns.

    Apache went “Not invented here!” and told the new Apache OOo that Subversion SHALL be used.

    The ludicrous part: OOo had had a previous disastrous attempt to move to Subversion before going to Mercurial! [1] [2]

    Never mind the health of the actual projects – everything has to push Windows Subversion as well.