Richard Bucker

The monolithic code tree is a data cancer

Posted at — Jul 5, 2012

I’ve been a fan of Google’s approach of a single code tree for quite some time. I thought it was a pretty good idea because it could offer some unique benefits: (a) programmers might stray into folders that they might not otherwise have access to and learn or contribute something unexpected; (b) it reduces the potential that code would get duplicated; (c) the code librarian’s maintenance cycle is more manageable.

However, there is one serious side effect to having this sort of intra-company openness.

For my company I decided that a single source tree was a good idea, mainly because of (c). Since I was the librarian, I did not want to be managing tree after tree after tree (even though bitbucket.org offered unlimited private repositories).

The issue is actually much bigger than this. Google turns over its codebase several times a year. Also, they share a lot of code between projects. And while this works for Google, it’s not the way most businesses run. With the exception of Google’s spiders, Google provides a read-only service to its users, whereas most businesses are read/write, and the writes must be consistent and reproducible. That is the antithesis of the “Google Way”.

So as I look at my FlaFreeIT repository I realize that there are projects that I will never update or repair, for clients that have long since departed. It might simply be better to have separate trees. If not because of local storage, then because of clone latency and performance, the risk of leaking or losing code, or any number of other justifications.

One last thought. When Googlers make their presentations, it’s likely that these are on Google Apps. And that makes sense. But do they really have separate systems for presentations and development? And if so, do they really keep all that source code on each of their laptops? That’s got to be painful!