Distributed Development
A recent post on the scrumdevelopment yahoogroup got me thinking about the age-old problems of distributed software development. The author of the post described having a Product Owner in California, developers in CA, TX, NC, & India, and QA in India. (No mention if all of the workers in India were in the same place.) I’m not picking on this individual. I’ve heard similar stories many times.
It reminded me of something I’d read in Martin Fowler’s book, Patterns of Enterprise Application Architecture. On page 88, figure 7.1 shows a common but distressing architectural design of distributed objects. In the diagram, the components for Invoice, Customer, Order, and Delivery are each deployed to separate machines. Why do it this way?
- Because the vendor said you could.
- Because you think it can scale by adding more hardware to a component that’s overloaded.
Martin points out that such a design “will usually cripple performance, make the system much harder to build and deploy, or, usually, do both.”
The primary reason that the distribution-by-class model doesn’t work has to do with a very fundamental fact of computers. A procedure call within a process is very, very fast. A procedure call between two separate processes is orders of magnitude slower. Make that a process running on another machine and you can add another order of magnitude or two, depending on the network topology involved.
As a result, the interface for an object to be used remotely must be different from that for an object used locally within the same process.
The same, and more, is true of human interactions. Ken Crow reports, “a study in 1977, researching the effect of distance on technical communication, found that the probability of communication rapidly decreases within the first ten meters.” The notice for the CSCW 2008 Workshop on Supporting Distributed Team Work says,
It doesn’t take much distance before a team feels the negative effects of distribution – the effectiveness of collaboration degrades rapidly with physical distance. People located closer in a building are more likely to collaborate (Kraut, Egido & Galegher 1990). Even at short distances, 3 feet vs. 20 feet, there is an effect (Sensenig & Reed 1972). A distance of 100 feet may be no better than several miles (Allen 1977).
Can your software development team stand such a drag on interpersonal communication?
That’s not to say that you can’t get anything done with a distributed group of people. A small group of motivated people can overcome huge barriers, and they will because they want to do so.
The point is that if you’re organizing work for a group of people large enough that there seem to be economies to spreading the work around the globe, then spreading the work by task or skill set is not the best form of distribution. And it’s an order of magnitude less desirable if you have any hopes of agility.
Agile software development gains many of its advantages from improved communication between the people specifying the software, the people developing the software, and the people checking that the software is as specified. These people need to communicate a lot if they are to stay in sync. If they have to write things down to communicate, it’s going to go a lot slower and the communication will happen in bigger chunks, with less opportunity for clarification. Talking daily, face to face, offers a much higher communication bandwidth. It allows for non-verbal confirmation of the spoken word, and non-verbal clues that the spoken word has been misinterpreted. Working together in a team room ups this bandwidth by a couple more orders of magnitude. You can ask a question the moment it occurs to you, and get an immediate response. You can validate your understanding of each sentence of that response, if that’s what you need.
If I have an address class, a good interface will have separate methods for getting the city, getting the state, setting the city, setting the state, and so forth. A fine-grained interface is good because it follows the general [Object Oriented] principle of lots of little pieces that can be combined and overridden in various ways to extend the design into the future.
A fine-grained interface doesn’t work well when it’s remote. When method calls are slow, you want to obtain or update the city, state, and zip in one call rather than three. The resulting interface is coarse-grained, designed not for flexibility and extendibility but for minimizing calls. Here you’ll see an interface along the lines of get-address details and update-address details. It’s much more awkward to program to, but for performance you need to have it.
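To make the contrast concrete, here is a minimal sketch in Python. The names (Address, AddressDetails, RemoteAddressFacade) and the round_trips counter are my own illustrative assumptions, not code from the book; the counter merely stands in for real network calls. The coarse-grained wrapper follows the general shape of what Fowler calls a Remote Facade passing a Data Transfer Object.

```python
from dataclasses import dataclass


@dataclass
class AddressDetails:
    """Plain bundle of fields that crosses the (simulated) wire in one piece."""
    city: str
    state: str
    zip_code: str


class Address:
    """Fine-grained interface: one small method per field, meant for in-process use."""

    def __init__(self, city, state, zip_code):
        self._city = city
        self._state = state
        self._zip_code = zip_code

    def get_city(self):
        return self._city

    def set_city(self, city):
        self._city = city

    def get_state(self):
        return self._state

    def set_state(self, state):
        self._state = state

    def get_zip_code(self):
        return self._zip_code

    def set_zip_code(self, zip_code):
        self._zip_code = zip_code


class RemoteAddressFacade:
    """Coarse-grained interface for remote callers: each call moves all fields."""

    def __init__(self, address):
        self._address = address
        self.round_trips = 0  # hypothetical counter standing in for network calls

    def get_address_details(self):
        self.round_trips += 1  # one round trip, however many fields are read
        a = self._address
        return AddressDetails(a.get_city(), a.get_state(), a.get_zip_code())

    def update_address_details(self, details):
        self.round_trips += 1  # one round trip, however many fields are written
        self._address.set_city(details.city)
        self._address.set_state(details.state)
        self._address.set_zip_code(details.zip_code)


address = Address("Austin", "TX", "78701")
facade = RemoteAddressFacade(address)

details = facade.get_address_details()
facade.update_address_details(AddressDetails("Raleigh", "NC", "27601"))
```

Reading and then rewriting all three fields costs two simulated round trips through the facade. Doing the same through the fine-grained getters and setters, if each were a remote call, would cost six.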
See how Martin Fowler’s description of system design mirrors the communications design of a software development team? So what does he recommend?
In most cases the way to go is clustering. Put all the classes into a single process and then run multiple copies of that process on the various nodes. That way each process uses local calls to get the job done and thus does things faster. You can also use fine-grained interfaces for all the classes within the process and thus get better maintainability with a simpler programming model.
The analog for people is to create a team at each location, complete with specifiers, developers and testers. A common way of doing this when they’re all working on one system is to have feature teams. Each team works on a single feature. Of course there still may be some overlap in the code the feature teams have to modify, but this is a much smaller communication need than between developers working on the same feature. Development teams are smart, and they can figure out how to work together. Setting up continuous integration across the work of the teams can help identify problems early, when they can be fixed with minimal effort.
Give your teams a fighting chance. Make them as cohesive as possible and reduce the need for coupling between them.
You might enjoy this blog post from a few months ago: http://chrismcmahonsblog.blogspot.com/2009/12/telecommuting-policy.html
I’d be skeptical of this team’s success, not because they are distributed, but because they have a separate QA team in India. A separate QA team is a smell. The whole team should take responsibility for quality and testing, and QA folks need to be integrated into the rest of the team.
I’ve actually met people from teams this distributed, in huge companies, and they have just as much success as a lot of co-located teams I know. It all comes down to how hard the team works to maximize face-to-face communication – even if it is face-to-face via video VOIP.
Our small team has a key developer/manager in India, and we have been able to put things in place such as the Telecommuting Policy that Chris mentions, and technology such as our mobile telepresence device (see http://lisacrispin.com/wordpress/category/remote-team-members/). We are as productive as when our remote person was on site, plus other people can work remotely as needed without the team taking a productivity hit.
Having feature teams is ideal, I think, and we are trying to component-ize our software so we could do this in the future. Also, it’s important to have enough real face time so that everyone in all locations knows and trusts each other.
The late Russell Ackoff used to repeat that the key to the performance (of any kind of system) lies in the interaction of its parts, not in the action of the parts taken separately. I was reminded of this when I read your blog post. Thanks for writing it. /Tobias