TDD, Options, and Culture
The background for this post: Gil Zilberfeld posted a link to a blog post by Joshua Kerievsky. I had some questions and Josh’s response, to me, seemed to change the story a bit. As I suspected, the story is hypothetical, and changes to support the intended conclusion. This always makes it very slippery to talk about a story. The details matter. And in this case, the change overwrites the original, without any indication of what has changed. That makes it even harder to have a conversation about it.
I’ll give a synopsis of the story here, since it may change again. In brief,
- “A customer reports a defect. It’s inhibiting them from getting some important work done.”
- David finds the defect and test-drives a fix, including refactoring to re-design the code around the defect. He spends two hours doing so.
- In a parallel universe, Sally finds the same defect, fixes it without an automated test, hand-tests the fix, pronounces it good and pushes it to production. There’s no indication of how long this took, but presumably it’s intended to be less than two hours.
What’s different between the versions?
The second version of the story talks up Sally’s programming skill and experience. In the first version, the code in question has no tests; in the second, it “isn’t completely without tests, but the test coverage isn’t great.” The second version also adds a lot more justification for Sally’s actions and assurances that this isn’t her “preferred approach to programming.”
Both versions ask “which is better.”
The Rule of Three
The question is a false dichotomy. Is one better than the other? Better for whom, when, and in what way? And why these two choices?
Jerry Weinberg often phrased the Rule of Three this way: “If you can’t think of a third option, you haven’t thought about it enough.” Virginia Satir said “One option is a trap, two is a dilemma, and three is a choice.” I’ve often found that, when stuck on a dilemma, when I think of a third option then many others quickly follow. It must be exceedingly rare that there are only two options.
Personally, I wouldn’t do what either David or Sally did. I would start by writing a test for the scenario reported by the customer. I would use that test to help me find the issue, and to validate the fix afterwards. The second version of the story says, “The code isn’t completely without tests, but the test coverage isn’t great.” Since there are already some tests, adding another one won’t take long at all. I might look at the existing tests and list the other edge cases that the current tests overlook. I might write those tests now, too, just to characterize the current behavior of the system before modifying it. That’s a safety measure that neither David nor Sally is mentioned as taking.
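To make the idea concrete, here is a minimal sketch of what characterization tests might look like. Everything here is invented for illustration; the story never shows the actual code, so assume some hypothetical pricing helper in the “old code” whose edge cases the existing tests overlook.

```python
# Hypothetical example: a pricing helper in the "old code" whose edge
# cases the existing tests overlook. Names and behavior are invented
# purely for illustration.
def apply_discount(price: float, percent: float) -> float:
    """Return price after deducting percent, rounded to cents."""
    return round(price * (100 - percent) / 100, 2)

# Characterization tests: they pin down what the code does *today*,
# not what it should do. If a later change makes one fail, either the
# change was wider than intended, or the behavior change was deliberate.
def test_typical_discount():
    assert apply_discount(100.00, 10) == 90.00

def test_zero_percent_leaves_price_unchanged():
    assert apply_discount(59.99, 0) == 59.99

def test_discount_over_100_goes_negative():
    # Surprising, possibly a bug -- but it is the current behavior, so
    # record it now and raise it in conversation later, rather than
    # silently "fixing" it while chasing a different defect.
    assert apply_discount(50.00, 150) == -25.00
```

The point of the third test is that a characterization test asserts what the code does, even when that looks wrong; changing that behavior should be a separate, deliberate decision.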
Having a failing test, and presuming that the defect in “old code” is as obvious to me as it is to David and Sally, I would fix it. Probably I would first write a failing test around the defect covering as little code as possible, especially if my first failing test required deploying to a server and testing through a user interface. This smaller scope test will let me work faster, and provides safety if I make a typo or have overlooked something in unfamiliar code.
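A sketch of what “a failing test covering as little code as possible” might look like, assuming (purely hypothetically, since the story never shows the defect) that the customer’s report traced to a quantity-validation helper with an off-by-one boundary:

```python
# Hypothetical defect: customers report that orders of exactly 100
# items are rejected, though the limit is documented as inclusive.
# All names here are invented to illustrate scoping the test narrowly.

MAX_ITEMS = 100

def quantity_allowed(quantity: int) -> bool:
    # Before the fix this read `0 < quantity < MAX_ITEMS`, wrongly
    # rejecting orders of exactly MAX_ITEMS -- the reported defect.
    return 0 < quantity <= MAX_ITEMS

def test_boundary_quantity_is_allowed():
    # Written first and watched fail against the buggy `<`; the
    # one-character fix then made it pass -- no server deploy, no UI.
    assert quantity_allowed(MAX_ITEMS)

def test_quantity_above_limit_is_rejected():
    assert not quantity_allowed(MAX_ITEMS + 1)
```

Because this test exercises only the helper, it runs in milliseconds, which is exactly what makes the tighter scope faster to work with than a test that drives the whole deployed system.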
Fixing the defect and seeing the tests pass tells me I’ve corrected the problem that the customer reported. My characterization tests tell me that I haven’t changed nearby behavior in the process. If the rest of the test suite passes, I know as much about the success of my fix as I can know in a short period of time.
How much slower was that than Sally’s approach? Not much, if any, I’d guess. I don’t know Sally’s manual testing skill, but I generally find that I can set up test conditions more quickly and more accurately using code than I can manually. And it was certainly safer, having checked that the existing behavior I didn’t want to change had been left undamaged.
Of course, after satisfying this customer I would likely want to tidy up the code and tests as mentioned in Josh’s story. I might also want to start a conversation about alternative conditions where the behavior described in my characterization tests didn’t seem quite right to me.
The sense of urgency, differentiating between an approach that takes two hours and one that takes some unspecified amount of time less than that, is never explained. Sure, this defect is “inhibiting some important work,” but that doesn’t necessarily imply urgency. Perhaps it means the work is taking longer than desired. Perhaps it means they’ll have to shift to some other important work until this is fixed. If it were so important as to be a threat to life or the continued existence of the company, perhaps it would have been better to take more care all along this system’s lifecycle.
ASAP, or “as soon as possible,” usually means “as soon as reasonably possible.” Rarely is that time measured to high precision. In many situations, getting a fix in a day or two is considered fast, and getting one in a couple of hours would be considered a miracle. If you need fixes within minutes, I suggest you take greater care in making the code highly tested, highly testable, easily understood, and, in general, highly maintainable.
The need for making changes safely is mentioned in the story. In fact, it suggests “always ensuring I work as safely as possible.” Even here, “as safely as possible” is no more absolute than “as soon as possible.” As I indicated above, I don’t think Sally’s approach is “as safely as possible,” given that I can offer a safer method. Josh’s posting takes Sally’s approach as “safe enough,” even without checking that other previously untested behavior in this particular bit of code isn’t inadvertently changed. Both “safe enough” and “fast enough” are judgement calls. It seems that Sally has focused on “fast” while David has focused on “safe.” I would prefer programmers who are like neither David nor Sally, but who look at the options and make judgements. Neither rigidly following a particular process nor jumping straight to action when there’s a hiccup seems a preferable response to me.
The culture of a group is sometimes described as “the stories we tell ourselves about ourselves.” I like that, but it neglects the behavior of the group. What we actually do is also part of our culture.
I once worked for a company that had a sterling reputation. Friends in other companies said what a good job they did. When I joined the company, the new employee orientation spent a great amount of effort emphasizing doing great work and supporting fellow employees. After orientation, though, I kept running into situations where people said “we don’t usually work this way, but just this one time…” They had great pride in their software development skills and processes, but I came to understand that it amounted to “we know so much about software development that we can get by without doing what we say is important.”
On one project we were determined to do code reviews, but cranking out functionality took higher priority. By the time we sat down to review code, months later, we had a stack of printouts that was at least three feet tall, and a day to do it. There was no way to review that code.
The quality of the code was dependent on individual programmers being really smart, like Sally, and able to write code that didn’t affect existing functionality. The programmers were generally really smart, but not perfect. When I had to fix a bug in someone else’s code, I saw how less-than-perfect it might be.
So culture is not just what we say about ourselves, but also how we behave when the chips are down. Josh’s story says TDD is Sally’s preferred approach to programming, but when pushed, she quickly abandons it. That makes me think it’s not her preferred approach. She prefers leaning on her brilliance and making the change. In this story, she apparently got lucky and didn’t break anything else.
I’ve been in a situation a couple of notable times where I was asked to write code under pressure of both time and consequences. In one case, I had a few weeks, and I doggedly worked test-driven, aggressively refactoring to remove duplication and drive the functionality close to where it was needed. I did so not just because TDD was my preferred approach, but because I was not convinced that I could accurately change the functionality in all the required places without doing so. There were so many places that were doing the same thing in slightly different ways. In a few places, I found instances where they were missing some of the flows of other places—bugs that just hadn’t been noticed.
In another time and place my manager asked me at 5:00 PM to add a new feature to a system that was going to system test and production that evening. “This is really urgent, so relax your standards about testing to make sure this gets out.” I didn’t argue, but I still test-drove the new functionality. I didn’t do it out of purity, but because that was the fastest way that I could make the change safely.
What I did skip was GUI level automated tests. I had been doing back-end work, so I hadn’t set up to write test scenarios through the GUI. The person who had been working at the GUI level didn’t write automated tests. I figured that it would take me a couple hours to set up the infrastructure to test via the GUI, so I manually checked that the GUI was wired correctly, after fully testing all of the business functionality via back-end automated tests. I made a judgement call that took into account both speed and safety.
What I didn’t do was abandon ways that I knew gave me superior results just because I was under pressure. Instead, I focused on doing them even better, more efficiently and more helpfully. If you drop a practice out of pressure, it may not really be part of your development culture. It may just be window dressing.
It may be that Sally is a “rockstar developer” and can get by hacking things in. I’m just a competent developer, and I can make errors in the smallest of changes. I’m perhaps a competent tester, but it generally takes me longer to test things by hand than with automation. It’s also easier for me to overlook something. Using TDD enables me to do “rockstar” work without special talents. I go faster because I don’t have to second-guess anything I type until a test unexpectedly fails (or unexpectedly passes). I can code about as fast as I can type until I need to slow down and take a closer look. And when I make an error, it’s right there in front of me, and I can quickly correct it. I go for both speed and safety by leaning on a technique that supports me.