On the limits of TDD, and the limits of studies of TDD

Greg Wilson writes:

This painstaking study is the latest in a long line to find that test-driven development (TDD) has little or no impact on development time or code quality. Here, the authors repeated an earlier study with a couple of new wrinkles, and then blinded their data before giving it to someone else to analyze to remove any possibility of bias. The result: no significant difference between TDD and iterative test-last (ITL) development.

I think it’s really important to pay attention to studies like this. Which is why I’m so glad that Greg is out there drawing attention to empirical science being done on software engineering.

It’s also important to keep in mind that science is always limited by the questions being asked. In this case, my eye was drawn to the experimental design:

The baseline experiment utilized a single experimental object: the Bowling Scorekeeper (BSK) kata. The task required to participants to implement an API for calculating the score of a player in a bowling game. The development of a GUI was not required. The task was divided into 13 user stories of incremental difficulty, each building on the results of the previous one. An example, in terms of input and expected output, accompanied the description of each user story. An Eclipse project, containing a stub of the expected API signature (51 Java SLOC); also an example JUnit test (9 Java SLOC) was provided together with the task description.

(Emphasis mine.)

A commenter on Greg’s blog already noted that this is an exceptionally tiny example coding problem, and questioned whether results on such a small, easy-to-conceptualize program scale meaningfully to real-world software projects. I think that’s a valid criticism.

But I’m more interested in just how well-defined the problem is.

For me, perhaps the greatest value in practicing Test-Driven Development has always been getting over the blank-page brain-freeze towards the beginning of writing a software component. And how, at the same time, TDD forces me to tightly define the problem before addressing it. My TDD process has always been dominated by these questions:

  1. Am I done yet?
  2. …well, do the tests pass?
  3. …and if they do, do the tests describe a completed solution?

This discipline has done more than any other I’ve tried to keep me focused, to help me whittle down the problem statement to specific essentials, and to avoid superfluous tangents.

In the experimental design quoted above, most of that mental work has already been done.
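
To make that concrete, here is a rough sketch of my own, in Python rather than the study's Java, with a function name and test cases I made up (they are not the study's stub or its 13 user stories). The assumption it illustrates: when each story arrives with an example input and expected output, writing the test is close to transcription, and "do the tests describe a completed solution?" has largely been answered before any code exists.

```python
# Hypothetical sketch: `score` and the test cases below are illustrative
# assumptions, not taken from the study's materials.

def score(rolls):
    """Total score for one complete ten-pin game, given pins knocked down per roll."""
    total, i = 0, 0
    for _ in range(10):                        # a game has ten frames
        if rolls[i] == 10:                     # strike: 10 plus the next two rolls
            total += 10 + rolls[i + 1] + rolls[i + 2]
            i += 1
        elif rolls[i] + rolls[i + 1] == 10:    # spare: 10 plus the next roll
            total += 10 + rolls[i + 2]
            i += 2
        else:                                  # open frame: just the pins
            total += rolls[i] + rolls[i + 1]
            i += 2
    return total


# One test per "story". In a test-first flow these would be written,
# and seen to fail, before the corresponding branch of `score` existed.
def test_gutter_game():
    assert score([0] * 20) == 0

def test_all_ones():
    assert score([1] * 20) == 20

def test_one_spare():
    assert score([5, 5, 3] + [0] * 17) == 16

def test_one_strike():
    assert score([10, 3, 4] + [0] * 16) == 24

def test_perfect_game():
    assert score([10] * 12) == 300
```

Nothing in this sketch required deciding what the component should do, which is the part of TDD I find most valuable.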

I also quipped on Twitter:

…which may seem flippant, but I’m actually kinda serious about it. Research, understandably, tends to focus on questions that are easier to ask. But a lot of the most important questions (in my opinion) that need to be asked about software today have to do with difficult-to-measure externalities. Technical debt is one such tough-to-measure externality. But even more difficult, and more vital, to ask are questions like: how much of our developers’ happiness, wellbeing, and calm are we burning to achieve these easily-measured productivity/quality results? What state are we leaving developers in for their next project?

I’m glad research like the study cited above is happening. We need to be mindful of it. But we also need to be aware of the questions that aren’t being asked.


  1. I don’t know if we ever will achieve anything really useful using some scientific approaches to software development.

    I mean, at least in my opinion, coding is more akin to the arts than to engineering: a pure mental process used to turn ideas and concepts (that make sense to the author) into something that can be used/appreciated by others. We have good practices and a bunch of theory behind us? Well, so do the so-called fine arts.

  2. Admittedly I don’t have the discipline to do TDD. However, I do endeavor to have tests for my code, and require them before it’s considered ‘done’. At this point in my career, I have very few ‘blank page’ moments. When I start ‘thinking’ about the problem, and its possible solutions, I’m already in ‘design mode’. To coin a phrase, I (already) see Objects and Messages; I do consider myself an OO developer. Occasionally, though, the tests do help me flesh out the interfaces between my objects.

    Now, having said that, I find myself in the position of not being a good mentor when developers with less experience want to learn and use TDD. I need to do better in this area.

    However, you raise an interesting point about developer anxiety, and burnout.

    I feel that question points to the difficulty in obtaining empirical data. How many years of experience does the developer have? Just starting out I was very anxious (and not confident) about the code I produced. How many years has the developer practiced TDD? I liken this to riding a bike: Has the developer just learned to ride a bike? Have they been riding bikes for a while? Do they still require training wheels? And then I feel there are mentor-related questions: Does the developer have access to a mentor? How ‘seasoned’ is the mentor? What’s the mentor’s TDD competency level?

    Or I could be complicating the issue; there’s always that.

    Thank you, as always, Avdi.


  3. I think in general there are two ways of approaching coding problems: build up, or break down.

    I think we can agree that TDD is the best approach to a “build up” style of creating software. Starting with the “blank page”, you define what you want via tests, then implement code to make the tests pass, and your blank page becomes the defined component.

    My brain doesn’t work that way, though. When I was a kid, I liked taking Legos APART and THEN putting them together. Building them from scratch was less exciting for me. So when I approach the blank page, I tend to follow the “break down” (ITL – Iterative Test Last) approach. I put together the quickest, simplest code for the basic spec (make the blank page not blank anymore); write tests to define the spec; refactor to make the tests pass with clean code (break it down); and implement any remaining code for edge case specs.

    TDD-ists would argue that the ITL approach can lead to less clean code, as I can write tests in a way that “protects” my existing code approach.

    However, I feel that ITL when combined with legit code reviews protects against that criticism.

    All of this is a long way of saying that testing style probably should reflect problem solving style, but that good unit tests are important no matter when they’re defined.

  4. I agree entirely with your reaction. Most of what TDD does for me is give me a framework for the mental work of clarifying & defining the problem into pieces which are (1) small enough for me to work on and (2) verifiable.

  5. Not being a pure TDD-ist at all, I can feel the architectural urge for well-defined structures and code. What happens to me is that as soon as I define the interface in my mind, it goes hand in hand with some implementation. Often tests end up changing it completely, but the very first step is just some actual answer to “how would I do that?” Followed shortly by the very second step: “how do I prove myself right?”

    Writing a proof for something that does not exist yet feels quite unnatural.
