Making a Mockery of TDD

I made this gnomic remark on Twitter the other day:

To be a successful mockist, you must dislike mocks.

A lot of people re-tweeted it, so I guess I’m not completely alone in thinking this way.

I should back up a bit. A “mock object”, or “mock”, is a specific kind of test double. It is an object used in a unit test which stands in for another object, and carries certain expectations about which methods will be called and how they will be called. If the expectations are not met, the test fails. By contrast, other test doubles, such as stub objects, make no assertions about which methods will be called. This article is specifically about mock objects and mocked methods, which make an assertion about when, how many times, and with what arguments certain collaborator methods will be called.
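To make the stub/mock distinction concrete, here is a minimal sketch in plain Ruby using hand-rolled doubles rather than a mocking library. Every name in it (`RateStub`, `MailerMock`, `Invoice`, `percent_for`, `deliver`) is invented for illustration and does not come from any real codebase.

```ruby
# A stub: supplies a canned answer and makes no assertions about its use.
class RateStub
  def percent_for(_region)
    20
  end
end

# A mock: records calls so the test can verify the expected interaction.
class MailerMock
  attr_reader :calls

  def initialize
    @calls = []
  end

  def deliver(recipient, total)
    @calls << [recipient, total]
    nil
  end

  # Fails unless `deliver` was called exactly once with these arguments.
  def verify!(recipient, total)
    return if @calls == [[recipient, total]]

    raise "expected deliver(#{recipient.inspect}, #{total}) once, got #{@calls.inspect}"
  end
end

# Hypothetical object under test.
class Invoice
  def initialize(amount, rates, mailer)
    @amount = amount
    @rates = rates
    @mailer = mailer
  end

  def send_to(recipient, region)
    total = @amount + @amount * @rates.percent_for(region) / 100
    @mailer.deliver(recipient, total)
  end
end

mailer = MailerMock.new
Invoice.new(100, RateStub.new, mailer).send_to("a@example.com", :eu)
mailer.verify!("a@example.com", 120)
```

The stub can be called zero times or ten without affecting the outcome; only the mock's `verify!` can fail the test.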

The term “Mockist” refers to those programmers who use mock objects in their unit tests. There is another camp of programmers, called “Classicist” by Martin Fowler, who eschew mock objects entirely in their tests.

Actually, if you haven’t read Fowler’s article on the “mockist”/“classicist” divide, go read it and then come back. There’s really no point in reading this post any further unless you have that as background. Fowler’s article is an excellent breakdown of the origin of the two camps, as well as a pretty objective evaluation of the pros and cons of both approaches to writing unit tests. Seriously, go read it right now.

I’m a Mockist. I write tests that make assertions about object interactions, not just about static outcomes. I’ve been doing it for a while, and I’m pretty happy with the results.

However, it’s a technique that’s very easy to misuse. Applied poorly, it can lead to test suites which are so brittle they have to be either extensively rewritten or thrown out entirely when it comes time to change functionality or re-architect the code.

Mock Objects are a Test Smell

When I realize that I’m using a mock where I shouldn’t be, it’s often because I haven’t spent enough time on my test tooling, and as a result I don’t have a good way to make assertions about method output.

For instance, the other day I was building a Presenter for some Rails code. The presenter held references to both a domain model object (an ActiveRecord instance) and the Rails “template” object which makes various HTML-building methods available. The presenter’s job was to take model data and format it for display.

One of the first tests I wrote went something like this:

[gist id=1195772 file=badmock.rb]

This test asserts that the presenter will call the button_to helper to generate some HTML.

I’ve been making an effort to write isolated tests which can be run without loading the full Rails infrastructure. This test completely isolates the presenter functionality from that of its collaborators. It doesn’t require any actual HTML to be generated, it just asserts that the presenter must use the button_to method and return the result.

The test felt wrong as soon as I wrote it. It was clearly brittle. I could look ahead and imagine, as the requirements for the HTML became more complex, how the mock object would have to grow more and more mock methods in order to keep up. The tests would become, in effect, an exact replica of the method implementation.

I decided to change up my spec setup a bit in order to pass in a template object which included the actual Rails tag helpers. Then I included the Capybara spec matchers for making assertions about HTML. In the end, the spec looked like this:

[gist id=1195772 file=have_button.rb]

The test is slightly less isolated now, but for a class which is all about generating strings, I think this style of test is more suitable. Note that what I didn’t do was graduate the tests to the level of an integration or acceptance test. Instead, I added the minimum coupling necessary to get rid of the suspect mock object.
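As a rough illustration of the output-oriented style (this is not the code from the gists above, and it uses neither Rails nor Capybara), here is a plain-Ruby sketch. `TinyTemplate`, `TaskPresenter`, `Task`, and the URL scheme are all hypothetical, and the stand-in `button_to` produces far simpler markup than the real Rails helper.

```ruby
# Hypothetical stand-in for the Rails template object; the real `button_to`
# helper generates more elaborate markup than this.
class TinyTemplate
  def button_to(label, url)
    %(<form method="post" action="#{url}"><input type="submit" value="#{label}" /></form>)
  end
end

# Hypothetical model with just enough data for the presenter.
Task = Struct.new(:id)

# Hypothetical presenter: formats model data using the template's helpers.
class TaskPresenter
  def initialize(task, template)
    @task = task
    @template = template
  end

  def render_controls
    @template.button_to("Complete", "/tasks/#{@task.id}/complete")
  end
end

html = TaskPresenter.new(Task.new(42), TinyTemplate.new).render_controls

# Assert on the generated output, not on which helper produced it.
raise "missing Complete button" unless html.include?(%(type="submit" value="Complete"))
```

Because the assertion only inspects the resulting string, the presenter is free to switch helpers or build the markup differently without the test needing to change.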

In my experience, well-isolated tests lead to neatly decoupled code. Not to mention that they run really fast. Rather than tie my presenter tests to the entire Rails stack, I let the mock object pain guide me to the one area where a little relaxation in my isolation discipline would lead to a big gain in productivity.

Mock Objects are a Code Smell

In the preceding case, the mock object was a test smell. But mock objects are more often smelly because they are telling you something about the system under test.

A Code Smell is defined as “a hint that something has gone wrong somewhere in your code”. A hint, not a certainty. Whenever I use a mock object to assert that a certain method call must be made, a little voice in my head says “should I really be using a mock object here?”. When a test has more than one or two mocked method calls, that little voice gets a lot bigger.

In my experience, multiple mocked methods often indicate that either a) the method being tested is trying to do too much; or b) the method being tested has high structural coupling. As I wrote in another testing article, mocks are like the hydrogen peroxide of programming. They “fizz up” when they encounter highly-coupled code. Show me a test with a lot of mocked method calls, and I’ll show you a class under test which violates the law of Demeter. And which is, consequently, a liability to future code changes.
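A small sketch of how structural coupling shows up in test setup, in plain Ruby. The classes and method names (`CoupledShippingQuote`, `ShippingQuote`, `destination_country`) are hypothetical, and the doubles are simple `Struct` instances standing in for collaborators; with a mocking library, each link in the chain would need its own expectation.

```ruby
# Highly coupled: reaches through three collaborators to get one value.
class CoupledShippingQuote
  def initialize(order)
    @order = order
  end

  def domestic?
    @order.customer.address.country == "US"
  end
end

# Decoupled: asks its immediate collaborator a single question.
class ShippingQuote
  def initialize(order)
    @order = order
  end

  def domestic?
    @order.destination_country == "US"
  end
end

# The coupled version needs a chain of three doubles in the test setup...
address  = Struct.new(:country).new("US")
customer = Struct.new(:address).new(address)
order    = Struct.new(:customer).new(customer)
raise "expected domestic" unless CoupledShippingQuote.new(order).domestic?

# ...while the decoupled version needs only one.
flat_order = Struct.new(:destination_country).new("US")
raise "expected domestic" unless ShippingQuote.new(flat_order).domestic?
```

The amount of double-building in the first test is the "fizz": it mirrors the number of objects the method knows about.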

Mocks act as a canary in a coal mine: they are an early warning system that your code is beginning to depart from the path of small methods, each having a single responsibility, and each interacting with a very small set of collaborators.

Mock objects are for Design

Which brings me to my next point about using mock objects: mock objects are all about design. They were developed hand-in-hand with the “Behavior Driven Development” movement, which sought to re-emphasise the design component of TDD. Here are the inventors of mock objects, talking about them in the intro to an OOPSLA 2004 paper on using mocks:

Mock Objects is an extension to Test-Driven Development that supports good Object-Oriented design by guiding the discovery of a coherent system of types within a code base. It turns out to be less interesting as a technique for isolating tests from third-party libraries than is widely thought.

Guiding design isn’t just a side-benefit of using mocks; it’s the primary reason to use them.

Which means that if you aren’t using tests to drive your design, there’s little point to using mock objects. I suspect this is the root of a lot of the arguments over whether to use mocks or not; if your primary reason for writing unit tests is verification and avoiding regressions, mock objects aren’t going to buy you much. And people talking about how useful they are aren’t going to make a lot of sense.

Mock objects also make more sense in an outside-in model of test-driven development. They enable developers to ferret out the needed interfaces of objects that don’t exist yet. Many if not most of the mocks I write are mocks of classes and methods I have yet to write. The mocks I write while TDDing one layer of the design give me the clues I need to work out the necessary responsibilities of the next layer down.
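One way to picture interface discovery, sketched in plain Ruby with a hand-rolled recording double instead of a mocking library. `RecordingDouble`, `Signup`, and the `create` and `welcome` messages are all invented for this example; the collaborators they stand in for are assumed not to exist yet.

```ruby
# A recording double: accepts any message and remembers it.
class RecordingDouble
  attr_reader :received

  def initialize
    @received = []
  end

  def method_missing(name, *args)
    @received << [name, args]
    nil
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

# Hypothetical top-layer object, written before its collaborators exist.
class Signup
  def initialize(accounts, notifier)
    @accounts = accounts
    @notifier = notifier
  end

  def call(email)
    @accounts.create(email)
    @notifier.welcome(email)
  end
end

accounts = RecordingDouble.new
notifier = RecordingDouble.new
Signup.new(accounts, notifier).call("new@example.com")

# The recorded messages describe the interface the next layer down must provide.
raise "unexpected messages" unless accounts.received == [[:create, ["new@example.com"]]]
raise "unexpected messages" unless notifier.received == [[:welcome, ["new@example.com"]]]
```

The expectations written against the doubles become a to-do list for the next layer: something must respond to `create`, and something must respond to `welcome`.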

(By the way, that OOPSLA paper is full of good advice on how to use mock objects without shooting yourself in the foot. The whole thing is worth reading.)

Mock objects are for Commands

This is a distinction that is just starting to become clear to me. You may be familiar with the notion of “command/query separation”. This is when you make an effort to divide all methods into either command or query methods. Command methods cause something to change, either internally or externally to the receiver, and return no value. Conversely, query methods change nothing (i.e. they have no side-effects), and are used solely for their return values.
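A minimal sketch of what that separation can look like on a single class, in plain Ruby. `Playlist` and its methods are hypothetical names chosen for the example.

```ruby
# Hypothetical class illustrating command/query separation.
class Playlist
  def initialize
    @tracks = []
  end

  # Command: changes state and returns no meaningful value.
  def add(track)
    @tracks << track
    nil
  end

  # Query: returns a value and changes nothing.
  def size
    @tracks.size
  end

  # Query: returns a copy, so callers cannot mutate internal state through it.
  def titles
    @tracks.dup
  end
end

playlist = Playlist.new
result = playlist.add("So What")
raise "commands return no value" unless result.nil?
raise "queries report state" unless playlist.size == 1

playlist.titles << "sneaky"
raise "queries have no side effects" unless playlist.size == 1
```

A command like `add` is the kind of call a mock expectation describes naturally, while a query like `size` is easier to check by its return value.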

As a general rule, I think mock objects are mainly useful when describing command methods. Their use is suspect for testing query methods. As a corollary to this observation, mocked methods with defined return values are often (though not always) a warning sign. I find most of my test doubles fall into one of two categories: stubs with no behavior, just a return value; and mocks that expect a certain method call, but define no return value (no ‘and_return()’, in RSpec terms).

This distinction has the side effect of encouraging you, as a user of mock objects, towards a design that has a strong delineation between queries and commands. This is widely regarded in OO circles as a Good Thing.

Conclusion

This has been a brain dump of some of my current thinking on mock objects. As such, it’s probably not as clear or well organized as it could be. To sum up: mock objects are helpful, but they are as often useful for drawing attention to problematic tests and code as they are for specifying object interactions. Mock objects are only a meaningful exercise when tests are being used to drive design. And finally, mock objects tend to make the most sense when specifying pure command methods.

I hope you find some of these observations helpful in your own TDD practice.


16 comments

  1. Great post, mock roles not objects is the key to understanding it and agree with the conclusion. The guys also wrote the Growing Object Oriented Software Guided by Tests. I haven’t read it yet but everyone raves about it so it must be good.

  2. Mocks as a vehicle for outside in TDD was a revelation for me. I was firmly in the classicist camp of using mocks only when it was too hard to do it any other way, simply because every time I had seen someone go past that they were doing it wrong, and ended up with a brittle, unmaintainable mess. The big “A-HA!” moment for me was watching the Gary Bernhardt play by play vid on peepcode, where he uses them as a vehicle to drive design. That sparked reading a bunch of other things, now ending up firmly in the mockist BDD camp.

    1. +1 for the Gary Bernhardt Peepcode play by play. Watching that made me realize the awesome value of mocking to drive design

  3. Your section “Mock objects are for Design” reminds me of a rule of thumb I’ve read before (can’t find the source): Always mock Services, never mock Values, sometimes mock Entities

    That’s built on the DDD Services/Values/Entities concept, which I’ve found a very useful way of thinking about separation of concerns.

  4. Avdi, I’m having trouble understanding this: “I decided to change up my spec setup a bit in order to pass in a template object which included the actual Rails tag helpers. Then I included the Capybara spec matchers for making assertions about HTML.” How can you use capybara rspec matchers when render_controls only returns a content_tag? Can you point me to a guide or an example please?

      1. My bad Avdi, sorry.. I wasn’t including the capybara rspec matchers. 

        I thought I was because the failure wasn’t undefined method ‘have_selector?’. It was undefined method ‘has_selector’ for #. I still don’t understand that actually. I created a small gist in case someone runs into the same mistake as me: https://gist.github.com/2356147.

        Thanks for the awesome articles.

  5. “When I use mocks, but don’t employ a ‘tell, don’t ask’ style, it feels weird.” Yes. Mocks encourage a ‘tell, don’t ask’ style. This is a question of style, not of fitness or “smell”. 🙂

    I love your message, but with headings like “Mocks are a test smell” and “Mocks are a code smell” — misleading at best — you make it harder to teach this stuff.

  6. Nice article. I mostly agree. Except for the part where you mentioned that you’re a mockist. It sounds to me like you’re a classicist. They use mocks, only if needed, which sounds like what you’ve described.
