Here’s a question for you. What is the difference between a Spy and a Partial Mock? How about a Strict Mock? Test Double? Stub? What if I code it by hand instead of using a framework? Does that make it a Fake? Actually, I don’t think so! The language around mocks is an absolute minefield and it seems to me like every person means something different when they say Mock. I have seen this debate rumble on over the past 25 years and I have Opinions. We Need to Stop Calling Everything a Mock – let me propose a better way to talk about these things.
This is a companion blog post to this video – with very similar content, for those who prefer to read rather than watch.
‘Mock’ really is the OG of language diffusion in software. The term ‘mock object’ first appeared in a paper published in 2000 and almost immediately it was misunderstood. People who were actually using Stubs started to call them Mocks. It went downhill very quickly after that.
Mocks aren’t Stubs – a debate underway even in 2003
I remember the XP conference in 2003 where the presenter of one session, Charlie Poole, did exactly that. He was describing a class as a mock when it really wasn’t, and Tim Mackinnon was in the audience. Just to explain – Charlie went on to become the main author of the popular testing framework NUnit, and Tim was one of the authors of the original paper about Mock Objects.
Charlie Poole was some way into his presentation when he started to talk about mock objects. Tim Mackinnon raised a hand and objected. ‘That’s not a Mock’, he said. Charlie, to his credit, handled this really graciously: he invited Tim to explain, and actually let him step up on to the stage to do so. Tim grabbed a marker and started sketching on the whiteboard on the stage, explaining why Charlie was wrong, and what a mock actually is. It was really interesting for us in the audience, but it took quite some time, and Charlie didn’t get to say most of what he’d planned.
I remember chatting with other delegates about this incident later in that conference and it seemed to me like Charlie was not alone – most of the XP community had not understood that Mocks aren’t the same as Stubs.
Charlie was pretty miffed about missing half his talk, but at the same time open minded to learn and update his ideas. It’s become a fond memory for me of what conferences are all about – presenting new ideas and understanding different people’s perspectives, all with professional respect.
Meszaros’ attempt to fix it has failed
This confusion in the community generally wasn’t sorted out by this one intervention by Tim Mackinnon. In 2007 Gerard Meszaros tried again. He published his book “xUnit Test Patterns” and introduced a new term, Test Double, with the goal of clearing up the confusion. Mocks and Stubs are both types of Test Double that you use in different situations. Meszaros also explained the terms Spy and Fake, which are also kinds of test double. This effort to clarify the language was ambitious – and ultimately hasn’t worked.
I just read a book from 2019 “Effective Software Testing” by Mauricio Aniche where he tried to claim he was following Meszaros’ definitions, then came up with an entirely suspect definition of “Spy” – that it was backed up by a real object and let you ‘spy’ on interactions with the real one. My reading of Meszaros is quite different. Test Doubles do not have a real instance of the class behind them.
This confusion over Spies is fairly widespread, and he’s not the only one saying this. I call Aniche out because he quotes Meszaros as a source, then apparently disagrees with him without explaining.
Another common confusion I have seen is with Fakes. The claim goes like this: if you write the class by hand rather than using a mocking framework, then it’s a Fake, because then it has an implementation. No! Fakes do have an implementation – but actually so do instances you create with a mocking framework, you just don’t see the code in your project at compile time. It’s not what happens at compile time that is interesting though. The type of a test double is determined by how it behaves at runtime. You can hand-code mocks and spies as well as fakes.
Terminology matters
The authors of every new mocking framework seem to change what these terms mean or refer to one type as another of them. It is still an absolute terminology minefield. That does actually matter. On a practical level when I’m talking to my colleagues about our code, we need words we all understand to mean the same thing. As an industry as a whole – how can we make progress as a technical discipline unless we can agree definitions of things and stick to them over time? When someone publishes a book or an article with an authoritative definition of a term – we should take the time to read it before using that language.
Plus, if I’m at a conference giving a presentation about, say, test design and there’s someone in the audience who literally wrote the original paper, I’d like it if they didn’t have to stand up and disagree with me in public. I try to avoid that.
Meszaros’ terms
Ok I feel like I have to go back and explain what Meszaros says these terms mean. I went back and read these chapters really carefully, and I’m doing my best to explain these terms without referencing any particular testing framework or programming language.
Stub
A Stub is an implementation of an interface used by the system under test. It lets us control indirect inputs by having them return specific hard coded values or exceptions. A Stub doesn’t check any particular arguments are passed or fail the test if some method is not called. A Stub is very stupid, and the simplest kind of test double.
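To make that concrete, here is a minimal sketch of a hand-coded Stub in Python. The names `StubExchangeRates` and `price_in` are made up for illustration, they don't come from Meszaros or from any framework.

```python
class StubExchangeRates:
    """Stub: controls an indirect input by returning a hard-coded value."""

    def rate(self, currency):
        # Ignores the argument and never fails the test on its own.
        return 2.0


def price_in(currency, amount, rates):
    """Hypothetical system under test: converts an amount via a collaborator."""
    return amount * rates.rate(currency)


# The test only checks the direct output of the system under test.
assert price_in("SEK", 10, StubExchangeRates()) == 20.0
```

Notice the stub has no opinion about how it is called. If `price_in` stopped calling `rate` altogether, the stub would not complain; only the assertion on the result might.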
Fake
A Fake is similar to a Stub – it also doesn’t check particular arguments or ever deliberately fail the test. It is a bit more sophisticated than a Stub though – it’s a lightweight implementation of the collaborator it replaces. It behaves the same as the real object, but doesn’t have all its attributes. It might not have good enough performance, stability, persistence and so on. The archetypal Fake is an in-memory database. You wouldn’t use it in production but it’s pretty useful to speed up your tests.
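As a rough sketch, a Fake for a database-backed user store could look like this in Python. `InMemoryUserStore` and `rename_user` are hypothetical names invented for this example, and the "real" store it stands in for is assumed rather than shown.

```python
class InMemoryUserStore:
    """Fake: a working, lightweight stand-in for a database-backed store.

    It really stores and retrieves data, but only in a dict, so it has
    none of the persistence or scalability of the real thing.
    """

    def __init__(self):
        self._users = {}

    def save(self, user_id, name):
        self._users[user_id] = name

    def find(self, user_id):
        return self._users.get(user_id)


def rename_user(store, user_id, new_name):
    """Hypothetical system under test."""
    if store.find(user_id) is None:
        raise KeyError(user_id)
    store.save(user_id, new_name)


store = InMemoryUserStore()
store.save(1, "Ada")
rename_user(store, 1, "Grace")
assert store.find(1) == "Grace"
```

Unlike a Stub, nothing here is hard coded per test. The Fake behaves consistently across calls because it has a small implementation of its own.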
Spy
A Spy can do everything a Stub can do – control indirect inputs, but additionally lets you query the indirect outputs of the system under test. A Spy records all the method calls it gets so that during the ‘assert’ phase of the test you can check that all the right methods were called with the right arguments.
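Here is what a hand-coded Spy might look like in Python. Again this is only a sketch, and `SpyMailer` and `notify_overdue` are names I've invented for the example.

```python
class SpyMailer:
    """Spy: records every call so the test can inspect them afterwards."""

    def __init__(self):
        self.sent = []

    def send(self, to, subject):
        self.sent.append((to, subject))
        return True  # like a Stub, it can also supply an indirect input


def notify_overdue(mailer, customers):
    """Hypothetical system under test: its only output is the calls it makes."""
    for customer in customers:
        mailer.send(customer, "Your invoice is overdue")


# Arrange and act
spy = SpyMailer()
notify_overdue(spy, ["a@example.com", "b@example.com"])

# Assert: the test, not the spy, decides whether the interactions were right.
assert spy.sent == [
    ("a@example.com", "Your invoice is overdue"),
    ("b@example.com", "Your invoice is overdue"),
]
```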
Mock
A Mock then, is not the same as a Stub, or a Fake or a Spy, it is a particular kind of test double. It’s most similar to a Spy, since you can also use it to check interactions.
For Meszaros the most significant difference between a Spy and a Mock was where a test failure is initiated. Otherwise they are used in essentially the same situations. Spies don’t directly fail a test – they record stuff and allow you to check what happened in the assert part of the test. Mocks must be set up with expectations in the ‘arrange’ part and can initiate test failures on their own. A strict mock does so in the ‘act’ part of the test, as soon as one of its expectations is not met.
That’s it. A Mock is basically a trigger-happy Spy.
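A hand-coded strict Mock could be sketched like this in Python. `StrictMockMailer`, `expect_send`, `verify` and `notify_overdue` are all hypothetical names for illustration, and real mocking frameworks differ in the details.

```python
class StrictMockMailer:
    """Strict mock: holds expectations and fails as soon as one is violated."""

    def __init__(self):
        self._expected = []

    def expect_send(self, to, subject):
        # Called in the 'arrange' part of the test.
        self._expected.append((to, subject))

    def send(self, to, subject):
        # Called by the system under test during the 'act' part.
        if not self._expected:
            raise AssertionError(f"unexpected call: send({to!r}, {subject!r})")
        expected = self._expected.pop(0)
        if (to, subject) != expected:
            raise AssertionError(f"expected send{expected!r}, got {(to, subject)!r}")

    def verify(self):
        # Catches expected calls that never happened.
        if self._expected:
            raise AssertionError(f"missing calls: {self._expected!r}")


def notify_overdue(mailer, customers):
    """Hypothetical system under test."""
    for customer in customers:
        mailer.send(customer, "Your invoice is overdue")


# Arrange: expectations come first.
mock = StrictMockMailer()
mock.expect_send("a@example.com", "Your invoice is overdue")

# Act: a wrong or extra call would raise right here, inside the mock.
notify_overdue(mock, ["a@example.com"])

# Assert: only needed to detect calls that were expected but never made.
mock.verify()
```

The recording and the checking have moved inside the double, which is why the failure can happen in the middle of the ‘act’ step rather than at the end of the test.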
Meszaros has Dummies as another kind of test double but they are largely uninteresting. He does not mention partial mocks backed by a real class, capture-replay mocks, self-initializing fakes, approval mocks or auto generated contract stubs.
Test frameworks today
It’s clear to me that things have only got worse since Meszaros attempted to clean up the terminology. There is a profusion of mocking frameworks using all these terms and more in ways fairly inconceivable when he wrote this book. What’s happened in practice is that ‘Mock’ means any kind of test double. I don’t think it’s usually worth correcting people and saying ‘Mocks aren’t Stubs’ these days. We lost that battle.
My suggestion
What is interesting to me about all these kinds of test double is how you use them to help you write better tests and to design the system under test. What role does this object play? That is more interesting than the way I created it or what my mocking framework calls it.
Here is my suggestion for how to fix the semantic diffusion. Call everything a Mock, or a Fake, or a Test Double, I don’t really mind. And back that up with a verb for how you’re using it.
- I’m using this mock to stub these values.
- I’m using that mock to spy on this important interaction.
- It’s a mock I generated from this contract which I’m using to stub these interactions.
- It’s a mock that is helping me to design this interaction with a collaborator.
Could that language catch on perhaps?
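To show what I mean, here is a small sketch using Python's standard `unittest.mock` library, where the same `Mock` class plays two different roles in one test. The `checkout` function and its collaborators are hypothetical, invented just for this example.

```python
from unittest.mock import Mock


def checkout(cart_total, rates, mailer):
    """Hypothetical system under test with two collaborators."""
    total = cart_total * rates.rate("SEK")
    mailer.send("customer@example.com", f"Total: {total}")
    return total


# I'm using this mock to stub these values.
rates = Mock()
rates.rate.return_value = 2.0

# I'm using that mock to spy on this important interaction.
mailer = Mock()

assert checkout(10, rates, mailer) == 20.0
mailer.send.assert_called_once_with("customer@example.com", "Total: 20.0")
```

The framework calls both objects a `Mock`. The comments with the verbs are what tell a reader which role each one plays.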
Mocks for design
I might have surprised you by saying that one of the verbs for describing a role a Mock can play is design. It’s one of the things that really got lost in this whole semantic drift for the word Mock.
The original Mock Objects paper was clear – the goal was better object oriented design. Not particularly easier test design. The authors were using the OO design principle ‘tell, don’t ask’ so they wanted their objects to encapsulate their data and the operations on them. That meant no extra ‘get’ methods added for the tests, so they couldn’t easily check the object state. Instead, they used a mock to check the indirect outputs – the exit points where the object interacts with its collaborators. Mocks are originally a design tool.
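Here is a small sketch of that idea in Python, using the standard `unittest.mock` library. The `Thermostat` class and its heater collaborator are hypothetical and are not taken from the original paper.

```python
from unittest.mock import Mock


class Thermostat:
    """'Tell, don't ask': no getters, it just tells its collaborator what to do."""

    def __init__(self, heater, target):
        self._heater = heater
        self._target = target

    def temperature_changed(self, degrees):
        if degrees < self._target:
            self._heater.turn_on()
        else:
            self._heater.turn_off()


# The heater interface is sketched in the test before any real Heater exists.
# spec limits the mock to exactly these two methods.
heater = Mock(spec=["turn_on", "turn_off"])

Thermostat(heater, target=20).temperature_changed(17)

# There is no state to query, so the test checks the exit point instead.
heater.turn_on.assert_called_once_with()
heater.turn_off.assert_not_called()
```

Writing the test forces a decision about what the thermostat needs from a heater, which is the design conversation the mock is there to support.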
Conclusions
Ok, as I said at the start, We Need to Stop Calling Everything a Mock. Although, I’m not going to encourage you to use Meszaros’ language strictly. No one else does! It’s good to know what ‘test double’ means but unfortunately, that solution didn’t work. I want you to start talking about what these objects do for you instead of what they are called. Use verbs. Explain the value you get from this object – does it let you stub something, or record something, or expect something, or design something?
Happy Coding!



