Posts tagged ‘TDD’

I’ve been favouring an Approval Testing approach for many years now, since I find it pretty useful in many situations, particularly for acceptance tests. Not many people I meet know the term though, and even fewer know how to use the technique. Recently I’ve put together some small exercises – code katas – to help people to learn about it. I’ll be going through them at a couple of upcoming conference workshops, but for all you people who won’t be there in person, I’m publishing them on github as well.

I’ve got three katas set up now, Minesweeper, Yatzy and GildedRose. If you’ve done any of these katas before, you’ll probably have been using ordinary unit testing techniques. Hopefully by doing them again, with Approval Testing, you’ll learn a little about what’s different about this technique, and how it could be useful.

Before you can do the katas, you’ll need to install an approval testing tool. I’m one of the developers of TextTest, so that’s the tool I’ve set up right now. Below are some useful commands for a debian/ubuntu machine for installing it.

I’m still developing these exercises, and would like feedback about what you think of them. For example I have Python versions for all three, but only one has a Java version as yet. Do people want more translations? Do let me know how you get on, and what you think!

Installation instructions

You will need to have Python 2, and TextTest. (Unfortunately TextTest uses a GUI library that doesn’t support Python 3). For example:

$ sudo apt-get install python-pip
$ sudo pip install texttest

For more detailed instructions, and for other platforms see the texttest installation docs. For more general documentation, see the texttest website.

You need to have an editor and a diff tool configured for texttest to use. I recommend sublime text and meld. Install them like this:

$ sudo add-apt-repository ppa:webupd8team/sublime-text-3
$ sudo apt-get update
$ sudo apt-get install sublime-text-installer
$ sudo apt-get install meld

Then you need to configure texttest to use them:

$ cd
$ mkdir .texttest
$ touch .texttest/config
$ subl .texttest/config

Enter the following in that file, and save:


For convenience, I also like to create an alias ‘tt’ for starting TextTest for these exercises. Change directory to one of the exercise repositories, then a ‘tt’ command should start the TextTest GUI and show the tests for that exercise. Define such an alias like this:

alias tt='texttest -d python -c .'

Two of the exercises start with a small test suite for you to build on. There should be instructions in the README file of each respective exercise, to help you to get going. If you really can’t work out what to do, have a look at the sample solutions and see if that helps. These are also on github: Minesweeper-sample-solution, Yatzy-sample-solution, GildedRose-sample-solution

I’ve been interested for a while in the relationship between TDD and good design for a while, and the  SOLID principles of Object Oriented Design in particular. I’ve got this set of 4 “Racing Car” exercises that I originally got from Luca Minudel, that I’ve done in coding dojos with lots of different groups. If you’ve never done them, I do recommend getting your editor out and having a go, at least at the first one. I think you get a much better understanding of the SOLID principles when you both know the theory, and have experienced them in actual code.

I find it interesting that in the starting code for each of the four Katas there are design flaws that make it awkward to write unit tests for the code. You can directly point to violations of one or more of the SOLID principles. In particular for the Dependency Inversion Principle, it seems to me there is a very direct link with testability. If you have a fixed dependency to a concrete class, that is always going to be harder to isolate for a unit test, and the Tyre Pressure exercise shows this quite clearly.

What bothers me about the 4 original exercises is that there are actually 5 SOLID principles, and none of them really has a problem with the Liskov Substitution Principle. So I have designed a new exercise! It’s called “Leaderboard” and I’ve put it in the same git repository as the other four.

I tried it out last week in a coding dojo with my colleagues at Pagero, and it seemed to work pretty well. The idea is that the Liskov principle violation means you can’t propely test the Leaderboard class with test data that only uses the base class “Driver”, you have to add tests using a “SelfDrivingCar”. (Ok, I confess, I’ve taken some liberties with what’s likely in formula 1 racing!) Liskov says that your client code (ie Leaderboard) shouldn’t need to know if it has been given a base class or a subclass, they should be totally substitutable. So again, I’m finding a link between testability and good design.

Currently the exercise is only available in Scala, Python and Java, so I’m very open to pull requests for translations into other programming languages. Do add a comment here or on github if you try my new Kata.

Recently I became intrigued with something Seb Rose said on his blog about ‘recycling’ tests. He talks about first producing a test for a ‘low fidelity’ version of the solution, and refining it as you learn better what the solution should look like. In a follow-up post he deals with some criticisms that other posters had of the technique, but actually seems to agree with Alistair Cockburn, that it’s probably not important enough a technique to need a name. I disagree, it’s a technique I use a lot, although most often when using an approval testing approach. I prefer to call it simply iterative development. A low fidelity version of the output that is gradually improved until the customer/product owner says “that’s what I want” is iterative development. It’s a very natural fit with approval testing – once the output is good enough to be approved, you check it in as a regression test that checks it never changes. It’s also a very natural fit for a problem where the solution is fundamentally visual, like printing a diamond. I also find it very helpful when the customer hasn’t exactly decided what they want. In this kata, it’s not such an issue, but in general, quickly putting out a low-fidelity version of what you think they want and then having a discussion about how to proceed can save you a lot of trouble.

The other posters seemed to be advocating a TDD approach where you find ‘universal truths’ about the problem and encode them in tests, so you never have to go back and revisit tests that you made pass earlier. In order to take small steps, you have to break down the problem into small pieces. Once you have identified a piece of the problem and solved it, it should stay solved as you carry on to the next piece. That seems to be what I would call ‘incremental’ development.

There’s a classic explaination of the difference between iterative and incremental that Jeff Patton came up with a few years ago using the Mona Lisa painting. It’s a good explaination, but I find experiencing abstract concepts like this in an actual coding problem can make a world of difference to how well you can reason about and apply them. So I thought it would be interesting to look at these two approaches to TDD using the Diamond Kata.

I have a regular coding dojo with my team these days, so a few weeks ago, I explained my thinking about incremental and iterative, showed them Jeff Patton’s picture, and asked them to do the kata one way or the other so we could compare. I probably didn’t explain it very well, because the discussion afterwards was quite inconclusive, and looking at their code, I didn’t think anyone had really managed to exclusively work one way or the other. So I decided to try to force them into it, by preparing the test cases in advance.

I came up with some starting code for the exercise, available here. I have two sets of unit tests, the first with a standard incremental approach, where you never delete any test cases. The second gets you to ‘recycle’ tests, and work more iteratively towards the final solution. In both cases, you are led through the problem in small steps. The first and last tests are the same, the difference is the route you take in between.

When I tried this exercise with my team, it went a lot better. I randomly assigned half the pairs to use the ‘iterative’ tests, and the rest to use ‘incremental’ tests. Then after about 45-55 minutes, I had them start over using the other tests. After another 45 minutes or so I stopped them and we had a group discussion comparing the approaches. I asked the ‘suggested questions for the retrospective‘ I’d prepared, and it seemed to work. Having test-driven the solution both ways, people could intelligently discuss the pros and cons of each approach, and reason about which situations might suit one or the other.

As Seb said, ‘recycling tests’ is a tool in your developer toolbox, and doing this kata might help you understand how to best use that tool. I’d love to hear from you if you try this excercise in your coding dojo, do leave a comment.

I forget exactly when, but I think it was 2008 or 2009. Anyway, I was at a software conference, and I was chatting with a developer after one of the sessions about cool new technologies and stuff. I don’t remember what hot new thing it was we talked about, all I remember, is the shoes he was wearing!

credit: flickr, Steve Hodgson

This is rather an unusual style of running shoe. At the time, I’d never seen any like this before, and I was intrigued. It turns out that this developer I was talking to was, like me, also something of a serial early adopter, it’s just that he not only picked up shiny new programming tools and technologies.

At the time, I was running in a pair of shoes with thick heel padding the shop assistant had assured me would correct my bad posture and foot “pronation”. This guy’s “five finger” shoes had none of that, in fact quite the opposite. I was looking at disruptive running technology.

The conversation quickly switched from the latest programming tools and frameworks, as this guy explained the essential benefits of his shoes:

  • Lightweight
  • Your toes can spread out, giving better push-off from the ground
  • Thin sole – you adapt your stride to the surface because you can feel it
  • No heel padding, means you strike the ground with the whole foot simultaneously.

And of course, most importantly for a technology enthusiast:

  • People stare at you feet!

Following this conversation, a little googling about and watching the odd video by a “running style expert”, I became convinced. To be honest, it wasn’t much contest – shiny new technology, being in with the cool kids – I bought some new shoes, with toes and everything!

It took me several weeks to get used to them. You have to start with short distances, build up some new muscles in your foot, and learn to strike the ground with the whole foot at once. In my old shoes, I had a tendency to strike heel-first, because of the huge wad of padding on the sole, but in these minimal shoes, that just hurt. It was useful feedback, the whole-foot-at-once gait is supposed to be better for your knees.

After a few weeks of running shorter distances, slower than before, I gradually found my stride, and really started to enjoy running my usual 7km circuit of the local forest, in my eye-catching five-fingered shoes.
Unfortunately it didn’t last. Maybe two or three months later something happened. I think the technical term for it is Swedish Autumn. It turns out that forest tracks gain a surprising number of cold, muddy puddles at that time of year! Shoes with a very thin sole that isolate and surround each individual toe in waterlogged fabric, mean absolutely freezing feet 🙁

So I’m back on the internet, looking for new, shiny technology to fix this problem, and of course, I buy some new shoes. This time I got a pair of minimalist shoes in waterproof goretex, with basically all the features of my old shoes, minus the individual toes.

I was back out on the forest track, faster than ever, with dry, comfortable feet – win! The only problem was, people were no longer staring at my eye-catching toes. So you can’t have it all!

So this is normally a blog about programming. What’s going on?

Test Driven Development as a Disruptive Technology

I’ve been thinking about this, and it seems to me that as with running shoes that have toes, TDD is something of a disruptive technology. Just as I haven’t seen the majority of runners switch to shoes with toes, I also havn’t seen the majority of developers using TDD yet. Neither seem to have crossed Geoffrey Moore’s “chasm”.

Geoffrey Moore's technology adoption distribution showing the chasm

Lots of developers write unit tests, but I think that’s slightly different. I’m talking about a TDD where developers primarily use tests to inform and direct design decisions, and rely on them for minute-by-minute feedback as they work. In 2009, Kent Beck made an observation in his blog that “the data suggests that there are at most a few thousand Java programmers actively practicing TDD”. I don’t think the situation is radically different today.

So can we learn anything about TDD from the story about running shoes? A couple of points I find relevant:

  • Early adopters will try a new technology based on really very flimsy evidence, and will persevere with it, even if it slows them down in the short term.
  • Early adopters like to look cool and stick out.

You may think that last point is just vanity, but actually, being a talking point helps drive adoption, but primarily amongst other, similarly minded technology geeks.

I remember a while back I was at work, writing some code, when a guy from another team came over to ask me something. He was about to leave, when he did a double-take and stared at my screen for a moment. “Are you doing TDD? I’ve never seen anyone actually do that in production code. Do you mind if I watch?”. So, you see, eye-catching shiny new technology, and I’m one of the cool kids, about to be emulated by the guy in the next team. 🙂

The other part of this story, is of course the compromise I made when the cool technology met the reality of a muddy Swedish forest track. The toes went, but the shoe I ended up with is still radically different from the one I had before. I think that for TDD to reach the mainstream, it may need to become a little less extreme, a little more practical – but without losing the essential benefits.

What are the essential benefits of TDD? Well, I would say something like this:

  • Design: useful feedback, pushing you away from long methods and tightly-coupled classes, because they’re hard to test.
  • Refactoring: quickly detecting regression when you make a mistake
  • Productivity: helping you to manage complexity and work incrementally

So is it possible to get these things in another way? Without driving development minute-by-minute with tests? Well, that’s probably the subject of another blog post…

You might be interested to watch a video of my recent keynote speech at Europython, where I told this story.

I was recently at the Software Craftsmanship Conference at Bletchley Park in the UK. This is a one-day conference for software developers, attended by around 150 programmers. All proceeds from the event go to support Bletchley Park, which is of historical interest to programmers in particular – the site where Alan Turing and others cracked the enigma code in the 2nd world war. It was the fifth time this conference has been run, and the first time I attended. This is a short experience report.

In the morning I ran a workshop titled “Outside-In, with or without Mocks?“. We were about 50 people in the Ballroom in the Mansion, a very grand room, and it was really great to see so many people working in pairs at laptops, puzzling over some code and tests and how to do Test Driven Development. We were looking at a code kata I’ve designed called “Train Reservation“. It’s in no way a beginner exercise, and the crowd at Bletchley seemed to get on with it rather well on the whole. I’m just sorry I didn’t get round to talk to each pair very often, with 24 pairs I only had a couple of conversations with each during the 2 hour session!

I set up the exercise more or less to force people to use some kind of mock, fake or stub to replace the Booking Reference Service and the Train Data Service, because I am interested in how different people use these. I’ve observed that some programmers avoid using test doubles whenever possible, while others use them frequently. I’ve also observed that some people prefer to work outside-in, starting with a guiding test, while others prefer to start with the business rules at the heart of the problem and work outwards from there. At this particular workshop, there were all sorts of approaches being used. Some started with the guiding test and stubbed the services. Others started with the business logic around the seat selection rules. Different approaches, as I had hoped! Overall I feel encouraged that this exercise is a useful one, and people seemed to get on better with it than the last time I ran it, at XP2013. It’s till rather too big of a problem to tackle in a half day workshop though. I’ll be updating it some more before I run it again, although I don’t have any fixed plans for when that will be yet.

In the afternoon, I went to a session by Ivan Moore and Mike Hill, “Inheritance to Composition“. They gave us a demo of this particular refactoring using a very simple codebase, before launching us into a much more complex one – Fitnesse (starting from the branch “revised-ResponderFactory”). The idea was to take some classes that were using Inheritance – specifically the Template Method pattern – and convert them to instead use Composition – specifically the Strategy pattern. They also helpfully provided us with a sheet of instructions – 6 steps to complete the refactoring with minimal risk and code breakage.

My pair and I got on fairly well with the refactoring, and by the end of the session we were on step 5 with the goal in sight. The experience was of using Eclipse’s refactoring tools extensively, and relying a great deal on the compiler. The tests we had to lean on took a minute and a half to run, and actually, the tests for the classes we were working on were more mini-integration tests than unit tests as such. It meant there were relatively few updates to the tests as we did the refactoring, but the feedback loop was slow. I thought that was really interesting, and was wondering how the experience of the refactoring would change in a language like Python. There you don’t have a compiler, or very much help from refactoring tools.

So after the workshop, I set about trying to construct a similar problem in Python. Perhaps understandably, I didn’t want to translate the whole of Fitnesse to Python, (!), so I tried to re-write only the elements of it essential to this exercise. You can have a look at what I’ve come up with in my new repo “WikiSearchKata“. I’m still working on preparing this properly as an exercise, (the instructions are still rather thin), but I plan to try it out at a GothPy meeting sometime soon.

After the conference sessions had ended, we were treated to a guided tour of the National Museum of Computing which was for me, the highlight of the day! Our enthusiastic guide showed us all sorts of ancient computers and storage devices and punch cards… a few I recognized from my childhood. My dad used to bring home old punch cards and my mum used to write her shopping lists on them when she went to the supermarket. They had a 48K ZX spectrum with rubber keys – just the same as the one I wrote my first program on! They had a CRAY supercomputer similar to the one I remember seeing once when I visited my dad’s work as a child. It’s a similar size to (the outside view of) a Tardis, with a big red button on the front. I don’t think we found out what the red button does, but the guide did say we probably have more computing power in the smartphone in our pocket! I found the changes in storage capacity actually even more impressive. They had these washing-machine sized boxes and dinner-plate sized metal disks that together made a hard drive. I think it held something like 4K.

The highlight of the tour was the WITCH computer – the oldest working computer in the world. It was brilliant! You could actually see what it was doing while it read in a paper tape punched with holes – the program – and loaded values into registries and did calculations. It made this fantastic whirring noise as it ran, and has all these little whizzy flashing lights. It works in decimal rather than binary, so each number is represented by a little “dekatron” – a glass tube with a red light inside, that moves between positions 0-9 in a circle. So you can read which number is in the registry by looking at the position of each light in the array. They also had this little button you could press to make it step through the program one instruction at a time. I got to press it, and single-step a computer from 1951!

Compared with other conferences I’ve been too, this one was rather short, just one day, and with rather long sessions – half or whole day. It was hard work coding and facilitating all day, but in general very interesting people and coding exercises. A second day would have made it more worthwhile my making the trip. In any case, my thanks to Jon Dickinson for organizing it.