Posts tagged ‘refactoring’

Note: this article was first published on Praqma’s blog

Writing tests for ‘Theatrical Players’

When I read Fowler’s new ‘Refactoring’ book I felt sure the example from the first chapter would make a good Code Kata. However, he didn’t include the code for the test cases. I can fix that!

Martin Fowler recently published a new edition of his classic book ‘Refactoring’. The first chapter is a worked example of how he would go about refactoring a small piece of code that he needs to update with some new functionality.

This is a really common scenario faced by software developers daily. It’s impossible to anticipate every future change and enhancement, so the design of the code will often not be extensible in the direction you need to take it. Refactoring the design in order to make a proposed update easier is an essential skill.

I like to practice these kinds of skills using small exercises called ‘Code Katas’. I’ve got a whole collection of them in my book ‘The Coding Dojo Handbook’, and more on my GitHub page. When I saw Fowler’s example I felt it would make a really good kata. The particular refactoring techniques he demonstrates are techniques that many developers would benefit from learning and practicing.

I asked Fowler for permission to use his code, which he kindly granted, then I set about transforming this example into a kata.

Creating a new Code Kata

This is Fowler’s example JavaScript code, which produces a statement for a customer. The new functionality that’s needed is to produce the same information but formatted as HTML.

statement.js

function statement (invoice, plays) {
    let totalAmount = 0;
    let volumeCredits = 0;
    let result = `Statement for ${invoice.customer}\n`;
    const format = new Intl.NumberFormat("en-US",
        { style: "currency", currency: "USD",
            minimumFractionDigits: 2 }).format;

    for (let perf of invoice.performances) {
        const play = plays[perf.playID];
        let thisAmount = 0;
        switch (play.type) {
            case "tragedy":
                thisAmount = 40000;
                if (perf.audience > 30) {
                    thisAmount += 1000 * (perf.audience - 30);
                }
                break;
            case "comedy":
                thisAmount = 30000;
                if (perf.audience > 20) {
                    thisAmount += 10000 + 500 * (perf.audience - 20);
                }
                thisAmount += 300 * perf.audience;
                break;
            default:
                throw new Error(`unknown type: ${play.type}`);
        }
        // add volume credits
        volumeCredits += Math.max(perf.audience - 30, 0);
        // add extra credit for every five comedy attendees
        if ("comedy" === play.type) volumeCredits += Math.floor(perf.audience / 5);
        // print line for this order
        result += ` ${play.name}: ${format(thisAmount/100)} (${perf.audience} seats)\n`;
        totalAmount += thisAmount;
    }
    result += `Amount owed is ${format(totalAmount/100)}\n`;
    result += `You earned ${volumeCredits} credits\n`;
    return result;
}

As you can see, the logic for formatting the statement is all mixed up with the calculation logic, so as it stands it’s not straightforward to add the HTML formatting feature. What’s needed is to change the design and detangle the logic, via refactoring.

Fowler mentions that the first step in refactoring is always the same – to ensure you have a solid set of tests for that section of code.

However, he did not include the test code he used in the book. (It’s a book primarily about refactoring, and I guess including test code would have taken attention away from that topic). To make this example into a Kata, I wanted to include some tests.

The testing approach I favour in this kind of situation is approval testing. The code exists already and it works, so the easiest way to add a regression test is to find some test data, exercise the code, and approve the result.

This is different from a test-driven development approach where you define the test case before the code exists. (You can use approval testing in that kind of scenario too, but that’s a topic for another article). I usually find it much quicker and easier to add an approval test to existing code than a classic assertion-based test case.
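To make the mechanics concrete, here is a minimal sketch of an approval check written against plain Node.js, without any test framework. The helper name `verify`, the `.approved.txt` / `.received.txt` file naming and the temporary directory are assumptions made for illustration; they are not part of Fowler's code or of the kata.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: compares `received` with a stored, approved value.
// On a mismatch (or when nothing is approved yet) it writes a .received
// file so that a human can inspect it and decide whether to approve it.
function verify(dir, name, received) {
    const approvedFile = path.join(dir, `${name}.approved.txt`);
    const receivedFile = path.join(dir, `${name}.received.txt`);
    if (fs.existsSync(approvedFile) &&
        fs.readFileSync(approvedFile, 'utf8') === received) {
        return 'approved';
    }
    fs.writeFileSync(receivedFile, received);
    return 'needs review';
}

// Demo in a throwaway directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'approval-'));
const output = 'Statement for BigCo\n';

// First run: nothing has been approved yet.
const firstRun = verify(dir, 'statement', output);

// A human inspects the received file and approves it by renaming it.
fs.renameSync(path.join(dir, 'statement.received.txt'),
              path.join(dir, 'statement.approved.txt'));

// Second run: the output matches the approved value.
const secondRun = verify(dir, 'statement', output);

// Third run: the behaviour changed, so the check asks for review again.
const thirdRun = verify(dir, 'statement', 'Statement for SmallCo\n');
```

The step that matters is the rename in the middle: a person looks at the output and decides it is good enough to test against.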

Creating a first approval test

Fowler supplies us with some test data – a sample invoice:

invoice.json
{
  "customer": "BigCo",
  "performances": [
    {
      "playID": "hamlet",
      "audience": 55
    },
    {
      "playID": "as-like",
      "audience": 35
    },
    {
      "playID": "othello",
      "audience": 40
    }
  ]
}

And a list of plays:

plays.json
{
  "hamlet": {"name": "Hamlet", "type": "tragedy"},
  "as-like": {"name": "As You Like It", "type": "comedy"},
  "othello": {"name": "Othello", "type": "tragedy"}
}

The function that we want to test, statement, returns the information to send to the customer, formatted as a plain text string. This statement string is perfect to use as the approved value in the approval test.

This is the test case I designed. It uses the Jest testing framework:

statement.test.js
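const fs = require('fs');
// Assumption: statement.js exports the function and lives in ../src;
// adjust the path to match your own project layout.
const statement = require('../src/statement');

test('example statement', () => {
   const invoice = JSON.parse(fs.readFileSync('test/invoice.json', 'utf8'));
   const plays = JSON.parse(fs.readFileSync('test/plays.json', 'utf8'));
   const statement_string = statement(invoice, plays);
   expect(statement_string).toMatchSnapshot();
});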

First, I read both data files and parse them into JavaScript objects. Then I call the function we want to test, ‘statement’, and store the result in the ‘statement_string’ constant. The last line of the test checks that this result matches the approved value: it compares the actual value against a stored value which I checked and approved earlier.

This is kept in another file:

__snapshots__/statement.test.js.snap
exports[`example statement 1`] = `
"Statement for BigCo
 Hamlet: $650.00 (55 seats)
 As You Like It: $580.00 (35 seats)
 Othello: $500.00 (40 seats)
Amount owed is $1,730.00
You earned 47 credits
"
`;

Is it called Snapshot or Approval testing?

The Jest testing framework calls the approved value a ‘snapshot’. I prefer to talk about ‘approval’ testing and comparing with an ‘approved’ value. I think these words get to the heart of what you’re doing – deciding what is good enough to test against.

‘Snapshot’ as a word just implies it will be updated again soon. I like to emphasize the human agency involved in inspecting and deciding whether to approve a new value or not.

Is this test good enough to support refactoring?

This test goes a long way to giving us the confidence we need to start refactoring the ‘statement’ code. It generates a full statement with test data that seems realistic enough for this context.

I think it’s a good idea to do a little more analysis of whether this test is sufficient to support the refactoring we want to do. One way to do that is to run the test and measure code coverage. Lines of code that are not covered could be refactored away, or have bugs inserted into them, without the tests failing.

The coverage data shows only one line of code which is not covered by this test:

Line 28 in statement.js:
throw new Error(`unknown type: ${play.type}`);

Our test case is pretty good, to only have one line uncovered. The test data supplied by Fowler includes different kinds of play that already exercise several paths through the code.

This code path is different, however. If you end up on this line then the function aborts and doesn’t produce a statement. We can’t tweak the data in our existing test to make it also execute this line and still produce a statement.

What we need is an additional test, one that uses an unsupported play type. I invented some new data that continues the Theatrical theme:

new_plays.json
{
 "henry-v": {"name": "Henry V", "type": "history"},
 "as-like": {"name": "As You Like It", "type": "pastoral"}
}

invoice_new_plays.json
{
  "customer": "BigCoII",
  "performances": [
    {
      "playID": "henry-v",
      "audience": 53
    },
    {
      "playID": "as-like",
      "audience": 55
    }
  ]
}

This is the test case that uses them:

statement.test.js
test('statement with new play types', () => {
    const invoice = JSON.parse(fs.readFileSync('test/invoice_new_plays.json', 'utf8'));
    const plays = JSON.parse(fs.readFileSync('test/new_plays.json', 'utf8'));
    expect(() => {statement(invoice, plays)}).toThrow(/unknown type/);
});

This test is not an approval test; it’s a normal assertion-based test. It checks that the error is thrown and the message contains the string ‘unknown type’.

I could have instead used an approval test to check the error message. In this case I chose not to do that. The string to be checked is quite short, and there is little advantage in storing it in a separate file away from the test code.

Confidence in the tests

With that test added, we’re more confident we can use this test suite to support refactoring.

We have 100% coverage now, after all. Unfortunately, that is no guarantee that our tests are perfect, or even reasonably good. One thing I like to do is a spot of (manual) mutation testing to assess how good my tests are.

I edit the code to insert a bug, then run my tests. If they fail, then that’s a good sign. I might try adding bugs in several different places to increase my confidence.

When I try it on this code, I find any change that affects the way the statement is presented gives me a failing test. I can also easily get a nice diff showing exactly what I broke.
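As an illustration of one such manual mutation, the sketch below copies the volume credit rule out of the statement function and places a hand-made "mutant" next to it. The function names and the `total` helper are assumptions for the example; in practice you would edit statement.js directly and rerun the Jest tests.

```javascript
// Volume credit rule, copied from the statement function.
function volumeCredits(audience, type) {
    let credits = Math.max(audience - 30, 0);
    if ("comedy" === type) credits += Math.floor(audience / 5);
    return credits;
}

// Hand-made mutant: the comedy bonus divisor is changed from 5 to 10.
function volumeCreditsMutant(audience, type) {
    let credits = Math.max(audience - 30, 0);
    if ("comedy" === type) credits += Math.floor(audience / 10);
    return credits;
}

// Performances from the sample invoice, with play types already looked up.
const performances = [
    { audience: 55, type: "tragedy" },
    { audience: 35, type: "comedy" },
    { audience: 40, type: "tragedy" },
];

// Sum the credits over all performances using the given rule.
const total = (rule) =>
    performances.reduce((sum, p) => sum + rule(p.audience, p.type), 0);

const original = total(volumeCredits);       // 47, as in the approved statement
const mutated = total(volumeCreditsMutant);  // 43
const mutantKilled = original !== mutated;   // true: the test data notices the bug
```

Because the credits total is printed in the statement, this mutant changes the approved text and the approval test fails.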

I’m now feeling pretty confident these tests will be good enough to support refactoring.

This is a good starting point for the kata. From here you can practice all the refactorings demonstrated by Fowler in his book, with a safety net to alert you if you make a mistake.

When you come to implement the HTML rendering feature, it should be straightforward to add a new test for it too.

Adding more languages to your tests

In the book, the example language is JavaScript, which is widely used and understood. I often work with teams who also use languages like Java, C# and Python, and the refactoring techniques are just as relevant in those languages. Once I had this JavaScript version working, I added a translation or two. One of the reasons I keep my exercises publicly available on GitHub is so that people can send me pull requests.

If you’d like to do this exercise in a programming language that’s not supported yet, you can fix that for everyone 🙂 I particularly like getting pull requests with translations of my exercises.

The published code kata

The starting point for this new exercise ‘Theatrical-Players-Refactoring-Kata’ is now published and available for download.

The first chapter of ‘Refactoring’ containing a worked solution to this problem is available as a free sample. You have every opportunity to practice and learn these testing and refactoring techniques now.

Happy coding!

I was recently at the Software Craftsmanship Conference at Bletchley Park in the UK. This is a one-day conference for software developers, attended by around 150 programmers. All proceeds from the event go to support Bletchley Park, which is of historical interest to programmers in particular – the site where Alan Turing and others cracked the Enigma code in the Second World War. It was the fifth time this conference has been run, and the first time I attended. This is a short experience report.

In the morning I ran a workshop titled “Outside-In, with or without Mocks?”. We were about 50 people in the Ballroom in the Mansion, a very grand room, and it was really great to see so many people working in pairs at laptops, puzzling over some code and tests and how to do Test Driven Development. We were looking at a code kata I’ve designed called “Train Reservation”. It’s in no way a beginner exercise, and the crowd at Bletchley seemed to get on with it rather well on the whole. I’m just sorry I didn’t get round to talking to each pair very often. With 24 pairs, I only had a couple of conversations with each during the 2 hour session!

I set up the exercise more or less to force people to use some kind of mock, fake or stub to replace the Booking Reference Service and the Train Data Service, because I am interested in how different people use these. I’ve observed that some programmers avoid using test doubles whenever possible, while others use them frequently. I’ve also observed that some people prefer to work outside-in, starting with a guiding test, while others prefer to start with the business rules at the heart of the problem and work outwards from there. At this particular workshop, there were all sorts of approaches being used. Some started with the guiding test and stubbed the services. Others started with the business logic around the seat selection rules. Different approaches, as I had hoped! Overall I feel encouraged that this exercise is a useful one, and people seemed to get on better with it than the last time I ran it, at XP2013. It’s still rather too big a problem to tackle in a half-day workshop though. I’ll be updating it some more before I run it again, although I don’t have any fixed plans for when that will be yet.
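For readers who haven't seen the "guiding test with stubbed services" approach, here is a minimal sketch. Everything in it is hypothetical: the `TicketOffice` class, the `next` and `freeSeats` methods and the shape of the reservation are invented for illustration and are not the kata's actual interfaces. Only the two service names come from the exercise.

```javascript
// Hypothetical class under test. Its two collaborators are injected,
// so a test can replace them with stubs.
class TicketOffice {
    constructor(bookingReferenceService, trainDataService) {
        this.bookingReferenceService = bookingReferenceService;
        this.trainDataService = trainDataService;
    }

    reserve(trainId, seatCount) {
        const free = this.trainDataService.freeSeats(trainId);
        if (free.length < seatCount) {
            // Not enough seats: return an empty reservation.
            return { trainId, bookingReference: '', seats: [] };
        }
        return {
            trainId,
            bookingReference: this.bookingReferenceService.next(),
            seats: free.slice(0, seatCount),
        };
    }
}

// Hand-rolled stubs returning canned answers instead of calling real services.
const stubBookingReferences = { next: () => 'ref-001' };
const stubTrainData = { freeSeats: () => ['1A', '2A', '3A'] };

const office = new TicketOffice(stubBookingReferences, stubTrainData);
const reservation = office.reserve('express_2000', 2);
const tooMany = office.reserve('express_2000', 4);
```

Someone working inside-out would instead start from the seat selection logic and only introduce the services, real or stubbed, once those rules were in place.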

In the afternoon, I went to a session by Ivan Moore and Mike Hill, “Inheritance to Composition”. They gave us a demo of this particular refactoring using a very simple codebase, before launching us into a much more complex one – Fitnesse (starting from the branch “revised-ResponderFactory”). The idea was to take some classes that were using Inheritance – specifically the Template Method pattern – and convert them to instead use Composition – specifically the Strategy pattern. They also helpfully provided us with a sheet of instructions – 6 steps to complete the refactoring with minimal risk and code breakage.
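A minimal sketch of the before and after shapes of that refactoring is below. The class names are hypothetical and the code is not taken from Fitnesse or from the workshop's instruction sheet; it only shows the general move from Template Method to Strategy.

```javascript
// Before: Template Method. The base class fixes the overall algorithm
// and subclasses override one step of it.
class Responder {
    respond(request) {
        return `<html>${this.body(request)}</html>`;
    }
    body(request) {
        throw new Error('subclasses must implement body');
    }
}

class SearchResponder extends Responder {
    body(request) {
        return `results for ${request}`;
    }
}

// After: Strategy. The varying step is a separate object that is passed in,
// so the responder no longer needs subclasses.
class ComposedResponder {
    constructor(bodyStrategy) {
        this.bodyStrategy = bodyStrategy;
    }
    respond(request) {
        return `<html>${this.bodyStrategy.body(request)}</html>`;
    }
}

const searchBody = { body: (request) => `results for ${request}` };

const before = new SearchResponder().respond('kata');
const after = new ComposedResponder(searchBody).respond('kata');
```

Both versions produce the same response, which is what the existing tests should keep confirming at each step of the refactoring.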

My pair and I got on fairly well with the refactoring, and by the end of the session we were on step 5 with the goal in sight. The experience was of using Eclipse’s refactoring tools extensively, and relying a great deal on the compiler. The tests we had to lean on took a minute and a half to run, and actually, the tests for the classes we were working on were more mini-integration tests than unit tests as such. It meant there were relatively few updates to the tests as we did the refactoring, but the feedback loop was slow. I thought that was really interesting, and was wondering how the experience of the refactoring would change in a language like Python. There you don’t have a compiler, or very much help from refactoring tools.

So after the workshop, I set about trying to construct a similar problem in Python. Perhaps understandably, I didn’t want to translate the whole of Fitnesse to Python (!), so I tried to re-write only the elements of it essential to this exercise. You can have a look at what I’ve come up with in my new repo “WikiSearchKata”. I’m still working on preparing this properly as an exercise (the instructions are still rather thin), but I plan to try it out at a GothPy meeting sometime soon.

After the conference sessions had ended, we were treated to a guided tour of the National Museum of Computing which was, for me, the highlight of the day! Our enthusiastic guide showed us all sorts of ancient computers and storage devices and punch cards… a few I recognized from my childhood. My dad used to bring home old punch cards and my mum used to write her shopping lists on them when she went to the supermarket. They had a 48K ZX Spectrum with rubber keys – just the same as the one I wrote my first program on! They had a CRAY supercomputer similar to the one I remember seeing once when I visited my dad’s work as a child. It’s a similar size to (the outside view of) a Tardis, with a big red button on the front. I don’t think we found out what the red button does, but the guide did say we probably have more computing power in the smartphone in our pocket! I found the changes in storage capacity actually even more impressive. They had these washing-machine sized boxes and dinner-plate sized metal disks that together made a hard drive. I think it held something like 4K.

The highlight of the tour was the WITCH computer – the oldest working computer in the world. It was brilliant! You could actually see what it was doing while it read in a paper tape punched with holes – the program – and loaded values into registers and did calculations. It made this fantastic whirring noise as it ran, and had all these little whizzy flashing lights. It works in decimal rather than binary, so each number is represented by a little “dekatron” – a glass tube with a red light inside, that moves between positions 0-9 in a circle. So you can read which number is in the register by looking at the position of each light in the array. They also had this little button you could press to make it step through the program one instruction at a time. I got to press it, and single-step a computer from 1951!

Compared with other conferences I’ve been to, this one was rather short, just one day, and with rather long sessions – half or whole day. It was hard work coding and facilitating all day, but in general there were very interesting people and coding exercises. A second day would have made the trip more worthwhile for me. In any case, my thanks to Jon Dickinson for organizing it.

I’ve been working on this Kata “Gilded Rose” at a few different coding dojos lately. There is even a video of a session I did at the “Tampere Goes Agile” conference recently. In the video, you can see me talking about my Principles of Agile Test Automation, which I have just written about, and updated in my last blog post.

I think these test automation principles are useful to think about when you’re doing the Gilded Rose kata. The basic plot of the Kata is that you’ve just been hired to look after an existing system, and the customer wants a new feature. Having a look at the code, you can see you’re going to want to refactor it a little before adding the new feature, and before you do that, you’re going to want some automated tests.

So the first part of the Kata is to add automated tests to the existing code. You’ve got a requirements document the customer has given you, and you can use it to identify test cases. You’ve also got the code which you can read and execute and work out what it does. The customer is happily using the code in production right now, so you can assume that the behaviour it has is the behaviour they want to keep, whatever it says in the requirements document. (hint!)

Warning – spoilers lie ahead! You should probably try the Gilded Rose kata for yourself before reading on!

When I’ve done this exercise with various groups, I’ve spent a lot of time discussing with people how to make their test cases really readable, and express the requirements clearly, and at the same time useful as regression protection when refactoring the code later.

When you design a test suite you have two main aims – to help you understand what the code should do (and what it does now), and to protect you from regression failures when you update it. It can be a bit tricky to do both with the same test suite. If you focus solely on describing the requirements in an executable way, you tend to miss edge cases and there are gaps in the regression protection. If you focus only on regression protection, you’ll spend time analysing the edge cases, and measuring code coverage to see how well you’re doing, but the test cases can become quite hard to read and understand.

You can see for yourself by comparing this test case by Bobby Johnson with this text-based approval test. (It was written by several people at a GothPy meeting). Bobby’s test case is extremely readable and expresses the requirements clearly. He’s done pretty well on the edge cases, but I think he’s missing one or two*. With the text-based approval tests, it’s not so easy to understand what the underlying business rules are, although the regression protection is very good.
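To show the two styles side by side, here is a small sketch. The `updateBackstagePass` function is a stand-in written from the kata's requirements as commonly stated (quality rises faster as the concert approaches, drops to zero afterwards, and never exceeds 50); it is not the kata's code, and neither half is Bobby's test or the GothPy approval test.

```javascript
// Stand-in for a single Gilded Rose rule (backstage passes). An assumption
// for illustration only, not the kata's actual implementation.
function updateBackstagePass(item) {
    let quality = item.quality;
    if (item.sellIn <= 0) {
        quality = 0;                      // worthless after the concert
    } else {
        quality += item.sellIn <= 5 ? 3 : item.sellIn <= 10 ? 2 : 1;
        quality = Math.min(quality, 50);  // quality is capped at 50
    }
    return { sellIn: item.sellIn - 1, quality };
}

// Style 1: a readable, requirement-style check for one edge case.
// "Backstage pass quality does not increase past 50."
const capped = updateBackstagePass({ sellIn: 3, quality: 49 });

// Style 2: text-based approval. Print the state over several days and
// compare the whole text with a previously approved version.
function simulate(item, days) {
    const lines = [];
    for (let day = 0; day <= days; day++) {
        lines.push(`day ${day}: sellIn=${item.sellIn} quality=${item.quality}`);
        item = updateBackstagePass(item);
    }
    return lines.join('\n');
}

const report = simulate({ sellIn: 2, quality: 45 }, 3);
```

The first style names the rule it checks. The second covers the cap and the drop to zero in one go, but a reader has to work out the rules from the numbers.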

When I do this kata with a group, we spend some time discussing the various test cases we’ve come up with, and showing them on the projector. When we did this last week at the Booster Conference, people commented that showing these different test cases had given them a better understanding of “readability” and “regression protection”, and many went on to improve their test suites.

Once you’re reasonably happy with your test suite, the next task is to do the refactoring and add the new feature. How useful are your test cases for regression protection? It’s very easy to make refactoring mistakes in this kata, and you will be testing your tests! You may discover while refactoring that there are more test cases that you want to add. Version control can be pretty useful, so you can run the new test cases against the original code.

There’s also an interesting restriction on your refactoring options – the “Item” class is owned by a nasty-sounding goblin and he doesn’t want you to change his code, so if you do, you have to be prepared for some serious consequences! When comparing refactored solutions at the end of the dojo, this is often an interesting discussion point – did you change the Item class? Is your new design so great that you’re prepared to argue with the goblin for it?!

I haven’t tried this, but I would actually like to try running the text-based approval test against all the refactored solutions at the end of the coding dojo, as input to the retrospective. I think this test covers all the edge cases very well, and would reveal any refactoring mistakes that were not caught by the tests people had developed themselves. That would be interesting feedback to have!

If you haven’t tried the Gilded Rose kata yourself, I do recommend it for practicing writing good test cases. I’d be happy to get a pull request from you if you want to translate the exercise into your favourite programming language, or you can do it in the original C#, as Bobby suggests.

If you’re interested in taking part in a coding dojo with me, I’ll be at several conferences later this year: ACCU in Bristol, XP2013 in Vienna and Test Automation Day in the Netherlands.

* I believe he’s missing a check that the quality of backstage passes doesn’t increase past 50

The Cyber-Dojo tool was designed by Jon Jagger as an environment where you can practice your coding skills. I’ve used it a few times now with groups at coding dojos and code retreats, and I think it’s a pretty useful tool for those contexts. (See also my last post which talks about using Cyber-Dojo during Global Day of Code Retreat).

One of the advantages of Cyber-Dojo for a Coding Dojo, (or Code Retreat), is that you don’t waste much time at the start of a coding session setting up a coding environment. The session facilitator creates a Cyber-Dojo instance in advance, and puts the practice-id up on a whiteboard or projector where everyone can see it. Participants just point their browsers at cyber-dojo.com, enter the practice-id, and very quickly get coding.

Cyber-Dojo supports about a dozen programming languages, and has starting positions set up for about 30 code katas. What is less well known is that it also allows you to set up any kata or starting position you like. I thought I’d take this opportunity to create some documentation for this feature:

  1. Create a new cyber-dojo instance by going to http://cyber-dojo.com/ and pressing “setup”
  2. Select the programming language you want to use
  3. Select “Verbal” from the list of katas
  4. Click “OK”, then make a note of the “practice-id” – it’s also in the url. Press “Start” to enter this cyber-dojo instance.
  5. Edit the code files with your starting position, and update the instructions with the details of the kata exercise. Basically get the cyber-dojo instance set up to the position you want people in your Coding Dojo to start from. Run the tests as often as you like until you have everything as you want it.
  6. Click on the “fork” icon on the left hand side to create a new cyber-dojo instance starting from this position, and note the new practice-id. You can give this id to your Coding Dojo participants.
  7. You can also publish a url that will automatically create a new cyber-dojo instance from this position, so people can create their own cyber-dojo instances. The form of the url is:
http://cyber-dojo.com/forker/fork/(practice-id)?avatar=(your animal)&tag=(number of the traffic light to fork)
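If you publish several of these links, a small helper can assemble them. This is a sketch that only follows the URL form given in step 7; the function name `forkUrl` and the sample values ('ABC123', 'lion', 5) are placeholders, not a real practice-id, and the sketch does not check whether the site accepts the resulting URL.

```javascript
// Builds a fork URL in the form described in step 7.
// practiceId, avatar and tag are supplied by the caller.
function forkUrl(practiceId, avatar, tag) {
    // URLSearchParams takes care of escaping the query values.
    const query = new URLSearchParams({ avatar, tag: String(tag) });
    return `http://cyber-dojo.com/forker/fork/${encodeURIComponent(practiceId)}?${query}`;
}

// Placeholder values for illustration only.
const url = forkUrl('ABC123', 'lion', 5);
// url is 'http://cyber-dojo.com/forker/fork/ABC123?avatar=lion&tag=5'
```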

I’ve used this feature to set up a number of Refactoring katas in cyber-dojo, for example the Tennis kata:

Perhaps sometime Jon will add a page on cyber-dojo.com that lists these kinds of additional available starting positions, (hint!), but for now, you’ll have to keep track of them yourself.

By the way, do let me know if you try out this Tennis Refactoring Kata in Cyber-Dojo and how you get on with it. I welcome comments on this blog or on my GitHub repo.