Posts tagged ‘Code Kata’

Note: this article was originally published on ProAgile’s blog

I only recently joined ProAgile and I feel we’ve got off to a great start. Last week my colleague Fredrik Wendt and I facilitated and hosted a Code Retreat event for a diverse group of software crafters from local companies.

at the start of the day

It’s the 10th anniversary of the “Global Day of Code Retreat” event and there were over 150 locations around the globe holding one last week. It’s not the first time I’ve facilitated a Code Retreat but it is the first time I’ve done one in Gothenburg, (the others have been in Stockholm). This year Fredrik and I were keen to bring together some local people, and contribute to the community here. ProAgile has recently invested in a fantastic venue for courses and training at our offices, and we want to take full advantage of it.

We contacted several local companies and specifically invited them to advertise on the event sign up page that they were going to send people. We were delighted when Boeing (Jeppesen)PageroMeltwaterSpotify and Greenbyte all agreed to this. Their intention to participate I’m sure helped get the attention of other local software crafters when we put out a general invitation with open sign-ups.

In the end we were a very diverse group drawn from many local companies both large and small. We had a game designer who favoured C++, a Scala enthusiast, a Kotlin proponent, a Python trainer, several Javascript developers, a self-taught PHP and Python developer, a CEO who still codes occasionally, and many local Java, Python and C# developers. Everyone got to pair program with people from other language communities and backgrounds.

Fredrik and I decided to hold a standard code retreat event and based our planning on this article: “Structure of a Code Retreat”. We held 5 sessions working on the “Game of Life” Code Kata through the day. In each session, people work in pairs starting from an empty editor, and come up with a new solution to the problem. Between each session you hold a short retrospective with your pairing partner before taking a short break and switching to a different pairing partner.

pair programming

In the morning sessions the focus is on getting to know the problem. We all try to find a good way to use Test-Driven Development and simple design rules to get a clean, understandable, flexible design. The sessions after lunch are about using this Game of Life exercise as a vehicle for trying out something new or otherwise challenging. We had some pairs using unfamiliar programming languages and environments. Some pairs were following strict design rules like ‘no if statements, no loops’ and ‘no function longer than 1 line’. Some people accepted the challenge to work on the problem as a mob; we were 4-6 people using one computer with me facilitating.

mob programming

Throughout the day there were opportunities to reflect on the way we write code: how design guidelines and programming languages and Test-Driven Development and personal preferences all interact to influence the code we end up with. In normal project work with deadlines and tricky business rules and security constraints and so on, you might not get much time to consider the fundamentals of what good software design looks like. Test-Driven Development is a complex skill to learn – it has many facets. I think all programmers benefit from taking some time occasionally to work on practice exercises and discuss software design principles with peers.

At the end of the day we reflected all together in a closing circle. Everyone had the opportunity to say a few words to the group about what they were taking away with them. When you’ve spent time working on a Code Kata, I think there is a complexity gap you need to bridge – you want to carry over the principles and skills you gained from that simple exercise to your actual production code situation. It’s useful to think about what you’ve learnt and explain how you want to apply it later.

In the closing circle, I was happy to hear several positive comments about Test-Driven Development and an intention to use it more. Some people had seen a new programming language and had ideas about when it could be useful to them. Several people mentioned they appreciated the chance to program with people outside of their company and comfort zone. It was a good day, and I don’t think it will be the last such event we hold in Gothenburg.

If you’d like to know about the next events hosted by ProAgile, take a look at our course calendar, keep an eye on ProAgile’s social media channels and perhaps join one of our networking groups.

whole group at the end

Note: this article was first published on Praqma’s blog

Writing tests for ‘Theatrical Players’

When I read Fowler’s new ‘Refactoring’ book I felt sure the example from the first chapter would make a good Code Kata. However, he didn’t include the code for the test cases. I can fix that!

Martin Fowler recently published a new edition of his classic book ‘Refactoring’. The first chapter is a worked example of how he would go about refactoring a small piece of code that he needs to update with some new functionality.

This is a really common scenario faced by software developers daily. It’s impossible to anticipate every future change and enhancement, so the design of the code will often not be extensible in the direction you need to take it. Refactoring the design in order to make a proposed update easier is an essential skill.

I like using small exercises to practice these kinds of skills, called ‘Code Katas’. I’ve got a whole collection of them in my book ‘The Coding Dojo Handbook’, and more on my github page. When I saw Fowler’s example I felt it would make a really good kata. The particular refactoring techniques he demonstrates are techniques that many developers would benefit from learning and practicing.

I asked Fowler for permission to use his code, which he kindly granted, then I set about transforming this example into a kata.

Creating a new Code Kata

This is Fowler’s example javascript code, which produces a statement for a customer. The new functionality that’s needed is to produce the same information but formatted as HTML.

statement.js

function statement (invoice, plays) {
    let totalAmount = 0;
    let volumeCredits = 0;
    let result = `Statement for ${invoice.customer}\n`;
    const format = new Intl.NumberFormat("en-US",
        { style: "currency", currency: "USD",
            minimumFractionDigits: 2 }).format;

    for (let perf of invoice.performances) {
        const play = plays[perf.playID];
        let thisAmount = 0;
        switch (play.type) {
            case "tragedy":
                thisAmount = 40000;
                if (perf.audience > 30) {
                    thisAmount += 1000 * (perf.audience - 30);
                }
                break;
            case "comedy":
                thisAmount = 30000;
                if (perf.audience > 20) {
                    thisAmount += 10000 + 500 * (perf.audience - 20);
                }
                thisAmount += 300 * perf.audience;
                break;
            default:
                throw new Error(`unknown type: ${play.type}`);
        }
        // add volume credits
        volumeCredits += Math.max(perf.audience - 30, 0);
        // add extra credit for every ten comedy attendees
        if ("comedy" === play.type) volumeCredits += Math.floor(perf.audience / 5);
        // print line for this order
        result += ` ${play.name}: ${format(thisAmount/100)} (${perf.audience} seats)\n`;
        totalAmount += thisAmount;
    }
    result += `Amount owed is ${format(totalAmount/100)}\n`;
    result += `You earned ${volumeCredits} credits\n`;
    return result;
}

As you can see, the logic for formatting the statement is all mixed up with the calculation logic, so as it stands it’s not straightforward to add the HTML formatting feature. What’s needed is to change the design and detangle the logic, via refactoring.

Fowler mentions that the first step in refactoring is always the same – to ensure you have a solid set of tests for that section of code.

However, he did not include the test code he used in the book. (It’s a book primarily about refactoring, and I guess including test code would have taken attention away from that topic). To make this example into a Kata, I wanted to include some tests.

The testing approach I favour in this kind of situation is approval testing. The code exists already and it works, so the easiest way to add a regression test is to find some test data, exercise the code, and approve the result.

This is different from a test-driven development approach where you define the test case before the code exists. (You can use approval testing in that kind of scenario too, but that’s a topic for another article). I usually find it much quicker and easier to add an approval test to existing code than a classic assertion-based test case.

Creating a first approval test

Fowler supplies us with some test data – a sample invoice:

invoice.json
{
  "customer": "BigCo",
  "performances": [
    {
      "playID": "hamlet",
      "audience": 55
    },
    {
      "playID": "as-like",
      "audience": 35
    },
    {
      "playID": "othello",
      "audience": 40
    }
  ]
}

And a list of plays:

plays.json
{
  "hamlet": {"name": "Hamlet", "type": "tragedy"},
  "as-like": {"name": "As You Like It", "type": "comedy"},
  "othello": {"name": "Othello", "type": "tragedy"}
}

The function that we want to test, statement, returns the information to send to the customer, formatted as a plain text string. This statement string is perfect to use as the approved value in the approval test.

This is the test case I designed, it uses the Jest testing framework:

statement.test.js
test('example statement', () => {
   const invoice = JSON.parse(fs.readFileSync('test/invoice.json', 'utf8'));
   const plays = JSON.parse(fs.readFileSync('test/plays.json', 'utf8'));
   const statement_string = statement(invoice, plays);
   expect(statement_string).toMatchSnapshot();
});

First, I read both data files and parse them into javascript objects. Then I call the function we want to test – “statement” and store the result in the ‘statement_string’ constant. The last line of the test checks that this result matches the approved value. What it does is compare the actual value against a stored value which I checked and approved earlier.

This is kept in another file:

__snapshots__/statement.test.js.snap
exports[`example statement 1`] = `
"Statement for BigCo
Hamlet: $650.00 (55 seats)
As You Like It: $580.00 (35 seats)
Othello: $500.00 (40 seats)
Amount owed is $1,730.00
You earned 47 credits
"
`;

Is it called Snapshot or Approval testing?

The Jest testing framework calls the approved value a ‘snapshot’. I prefer to talk about ‘approval’ testing and comparing with an ‘approved’ value. I think these words get to the heart of what you’re doing – deciding what is good enough to test against.

‘Snapshot’ as a word just implies it will be updated again soon. I like to emphasize the human agency involved in inspecting and deciding whether to approve a new value or not.

Is this test good enough to support refactoring?

This test goes a long way to giving us the confidence we need to start refactoring the ‘statement’ code. It generates a full statement with test data that seems realistic enough for this context.

I think it’s a good idea to do a little more analysis of whether this test is sufficient to support the refactoring we want to do. One way to do that is to run the test and measure code coverage. Lines of code that are not covered could be refactored away, or have bugs inserted into them, without the tests failing.

The coverage data shows only one line of code which is not covered by this test:

Line 28 in statement.js:
throw new Error(`unknown type: ${play.type}`);

Our test case is pretty good, to only have one line uncovered. The test data supplied by Fowler includes different kinds of play that already exercise several paths through the code.

This code path is different, however. If you end up on this line then the function aborts, and doesn’t produce a statement. We can’t tweak the data in our existing code to make it also execute this line and still produce a statement.

What we need is an additional test, one that uses an unsupported play type. I invented some new data that continues the Theatrical theme:

new_plays.json
{
 "henry-v": {"name": "Henry V", "type": "history"},
 "as-like": {"name": "As You Like It", "type": "pastoral"}
}

invoice_new_plays.json
{
  "customer": "BigCoII",
  "performances": [
    {
      "playID": "henry-v",
      "audience": 53
    },
    {
      "playID": "as-like",
      "audience": 55
    }
  ]
}

This is the test case that uses them:

statement.test.js
test('statement with new play types', () => {
    const invoice = JSON.parse(fs.readFileSync('test/invoice_new_plays.json', 'utf8'));
    const plays = JSON.parse(fs.readFileSync('test/new_plays.json', 'utf8'));
    expect(() => {statement(invoice, plays)}).toThrow(/unknown type/);
});

This test is not an approval test, it’s a normal assertion-based test. It checks that the error is thrown and the message contains the string ‘unknown type’.

I could have instead used an approval test to check the error message. In this case I chose not to do that. The string to be checked is quite short, and there is little advantage in storing it in a separate file away from the test code.

Confidence in the tests

With that test added, we’re more confident we can use this test suite to support refactoring.

We have 100% coverage now, after all. Unfortunately, that is no guarantee that our tests are perfect, or even reasonably good. One thing I like to do is a spot of (manual) mutation testing. to assess how good my tests are.

I edit the code to insert a bug, then run my tests. If they fail, then that’s a good sign. I might try adding bugs in several different places to increase my confidence.

When I try it on this code, I find any change that affects the way the statement is presented gives me a failing test. I can also easily get a nice diff showing exactly what I broke.

tests-screenshot

I’m now feeling pretty confident these tests will be good enough to support refactoring.

This is a good starting point for the kata. From here you can practice all the refactorings demonstrated by Fowler in his book, with a safety net to alert you if you make a mistake.

When you come to implement the HTML rendering feature, it should be straightforward to add a new test for it too.

Adding more languages to your tests

In the book, the example language is Javascript, which is widely-used and understood. I often work with teams who also use languages like Java, C# and Python, and the refactoring techniques are just as relevant in those languages. Once I had this JavaScript version working, I added a translation or two. One of the reasons I keep my exercises publicly available on GitHub is so that people can send me pull requests.

If you’d like to do this exercise in a programming language that’s not supported yet, you can fix that for everyone 🙂 I particularly like getting pull requests with translations of my exercises.

The published code kata

The starting point for this new exercise ‘Theatrical-Players-Refactoring-Kata’ is now published and available for download.

The first chapter of ‘Refactoring’ containing a worked solution to this problem is available as a free sample. You have every opportunity to practice and learn these testing and refactoring techniques now.

Happy coding!

Last week I met Woody Zuill when he came to Göteborg to give a workshop about Mob Programming.  At first glance mobbing seems really innefficient. You have a whole team of maybe 6-7 people sitting together all day, every day, programming at one computer. How could that possibly be a productive way to work?

I’m pretty intrigued by the idea. It reminds me of the reaction people had to eXtreme Programming when they first heard about it back in like 2000. Is it just an off-putting name for something that could actually be quite brilliant? There are certainly some interesting people who I respect, talking warmly about it. The thing is, when it comes to working together with others, programming at one computer, I’ve had some mixed results. Sometimes good, sometimes less good.

I’ve done some pair programming, and found it worked well with some people and not others. It’s generally worked much better when I’ve paired with someone who has a lot of useful knowledge that I’ve lacked. Either about the language and frameworks we’re using, or about how the software will be used – ie the problem domain. I’ve found it’s worked a lot less well in other situations, with other people. I find it all too easy to hog the keyboard, basically. So I do pair, but not that often.

With ‘Randori’ style coding dojos, the idea is that you have a pair at the front who code, and switch one person every 5-7 minutes, or every test case. I’ve facilitated a lot of these sessions, and I find them especially useful for quickly getting a group of people new to TDD all up and running and pointed in the same direction. Recently I’ve been doing it only for the first session or two, instead having everyone working in pairs most of the time. As a facilitator, this is far easier to handle – much less stressful. Managing the interactions in a bigger group is difficult, both to keep the discussions on track, answer questions about the exercise, and to maintain the pair switching. I also find the person at the keyboard easily gets stressed and intimidated by having everyone watching them, and often writes worse code than they are capable of. So I do facilitate whole-group Randori sessions, but not that often.

So I wanted to find out if mob programming had similar strengths and weaknesses. In what situations does it excel, and when are you better off pairing or working alone? Would I find it stressful, like a Randori? Would I want to drive most of the time, as in pair programming?

Woody turns out to be a really gentle person, about as far away from a ‘hard sell’ as you can get. He facilitaed the session masterfully, mixing theory and practice, telling us stories about what he’s found to work and why. I am confident he knows a lot about software development in general, mob programming in particular, and he is very humble about it.

The most important insight I gained from the session, was that I need to get good at ‘strong-style pairing’. That seems to me to be at the heart of what makes Mob programming work, and not be stressful like the Randori sessions I’ve been doing. I think it will also help me to get pair programming to work well in a wider variety of situations.

I have heard about ‘strong-style pairing’ before, from Llewellyn Falco, who invented it, but I havn’t really experienced it very much before, or understood how important it is. Do go read his blog post about it, for a fuller explanation of what I’m talking about.

The basic idea is “For an idea to go from your head into the computer it MUST go through someone else’s hands”. That forces you to express your ideas really clearly, in words, first. That is actually pretty difficult when you havn’t done it much before. The thing is, that if you do that, then you open up your programming ideas for discussion, critique, and improvement, in a way that doesn’t happen if they go straight from your head through your own hands into the computer. I think if I get better at ‘strong-style pairing’ it will help me not only with Mob programming, but also with pairing and facilitating Randori dojo sessions. Probably also with programming generally.

Pairing has worked best for me when I’ve been driving, and my navigator is good at expressing ideas for me to understand and then type. I think I need to get good at that navigator role for the times when I’m the one with more ideas. I need to learn that when I think ‘I have an idea about to solve this problem!’ I should hand over the keyboard, not grab it. I need to learn to express my coding ideas verbally. Then I will be able to pair productively with a wider range of people.

Randori sessions are much less stressful if the driver has less to do. If the responsibility is shared more evenly with the Navigator, then I think everyone will write better code. As a facilitator, I have less group dynamics to worry about if the designated navigator is in control, and everyone else talks less. (Woody advised that, at least at first, you should ban anyone else in the mob from giving the Driver ideas about what to type, so the Navigator learns the role.)

So thanks, Woody, for taking the time to come to Göteborg, sharing your experiences and facilitating a great workshop. I learnt a lot, and I think Mob Programming and Strong-style pairing could quite possibly be some of those brilliant ideas that change the way I write code, for the better.

Recently I became intrigued with something Seb Rose said on his blog about ‘recycling’ tests. He talks about first producing a test for a ‘low fidelity’ version of the solution, and refining it as you learn better what the solution should look like. In a follow-up post he deals with some criticisms that other posters had of the technique, but actually seems to agree with Alistair Cockburn, that it’s probably not important enough a technique to need a name. I disagree, it’s a technique I use a lot, although most often when using an approval testing approach. I prefer to call it simply iterative development. A low fidelity version of the output that is gradually improved until the customer/product owner says “that’s what I want” is iterative development. It’s a very natural fit with approval testing – once the output is good enough to be approved, you check it in as a regression test that checks it never changes. It’s also a very natural fit for a problem where the solution is fundamentally visual, like printing a diamond. I also find it very helpful when the customer hasn’t exactly decided what they want. In this kata, it’s not such an issue, but in general, quickly putting out a low-fidelity version of what you think they want and then having a discussion about how to proceed can save you a lot of trouble.

The other posters seemed to be advocating a TDD approach where you find ‘universal truths’ about the problem and encode them in tests, so you never have to go back and revisit tests that you made pass earlier. In order to take small steps, you have to break down the problem into small pieces. Once you have identified a piece of the problem and solved it, it should stay solved as you carry on to the next piece. That seems to be what I would call ‘incremental’ development.

There’s a classic explaination of the difference between iterative and incremental that Jeff Patton came up with a few years ago using the Mona Lisa painting. It’s a good explaination, but I find experiencing abstract concepts like this in an actual coding problem can make a world of difference to how well you can reason about and apply them. So I thought it would be interesting to look at these two approaches to TDD using the Diamond Kata.

I have a regular coding dojo with my team these days, so a few weeks ago, I explained my thinking about incremental and iterative, showed them Jeff Patton’s picture, and asked them to do the kata one way or the other so we could compare. I probably didn’t explain it very well, because the discussion afterwards was quite inconclusive, and looking at their code, I didn’t think anyone had really managed to exclusively work one way or the other. So I decided to try to force them into it, by preparing the test cases in advance.

I came up with some starting code for the exercise, available here. I have two sets of unit tests, the first with a standard incremental approach, where you never delete any test cases. The second gets you to ‘recycle’ tests, and work more iteratively towards the final solution. In both cases, you are led through the problem in small steps. The first and last tests are the same, the difference is the route you take in between.

When I tried this exercise with my team, it went a lot better. I randomly assigned half the pairs to use the ‘iterative’ tests, and the rest to use ‘incremental’ tests. Then after about 45-55 minutes, I had them start over using the other tests. After another 45 minutes or so I stopped them and we had a group discussion comparing the approaches. I asked the ‘suggested questions for the retrospective‘ I’d prepared, and it seemed to work. Having test-driven the solution both ways, people could intelligently discuss the pros and cons of each approach, and reason about which situations might suit one or the other.

As Seb said, ‘recycling tests’ is a tool in your developer toolbox, and doing this kata might help you understand how to best use that tool. I’d love to hear from you if you try this excercise in your coding dojo, do leave a comment.

I was recently at the Software Craftsmanship Conference at Bletchley Park in the UK. This is a one-day conference for software developers, attended by around 150 programmers. All proceeds from the event go to support Bletchley Park, which is of historical interest to programmers in particular – the site where Alan Turing and others cracked the enigma code in the 2nd world war. It was the fifth time this conference has been run, and the first time I attended. This is a short experience report.

In the morning I ran a workshop titled “Outside-In, with or without Mocks?“. We were about 50 people in the Ballroom in the Mansion, a very grand room, and it was really great to see so many people working in pairs at laptops, puzzling over some code and tests and how to do Test Driven Development. We were looking at a code kata I’ve designed called “Train Reservation“. It’s in no way a beginner exercise, and the crowd at Bletchley seemed to get on with it rather well on the whole. I’m just sorry I didn’t get round to talk to each pair very often, with 24 pairs I only had a couple of conversations with each during the 2 hour session!

I set up the exercise more or less to force people to use some kind of mock, fake or stub to replace the Booking Reference Service and the Train Data Service, because I am interested in how different people use these. I’ve observed that some programmers avoid using test doubles whenever possible, while others use them frequently. I’ve also observed that some people prefer to work outside-in, starting with a guiding test, while others prefer to start with the business rules at the heart of the problem and work outwards from there. At this particular workshop, there were all sorts of approaches being used. Some started with the guiding test and stubbed the services. Others started with the business logic around the seat selection rules. Different approaches, as I had hoped! Overall I feel encouraged that this exercise is a useful one, and people seemed to get on better with it than the last time I ran it, at XP2013. It’s till rather too big of a problem to tackle in a half day workshop though. I’ll be updating it some more before I run it again, although I don’t have any fixed plans for when that will be yet.

In the afternoon, I went to a session by Ivan Moore and Mike Hill, “Inheritance to Composition“. They gave us a demo of this particular refactoring using a very simple codebase, before launching us into a much more complex one – Fitnesse (starting from the branch “revised-ResponderFactory”). The idea was to take some classes that were using Inheritance – specifically the Template Method pattern – and convert them to instead use Composition – specifically the Strategy pattern. They also helpfully provided us with a sheet of instructions – 6 steps to complete the refactoring with minimal risk and code breakage.

My pair and I got on fairly well with the refactoring, and by the end of the session we were on step 5 with the goal in sight. The experience was of using Eclipse’s refactoring tools extensively, and relying a great deal on the compiler. The tests we had to lean on took a minute and a half to run, and actually, the tests for the classes we were working on were more mini-integration tests than unit tests as such. It meant there were relatively few updates to the tests as we did the refactoring, but the feedback loop was slow. I thought that was really interesting, and was wondering how the experience of the refactoring would change in a language like Python. There you don’t have a compiler, or very much help from refactoring tools.

So after the workshop, I set about trying to construct a similar problem in Python. Perhaps understandably, I didn’t want to translate the whole of Fitnesse to Python, (!), so I tried to re-write only the elements of it essential to this exercise. You can have a look at what I’ve come up with in my new repo “WikiSearchKata“. I’m still working on preparing this properly as an exercise, (the instructions are still rather thin), but I plan to try it out at a GothPy meeting sometime soon.

After the conference sessions had ended, we were treated to a guided tour of the National Museum of Computing which was for me, the highlight of the day! Our enthusiastic guide showed us all sorts of ancient computers and storage devices and punch cards… a few I recognized from my childhood. My dad used to bring home old punch cards and my mum used to write her shopping lists on them when she went to the supermarket. They had a 48K ZX spectrum with rubber keys – just the same as the one I wrote my first program on! They had a CRAY supercomputer similar to the one I remember seeing once when I visited my dad’s work as a child. It’s a similar size to (the outside view of) a Tardis, with a big red button on the front. I don’t think we found out what the red button does, but the guide did say we probably have more computing power in the smartphone in our pocket! I found the changes in storage capacity actually even more impressive. They had these washing-machine sized boxes and dinner-plate sized metal disks that together made a hard drive. I think it held something like 4K.

The highlight of the tour was the WITCH computer – the oldest working computer in the world. It was brilliant! You could actually see what it was doing while it read in a paper tape punched with holes – the program – and loaded values into registries and did calculations. It made this fantastic whirring noise as it ran, and has all these little whizzy flashing lights. It works in decimal rather than binary, so each number is represented by a little “dekatron” – a glass tube with a red light inside, that moves between positions 0-9 in a circle. So you can read which number is in the registry by looking at the position of each light in the array. They also had this little button you could press to make it step through the program one instruction at a time. I got to press it, and single-step a computer from 1951!

Compared with other conferences I’ve been too, this one was rather short, just one day, and with rather long sessions – half or whole day. It was hard work coding and facilitating all day, but in general very interesting people and coding exercises. A second day would have made it more worthwhile my making the trip. In any case, my thanks to Jon Dickinson for organizing it.