I’ve previously written about Agile test automation principles, and since then I’ve had some interesting discussions with people that have led me to revise them in this article. In particular, Seb Rose wrote about his 6 principles of unit testing and pointed out some issues with mine. So this article is an update on the previous one, and I’m hoping this will spark further interesting discussions!

I feel like I’ve spent most of my career learning how to write good automated tests in an agile environment. When I downloaded JUnit in the year 2000 it didn’t take long before I was hooked – unit tests for everything in sight. That gratifying green bar is near-instant feedback that everthing is as expected, my code does what I intended, and I can continue developing from a firm foundation.

Later, starting in about 2002, I began writing larger granularity tests, for whole subsystems; functional tests if you like. The feedback that my code does what I intended, and that it has working functionality has given me confidence time and again to release updated versions to end-users.

I was not the first to discover that developers design automated functional tests for two main purposes. Initially we design them to help clarify our understanding of what to build. In fact, at that point they’re not really tests, we usually call them scenarios, or examples. Later, the main purpose of the tests becomes to detect regression errors, although we continue use them to document what the system does.

When you’re designing a functional test suite, you’re trying to support both aims, and sometimes you have to make tradeoffs between them. You’re also trying to keep the cost of writing and maintaining the tests as low as possible, and as with most software, it’s the maintenance cost that dominates. Over the years I’ve begun to think in terms of four principles that help me to design functional test suites that make good tradeoffs and identify when a particular test case is fit for purpose.

Book-128Readability

When you look at the test case, you can read it through and understand what the test is for. You can see what the expected behaviour is, and what aspects of it are covered by the test. When the test fails, you can quickly see what is broken.

If your test case is not readable, it will not be useful, neither for understanding what the system does, or identifying regression errors. When it fails you will have to dig though other sources outside of the test case to find out what is wrong. Quite likely you will not understand what is wrong and you will rewrite the test to check for something else, or simply delete it.

internet_128Robustness

When a test fails, it means there is a regression error, (functionality is broken), or the system has changed and the tests no longer document the correct behaviour. You need to take action to correct the system or update the test, and this is as it should be. If however, the test has failed for no good reason, you have a problem: a fragile test.

There are many causes of fragile tests. For example tests that are not isolated from one another, duplication between test cases, and dependencies on random or threaded code. If you run a test by itself and it passes, but fails in a suite together with other tests, then you have an isolation problem. If you have one broken feature and it causes a large number of test failures, you have duplication between test cases. If you have a test that fails in one test run, then passes in the next when nothing changed, you have a flickering test.

If your tests often fail for no good reason, you will start to ignore them. Quite likely there will be real failures hiding amongst all the false ones, and the danger is you will not see them.

Speed-128Speed

As an agile developer you run your test suite frequently. Both (a) every time you build the system, (b) before you check in changes, and (c) after check-in in an automated Continuous Integration environment. I recommend time limits of 2 minutes for (a), 10 minutes for (b), and 60 minutes for (c). This fast feedback gives you the best chance of actually being willing to run the tests, and to find defects when they’re cheapest to fix, soon after insertion.

If your test suite is slow, it will not be used. When you’re feeling stressed, you’ll skip running them, and problem code will enter the system. In the worst case the test suite will never become green. You’ll fix the one or two problems in a given run and kick off a new test run, but in the meantime you’ll continue developing and making other changes. The diagnose-and-fix loop gets longer and the tests become less likely to ever all pass at the same time. This can become pretty demoralizing.

updatability.001Updatability

When the needs of the users change, and the system is updated, your tests also need to be updated in tandem. It should be straightforward to identify which tests are affected by a given change, and quick to update them all.

If your tests are not easy to update, they will likely get left behind as the system moves on. Faced with a small change that causes thousands of failures and hours of work to update them all, you’ll likely delete most of the tests.

Following these four principles implies Maintainability

Taken all together, I think how well your tests adhere to these principles will determine how maintainable they are, or in other words, how much they will cost. That cost needs to be in proportion to the benefits you get: helping you understand what the system does, and regression protection.

As your test suite grows, it becomes ever more challenging to adhere to all the principles. Readability suffers when there are so many test cases you can’t see the forest for the trees. The more details of your system that you cover with tests, the more likely you are to have Robustness problems – tests that fail when these details change.  Speed obviously also suffers – the time to run the test suite usually scales linearly with the number of test cases. Updatability doesn’t necessarily get worse as the number of test cases increases, but it will if you don’t adhere to good design principles in your test code, or lack tools for bulk update of test data for example.

I think the principles are largely the same whether you’re writing skinny little unit tests or fatter functional tests that touch more of the codebase. My experience tells me that it’s a lot easier to be successful with unit tests. As the testing thickness increases, the feedback cycle gets slower, and your mistakes are amplified. That’s why I concentrate on teaching these principles through unit testing exercises. Once you understand what you’re aiming for, you can transfer your skills to functional tests.

How can you use these principles?

I find it useful to remember these principles when designing test cases. I may need to make tradeoffs between them, and it helps just to step back and assess how I’m doing on each principle from time to time as I develop. If I’m reviewing someone else’s test cases, I can point to code and say which principles it’s not following, and give them concrete advice about how to make improvements. We can have a discussion for example about whether to add more test cases in order to improve regression protection, and how to do that without reducing overall readability.

I also find these principles useful when I’m trying to diagnose why a test suite is not being useful to a development team, especially if things have got so bad they have stopped maintaining it. I can often identify which principle(s) the team has missed, and advise how to refactor the test suite to compensate.

For example, if the problem is lack of Speed you have some options and tradeoffs to make:

  • Replace some of the thicker, slower end-to-end tests with lots of skinny fast unit tests, (may reduce regression protection)
  • Invest in hardware and run tests in parallel (costs $)
  • Use a profiler to optimize the tests for speed the same as you would production code (may affect Readability)
  • Use more fakes to replace slow parts of the system (may reduce regression protection)
  • Identify key test cases for essential functionality and remove the other test cases. (sacrifice regression protection to get Speed)

Strategic Decisions

The principles also help me when I’m discussing automated testing strategy, and choosing testing tools. Some tools have better support for updating test cases and test data. Some allow very Readable test cases. It’s worth noting that automated tests in agile are quite different from in a traditional process, since they are run continually throughout the process, not just at the end. I’ve found many traditional automation tools don’t lead to enough Speed and Robustness to support agile development.

I hope you will find these principles help you to reason about your strategy and tools for functional automated testing, and to design more maintainable, useful test cases.

Images Attribution: DaPino Webdesign, Lebreton, Asher Abbasi, Woothemes, Iconshock, Andy Gongea, FatCow from www.iconspedia.com

One of the great privileges of being the programme chair for Scandinavian Developer Conference is getting to choose the keynote speakers. This year, I’m delighted to present Dan North and Janice Fraser, both thought leaders in the field of software development. Although from different backgrounds and perspectives, they’re both accomplished at building software that delivers great business outcomes. I’d like to tell you a little about each person, and why I’ve invited them to Göteborg for SDC2013.

Dan North – a man full of intriguing ideas

I first came accross Dan North at a conference in 2007, talking about a topic I was very familiar with – unit testing – but using a whole new set of words. Behaviour Driven Development (BDD) intrigued me then, and still does now. How can switching the word “Test” for “Behaviour” and “AssertEqual” to “Should be” make such a difference to the way you end up designing your code?

In his famous article from 2006, “Introducing BDD” Dan explains that he found when he stopped talking about “Testing” and started instead used the word “Behaviour”, “… a whole category of coaching problems disappeared”. People understand more easily that defining the behaviour of the software is an important activity for the whole team, not just testers. It also changes the way you as a programmer think about your code, and helps you focus on what’s important.

BDD as an approach to software development is still being actively developed and written about, although Dan himself has largely stepped aside in favour of other thought leaders like Elizabeth Keogh, Chris Matts, Olav Maasen, and of course Gojko Adzic. Gojko wrote the hugely influential book “Specification by Example” which is all about having useful conversations about software behaviour, and expressing that in terms of executable examples – ie a lot like BDD. At about the time that book was being written, Dan himself chose a different road. He actually stepped out of the consultant life entirely for about two years, taking up a full time position developing software at a financial trading firm.

His latest ideas around “Accelerated Agile” to a large extent come from his experiences working in that high-powered trading environment. He wrote in his blog: “This team was the most insanely effective delivery machine I’ve ever been a part of”, and I find that particularly intriguing. He says that standard agile practices like Continuous Integration and maintaining a Product Backlog weren’t being used! So what exactly did they do in order to be so effective?

Dan has also famously opined that “Programming is not a Craft”, and argues that the “Software Craftsmanship Manifesto … [is] a spectacularly easy bandwagon to jump on”. He says he’d rather see, “…a call to arms to stop navel-gazing and treat programming as the skilled trade that it is.” So there. Dan certainly has some strong opinions, and when I’ve met him, he always seems to express himself with wit and intelligence.

I’m really looking forward to Dan’s keynote address on Tuesday 5th March. I’m intrigued to find out what “Patterns of Effective Delivery” is all about, and to hear his latest opinions on the practice of software development.

Janice Fraser – a pioneer of Lean User Experience

To a large extent, Silicon Valley is the epicenter of our whole industry. Many of the biggest and most influential software companies in the world are based there, and as an incubator with a friendly climate for startup companies, it is unparalleled, despite the efforts of many other regions around the globe to imitate their success.

Janice Fraser has been working in Silicon Valley for over 15 years, and she says in her CVI’ve seen a lot — bubbles, bursts, and fantastic acts of collaboration that have transformed literally billions of lives.” Yes, that’s right, billions of lives.

The latest trend coming out of the Valley is “Lean Startup”, a term coined by Eric Ries, and documented in his bestselling book “The Lean Startup: How Today’s Entrepreneurs Use Continuous Innovation to Create Radically Successful Business”.  I first heard about it in 2011, when Joshua Kerievsky, an early adopter of eXtreme Programming and successful entrepreneur, published an article “Agile vs. Lean Startup”. He says, “[Lean Startup] rocks. It rocks far more than Agile.” If it rocks far more than agile, then I find that pretty intriguing!

Janice Fraser is of course also an early adopter of “Lean Startup”, and has pointed out that the ideas in it are not all new. She saysThe Lean Startup, is a rediscovery of user centered design… [it] gives UX teams an unqualified mandate to make products customers love.

Janice herself is a serial entrepreneur, having led several startup companies. She says in her CV, “My proudest success is Adaptive Path, a leading product design firm. I was a founder and served as the company’s first CEO”. Adaptive Path is still successfully in business.

Janice is not shy about recounting her failures either, she’s written a candid report of how she started “Emmet Labs” in 2007 intending to change the world, right through to when she laid off all the staff in 2009. I think her article “7 things I did right with Emmet Labs” shows how much courage and determination it takes to build a company, and how resilient and clear-thinking Janice herself can be in a crisis.

Janice’s business these days is helping other startup companies to succeed: “Before you build anything, find customers, learn their needs & goals, and measure your progress towards your vision” – an extract from the marketing materials for her company, Luxr. She’s also just about to publish a book “The Lean Product Book: How Smart Teams Work Better”, which I guess will document the kind of advice she gives to her clients – all about Lean User Experience.

All this talk of startups and product development in Silicon Valley might seem a long way from chilly Göteborg and our IT industry, dominated by a few huge corporations. I think it’s just the kind of thing we need to hear about, though. Companies of any size need to renew themselves and develop great new products in order to flourish, and this is clearly an area where Janice is innovating and leading the world. I’m really looking forward to hear what she’s going to say in her keynote “Lean Startup Product Teams: Principles of Success”, on Monday 4th March.

The Cyber-Dojo tool was designed by Jon Jagger as en environment where you can practice your coding skills. I’ve used it a few times now with groups at coding dojos and code retreats, and I think it’s a pretty useful tool for those contexts. (See also my last post which talks about using Cyber-Dojo during Global Day of Code Retreat).

One of the advantages of Cyber-Dojo for a Coding Dojo, (or Code Retreat), is that you don’t waste much time at the start of a coding session setting up a coding environment. The session facilitator creates a Cyber-Dojo instance in advance, and puts the practice-id up on a whiteboard or projector where everyone can see it. Participants just point their browsers at cyber-dojo.com, enter the practice-id, and very quickly get coding.

Cyber-Dojo supports about a dozen programming languages, and has starting positions set up for about 30 code katas. What is less known, is that it also allows you to set up any kata or starting position you like. I thought I’d take this opportunity to create some documentation for this feature:

  1. create a new cyber-dojo instance by going to http://cyber-dojo.com/ and pressing “setup”
  2. Select the programming language you want to use
  3. Select “Verbal” from the list of katas
  4. Click “OK”, then make a note of the “practice-id” – it’s also in the url. Press “Start” to enter this cyber-dojo instance.
  5. Edit the code files with your starting position, and update the instructions with the details of the kata exercise. Basically get the cyber-dojo instance set up to the position you want people in your Coding Dojo to start from. Run the tests as often as you like until you have everyting as you want it.
  6. Click on the “fork” icon on the left hand side to create a new cyber-dojo instance starting from this position, and note the new practice-id. You can give this id to your Coding Dojo participants.
  7. You can also publish a url that will automatically create a new cyber-dojo instance from this position, so people can create their own cyber-dojo instances. The form of the url is:
http://cyber-dojo.com/forker/fork/(practice-id)?avatar=(your animal)&tag=(number of the traffic light to fork)

I’ve used this feature to set up a number of Refactoring katas in cyber-dojo, for example the Tennis kata:

Perhaps sometime Jon will add a page on cyber-dojo.com that lists these kinds of additional available starting posistions, (hint!), but for now, you’ll have to keep track of them yourself.

By the way, do let me know if you try out this Tennis Refactoring Kata in Cyber-Dojo and how you get on with it. I welcome comments on this blog or on my github repo.

On Saturday I was up in Stockholm facilitating my fourth code retreat for Valtech, and my second Global Day of Code Retreat. It seemed to go very well. I tried out a few new elements, which seemed to make it go even better than the previous ones, so I thought I’d talk about them here in my blog.

The first thing we changed was that we had quotas for men and women, and we were aiming for a 50/50 gender balance. About a month before the code retreat, we were fully booked, with 20 places each for men and women taken. In the end several people dropped out at the last minute, and there were slightly more men. Still, it was a much better balance than last year. I think it made for a healthier atmosphere during the day, and we encouraged more women to be active in the community generally.

The second thing we changed was I didn’t require people to do Game of Life, I suggested some other code katas as alternatives. Game of Life is an excellent kata for practicing skills like writing clean code, simple design, and TDD generally. It’s also really fun to be doing an excercise that thousands of other programmers are also working on. However, I realized that around 1/3 of the people who’d signed up for the event had already been to a previous code retreat, and might be getting bored with Game of Life. We’re there to have fun, after all!

As I’ve been working on my book, and running various coding dojos recently, I’ve been thinking hard about TDD and how to learn it. It’s a multifaceted skill, and some katas allow you to focus on particular aspects of it, which can make practice more rewarding. For a complete beginner to TDD, Game of Life is actually quite hard to get started with, I’ve seen a lot of people struggling with it.  So in my introduction I highlighted four other katas people could choose from, as well as Game of Life:

  • String Calculator – good for the TDD newbie, since it really leads you by the hand
  • Tennis – good for practicing refactoring
  • GildedRose – good for practicing writing really good tests (and refactoring)
  • TyrePressure – good for understanding SOLID principles

People seemed to generally appreciate having a choice. Some pairs chose a String Calculator for their first couple of sessions, to get used to TDD, then went on to Game of Life for the others. Some pairs tried out Tennis, Gilded Rose and Tyre Pressure in the afternoon, instead of one of the other challenges. Some pairs who were already good at TDD tried out String Calculator in Clojure, and found it let them concentrate on the learning the language not solving the problem. One person, (who has been to 3 code retreats previously), didn’t do Game of Life all day, and said he really enjoyed himself and learnt loads!

The third thing we changed was that I suggested people try cyber-dojo as a coding environment. This is a tool designed by Jon Jagger as an aid to practicing your coding skills. It provides a basic editor/testing environment, and records the state of the code every time you run the tests. You can review all your changes in the session retrospective, and get a picture of how well your TDD was going.

Before the code retreat, I had set up cyber dojos for Game of Life, and Tennis. I put up the cyber-dojo ids on the whiteboard, and through the day most of the pairs tried it out for one or more sessions.

As a facilitator, I found it really helpful when going up to a pair, to be able to see the little row of traffic light symbols at the top of their screen. I could quickly get an overview of how their session was going. For example, if I see a few red-green-yellow sequences then I can infer they are probably doing ok at TDD. If I can see nothing bug a long string of yellow, then I know they’re in trouble!

For the pairs, some liked cyber-dojo because it meant they didn’t waste any time setting up their coding environment at the start of the session. For those doing Tennis, if they made a refactoring mistake, they could very easily just click on the most recent green traffic light to revert back to a known good state, and practice doing the refactoring again. Some pairs found it helpful in the retrospective to review their session in the tool.

Other pairs didn’t get on so well with cyber-dojo. Some couldn’t live without their usual editor commands. Some got confused by the lack of syntax highlighting. A couple found a bug in cyber-dojo that it lost touch with the server and stopped updating their test results, whatever they did to the code. (The workaround is to open the same url in a new window, btw).

I know Jon is aware of this bug, and I’m sure he’ll track it down, but actually the lack of syntax highlighting and editor commands is deliberate. It’s the same idea as the other challenges, like mute pairing etc – to throw you out of your comfort zone and make you really concentrate on the way you code.

So that was the three things we changed – gender balance, other katas, cyber-dojo. What we didn’t change was the overall format or aims of the day. We still paired for 5 sessions of 45 minutes, and threw away the code afterwards. We still had coding challenges like “Mute pairing with find the loophole” and “Maximum 4 line methods”, and held retrospectives after every session. I think deliberate practice in a group is at the core of what makes Code Retreat (and coding dojos!) fun and valuable. The changes we introduced were all designed to enhance the learning experience, and were based on experience and reflection after previous events.

Overall I was very pleased with all the changes we introduced, and I’d like to do it like that again. If you’re facilitating or organizing a code retreat, perhaps you’d like to introduce some of these elements too.

This Code Kata is included in my new book “The Coding Dojo Handbook”, currently published as a work-in-progress on LeanPub.com. You can also download starting code and these instructions from my github page.

As a Health Insurer,
I want to be able to search for patients who have a medicine clash,
So that I can alert their doctors and get their prescriptions changed.

Health Insurance companies don’t always get such good press, but in this case, they actually do have your best interests at heart. Some medicines interact in unfortunate ways when they get into your body at the same time, and your doctor isn’t always alert enough to spot the clash when writing your prescriptions. Sometimes, medicine interactions are only identified years after the medicines become widely used, and your doctor might not be completely up to date. Your Health Insurer certainly wants you to stay healthy, so discovering a customers has a medicine clash and getting it corrected is good for business, and good for you!

For this Kata, you have a recently discovered medicine clash, and you want to look through a database of patient medicine and prescription records, to find if any need to be alerted to the problem. Create a “Patient” class, with a method “Clash” that takes as arguments a list of medicines, and how many days before today to consider, (defaults to the last 90 days). It should return a collection of days on which all the medicines were being taken during this time.

If you like, you can also create a visualization of the clash, something like this:

medicine_clash

Data Format

You can assume the data is in a database, which is accessed in the code via an object oriented domain model. The domain model is large and complex, but for this problem you can ignore all but the following entities and attributes:

TDDStatesMoves_003

In words, this shows that each Patient has a list of Medicines. Each Medicine has a list of Prescriptions. Each Prescription has a dispense date and a number of days supply.

You can assume:

  • Patients start taking the medicine on the dispense date.
  • The “days supply” tells you how many days they continue to take the medicine after the dispense date.
  • If they have two overlapping prescriptions for the same medicine, they stop taking the earlier one. Imagine they have mislaid the medicine they got from the first prescription when they start on the second prescription.

When you’ve tried the Kata for yourself

Then you might be interested in reviewing the sample solution I’ve put up on my github page. I find this code interesting because it is seemingly well written. The methods are short with thought-through names, and there are lots of unit tests. I also find it very difficult to follow. What do you think?

The biology of medicine clashes*

When you take a pill of medicine, the active substance will be absorbed through the lining of the gut, and enter your bloodstream. That means it will be taken all over your body, and can do its work. For example, if you take a headache pill, the active substance in the drug will be taken by your blood to where it can block your pain receptors. At the same time, there are enzymes at work in your liver, which break down medicinal substances they find in your bloodstream. Eventually all the medicine will be removed, so you have to take another pill if you want the effects to continue.

In the liver, there are several different enzymes working, and they are specialized in breaking down different substances. For example, the “CYP 2C9” enzyme will break down ibuprofen, the active ingredient in many headache pills. The trouble is, there are other medicines which will stop particular enzymes from doing their work, which can lead to an overdose or other ill effects.

One example is the clash between fluoxetine and codeine. Fluoxetine is known by its trade name “Prozac”, and is often taken for depression. Codeine is another ingredient used in headache pills, and is actually a “pro-drug”, so it works slightly differently. Codeine needs to be broken down in the liver by the enzyme “CYP 2D6” into the active substance, morphine, before it will do anything. Fluoxetine has the effect of blocking “CYP 2D6”, so if you take the two medicines together, you won’t get much painkilling effect from the codeine. That could be depressing!

The solution to the problem is to take a different painkiller – one that’s not affected by that liver enzyme. Simply switch codeine for ibuprofen, and you should be be a little happier.

* With thanks to Sara Sjöberg for helping me with this section