Posts tagged ‘agile’

Programmers have a vested interest in making sure the software they create does what they think it does. When I’m coding I prefer to work in the context of feedback from automated tests that help me to keep track of what works and how far I’ve got. I’ve written before about Test Driven Development (TDD). In this article I’d like to explain some of the main features of Text-Based Testing. It’s a variant on TDD, perhaps more suited to the functional level than unit tests, and which I’ve found powerful and productive to use.


The basic idea
You get your program to produce a plain text file that documents all the important things that it does. A log, if you will. You run the program and store this text as a “golden copy” of the output. You create from this a Text-Based Test with a descriptive name, any inputs you gave to the program, and the golden copy of the textual output.

You make some changes to your program, and you run it again, gathering the new text produced. You compare the text with the golden copy, and if they are identical, the test passes. If there is a difference, the test fails. If you look at the diff, and you like the new text better than the old text, you update your golden copy, and the test is passing once again.
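The compare-and-update cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the idea of keeping the golden copy in a single file are made up for the example, and it is not how any particular tool implements the idea.

```python
import difflib
from pathlib import Path


def diff_against_golden(golden_text: str, actual_text: str) -> list[str]:
    """Return a unified diff; an empty list means the test passes."""
    return list(
        difflib.unified_diff(
            golden_text.splitlines(),
            actual_text.splitlines(),
            fromfile="golden",
            tofile="actual",
            lineterm="",
        )
    )


def run_text_test(actual_text: str, golden_path: Path) -> bool:
    """Compare new output with the stored golden copy and print any diff."""
    diff = diff_against_golden(golden_path.read_text(encoding="utf-8"), actual_text)
    for line in diff:
        print(line)
    return not diff


def accept_new_output(actual_text: str, golden_path: Path) -> None:
    """Overwrite the golden copy after a human has reviewed and liked the diff."""
    golden_path.write_text(actual_text, encoding="utf-8")
```

The important design point is that accepting new output is a separate, deliberate step: a failing test never updates its own golden copy.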

Tool Support
Text-Based Testing is a simple idea, and in fact many people do it already in their unit tests. AssertEquals(String expected, String actual) is actually a form of it. You often create the “expected” string based on the actual output of the program, (although purists will write the whole assert before they execute the test).

Most unit test tools these days give you a nice diff even on multi-line strings. For example:

[Images: two screenshots of a multi-line string diff]

Which is a failing text-based test using JUnit.

Once your strings get very long, to the scale of whole log files, even multi-line diffs aren’t really enough. You get datestamps, process ids and other stuff that changes every run, hashmaps with indeterminate order, etc. It gets tedious to deal with all this on a test-by-test basis.
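One common way to cope with text that changes on every run is to normalise it before comparing. The sketch below is a minimal illustration, assuming a log that contains ISO-style timestamps and `pid=` fields; the patterns and function names are hypothetical and would need adapting to a real log.

```python
import re

# Hypothetical patterns for run-dependent text; a real log needs its own set.
VOLATILE_PATTERNS = [
    (re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"), "<timestamp>"),
    (re.compile(r"\bpid[= ]\d+"), "pid=<pid>"),
]


def normalise(log_text: str) -> str:
    """Replace run-dependent fragments with fixed placeholders."""
    for pattern, placeholder in VOLATILE_PATTERNS:
        log_text = pattern.sub(placeholder, log_text)
    return log_text


def format_mapping(mapping: dict) -> str:
    """Log a mapping with sorted keys so the text does not depend on ordering."""
    return ", ".join(f"{key}={value}" for key, value in sorted(mapping.items()))
```

Applying the same normalisation to both the golden copy and the new output means two runs that differ only in time and process id compare as identical.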

My husband, Geoff Bache, has created a tool called “TextTest” to support Text-Based testing. Amongst other things, it helps you organize and run your text-based tests, and filter the text before you compare it. It’s free, open source, and of course used to test itself. (Eats own dog food!) TextTest is used extensively within Jeppesen Systems, (Geoff works for them, and they support development), and I’ve used it too on various projects in other organizations.

In the rest of this article I’ll look at some of the main implications of using a Text-Based Testing approach, and some of my experiences.

Little code per test
The biggest advantage of the approach is that you tend to write very little unique code for each test. You generally access the application through a public interface as a user would, often a command line interface or (web)service call. You then create many tests by, for example, varying the command line options or request contents. This reduces test maintenance work, since you have less test code to worry about, and the public API of your program should change relatively infrequently.

Legacy code
Text-Based Testing is obviously a regression testing technique. You’re checking the code still does what it did before, by checking the log is the same. So these tests are perfect for refactoring. As you move around the code, the log statements move too, and your tests stay green, (so long as you don’t make any mistakes!) In most systems, it’s cheap and risk-free to add log statements, no matter how horribly gnarly the design is. So text-based testing is an easy way to get some initial tests in place to lean on while refactoring. I’ve used it this way fairly successfully to get legacy code under control, particularly if the code already produces a meaningful log or textual output.


No help with your design
I just told you how good Text-Based Testing is with Legacy code. But actually these tests give you very little help with the internal design of your program. With normal TDD, the activity of creating unit tests at least forces you to decompose your design into units, and if you do it well, you’ll find these tests giving you all sorts of feedback about your design. Text-Based tests don’t. Log statements don’t care if they’re in the middle of a long horrible method or if they’re spread around several smaller ones. So you have to get feedback on your design some other way.

I usually work with TDD at the unit level in combination with Text-Based tests at the functional level. I think it gives me the best of both worlds.

Log statements and readability
Some people complain that log statements reduce the readability of their code and don’t like to add any at all. They seem to be out of fashion, just like comments. The idea is that all the important ideas should be expressed in the class and method names, and logs and comments just clutter up the important stuff. I agree to an extent, you can definitely over-use logs and comments. I think a few well placed ones can make all the difference though. For Text-Based Testing purposes, you don’t want a log that is megabytes and megabytes of junk, listing every time you enter and leave every method, and the values of every variable. That’s going to seriously hinder your refactoring, apart from being a nightmare to store and update.

What we’re talking about here is targeted log statements at the points when something important happens, that we want to make sure should continue happening. You can think about it like the asserts you make in unit tests. You don’t assert everything, just what’s important. In my experience less than two percent of the lines of code end up being log statements, and if anything, they increase readability.
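As a rough illustration of what “targeted” means, here is a small Python sketch. The business function, its order data and the logger setup are all invented for the example; the point is only that the log records decisions, not every step.

```python
import io
import logging


def make_logger(stream) -> logging.Logger:
    """Build a logger that writes bare messages (no timestamps) to a stream."""
    logger = logging.getLogger("app")
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def assign_orders(orders, capacity, logger):
    """Hypothetical business function: accept orders until capacity runs out."""
    accepted, used = [], 0
    for name, size in orders:
        if used + size <= capacity:
            accepted.append(name)
            used += size
        else:
            # Log the decision that matters, not every loop iteration.
            logger.info("Rejected order %s: needs %d, only %d left",
                        name, size, capacity - used)
    logger.info("Accepted %d orders, using %d of %d capacity",
                len(accepted), used, capacity)
    return accepted
```

Two log statements in roughly twenty lines is already more than the two percent mentioned above, but it shows the flavour: each line in the log corresponds to something a test should notice if it stopped happening.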

Text-Based tests are completed after the code
In normal TDD you write the test first, and thereby set up a mini pull system for the functionality you need. It’s lean, it forces you to focus on the problem you’re trying to solve before you solve it, and starts giving you feedback before you commit to an implementation. With Text-Based Testing, you often find it’s too much work to specify the log up front. It’s much easier to wait until you’ve implemented the feature, run the test, and save the log afterwards.

So your tests usually aren’t completed until after the code they test, unlike in normal TDD. Having said that, I would argue that you can still do a form of TDD with Text-Based Tests. I’d normally create half the test before the code. I name the test, and find suitable inputs that should provoke the behaviour I need to implement in the system. The test will fail the first time I run it. In this way I think I get many of the benefits of TDD, but only actually pin down the exact assertion once the functionality is working.

“Expert Reads Output” Antipattern
If you’re relying on a diff in the logs to tell you when your program is broken, you had better have good logs! But who decides what to log? Who checks the “golden copy”? Usually it is the person creating the test, who should look through the log and check everything is in order the first time. Of course, after a test is created, every time it fails you have to make a decision whether to update the golden copy of the log. You might make a mistake. There’s a well known antipattern called “Expert Reads Output” which basically says that you shouldn’t rely on having someone check the results of your tests by eye.

This is actually a problem with any automated testing approach – someone has to make a judgement about what to do when a test fails – whether the test is wrong or there’s a bug in the application. With Text-Based Testing you might have a larger quantity of text to read through compared with other approaches, or maybe not. If you have human-readable, concise, targeted log statements and good tools for working with them, it goes a long way. You need a good diff tool, version control, and some way of grouping similar changes. It’s also useful to have some sanity checks. For example, TextTest can easily search for regular expressions in the log and warn you if you try to save a golden copy containing a stack trace.
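The kind of sanity check mentioned above could look something like the following Python sketch. It is not TextTest’s own mechanism or configuration; the patterns and the function name are assumptions chosen for illustration.

```python
import re

# Hypothetical patterns that should never appear in an approved log.
SUSPICIOUS = [
    re.compile(r"^Traceback \(most recent call last\):", re.MULTILINE),  # Python
    re.compile(r"^\s+at [\w.$]+\(.*\)$", re.MULTILINE),  # Java-style stack frame
]


def warnings_before_saving(candidate_golden: str) -> list[str]:
    """Return the patterns found in text that is about to become a golden copy."""
    return [p.pattern for p in SUSPICIOUS if p.search(candidate_golden)]
```

A non-empty result would be a reason to stop and look more closely before approving the new text.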

In my experience, you do need to update the golden copy quite often. I think this is one of the key skills with a Text-Based Testing approach. You have to learn to write good logs, and to be disciplined about either doing refactoring or adding functionality, not both at the same time. If you’re refactoring and the logs change, you need to be able to quickly recognize if it’s ok, or if you made a mistake. Similarly, if you add new functionality and no logs change, that could be a problem.

Agile Tests Manage Behaviour
When you create a unit test, you end with an Assert statement. This is supposed to be some kind of universal truth that should always be valid, or else there is a big problem. Particularly for functional level tests, it can be hard to find these kinds of invariants. What is correct today might be updated next week when the market moves or the product owner changes their mind. With Text-Based Testing you have an opportunity to quickly and easily update the golden copy every time the test “fails”. This makes your tests much more about keeping control of what your app does over time, and less about rewriting assert statements.

Text-Based Testing grew up in the domain of optimizing logistics planning. In this domain there is no “correct” answer you can predict in advance and assert. Planning problems that are interesting to solve are far too complex for a complete mathematical analysis, and the code relies on heuristics and fancy algorithms to come up with better and better solutions. So Text-Based Testing makes it easy to spot when the test produces a different plan from before, and use it as the new baseline if it’s an improvement.

I think generally it leads to more “agile” tests. They can easily respond to changes in the business requirements.

Conclusions
There is undoubtedly a lot more to be said about Text-Based Testing. I haven’t mentioned text-based mocking, data-driven vs workflow testing, or how to handle databases and GUIs – all relevant topics. I hope this article has given you a flavour of how it’s different from ordinary TDD, though. I’ve found that good tool support is pretty essential to making Text-Based Testing work well, and that it’s a particularly good technique for handling legacy code, although not exclusively. I like the approach because it minimizes the amount of code per test, and makes it easy to keep the tests in sync with the current behaviour of the system.

 

I’m speaking next week at ScanDev on Tour in Stockholm on the subject of “Software Development Craftsmanship”, and as part of my research I read both “The Clean Coder” by Robert C. Martin and “Apprenticeship Patterns” by Dave Hoover & Adewale Oshineye. These are very different books, but both aimed at less experienced software developers who want to learn about what it means to be a professional in the field. In this article I’d like to review them side by side. First some text from each preface on what the authors think the books are about:

Apprenticeship Patterns

“This book should help you through the tough decisions you face as a newcomer to the field of professional software development.” (preface xi)

The Clean Coder

“This book is about software professionalism. It contains a lot of pragmatic advice” (preface xxii)

The Content
Both books contain a lot of personal stories and anecdotes from the authors’ careers, and begin with a short autobiography. Some of the advice is also similar. Both advise you to practice with “Kata” exercises, to read widely and to find suitable mentors. I think that’s mostly where the similarities end though.

Dave and Ade don’t say much about how to handle unreasonable managers imposing impossible deadlines. Bob Martin devotes several chapters to this kind of issue: handling pressure, time management, estimation, making commitments etc.

Dave and Ade talk more about how to get yourself into situations optimized for learning and progress in your career. They advise you to “Be the worst”, “Find mentors”, seek “Kindred Spirits”. In other words, join a team where you’re the least skilled but you’ll be taught, look for mentors in many places, and get involved in the community.

Bob talks about a lot of specific practices and has detailed advice. He mentions “… pairing is the most efficient way to solve a problem” (p164) Later in the chapter he suggests the optimal composition of job roles in a gelled team. (p169) He also has some advice about how to successfully argue with your boss and go over their head when necessary (p35).

The Advice
Those few examples perhaps illustrate that these two books are miles apart when it comes to writing style, approach and world view. Dave&Ade have clearly spent a lot of time talking with other professionals about their material, acting on feedback and testing their ideas for validity in real situations. The book is highly collaborative and while full of advice, is not prescriptive.

Bob Martin on the other hand loves to be specific, provocative and extreme in his advice. “QA should find nothing.” (p114) “You should plan on working 60 hours per week.” (p16) “Avoid the Zone.” (p62) “The jury is in! … TDD works” (p79) These are some of his more surprising pieces of advice, which I think are actually fairly doubtful propositions when taken to extremes like this. Mixed in are more reasonable statements. “You do not have to attend every meeting to which you are invited” (p123) “The professional developer is calm and decisive under pressure”. (p150)

The way everything is presented as black-and-white, do-or-do-not-there-is-no-try is actually pretty wearing after a while. He does it to try to make you think, as a rhetorical device, to promote healthy discussion. I think it all too easily leads the reader to throw the baby out with the bathwater. I can’t accept one of his recommendations, so I throw them all out.

Some of Dave&Ade’s advice is actually just as hard to put into practice. Each of their patterns is followed by a call to action. Things like re-implementing a program you’ve written in an imperative language in a functional language (p21). Join or start a user group (p65). Solve the same coding exercise once a week for the next four weeks (p79). None of these things is particularly easy to do, but they seem to me to be interesting and useful challenges.


Collaboration
Bob has also clearly not collaborated very widely when preparing his material. One part that particularly sticks out for me is a footnote on page 75:

“I had a wonderful conversation with @desi (Desi McAdam, founder of DevChix) about what motivates women programmers. I told her that when I got a program working, it was like slaying the great beast. She told me that for her and other women she had spoken to, the act of writing code was an act of nurturing creation.” (footnote, p75)

Has he ever actually run his “programming is slaying a great beast” thing past any other male programmers? Let me qualify that – non-fantasy-role-playing male programmers? Thought not. This is in enormous contrast to Dave&Ade, whose book is full of stories from other people backing up their claims.

Stories
Bob’s book is full of stories from his own career, and he is very honest and open about his failures. This is a very brave thing to do, and I have a great deal of respect for him for doing so. It’s also really interesting to hear about the history of what life was like when computers filled a room and people used punch cards to program them. Dave&Ade’s stories are less compelling and not always as well written.

Bob’s book is not just about his professional life; he also shares his likes and dislikes. He recommends cycling or walking to recharge your energy, or “focus-manna” as he calls it (p127). Reading science fiction as a cure for writer’s block. (p66) Listening to “The Wall” while coding could be bad for your design. (p63) When describing “Master” programmers he likens them to Scotty from Star Trek. (p182)

All this is very cute and gives you a more rounded picture of what software professionalism is about. Maybe. Actually it really puts me off the idea. I know a lot of software developers like science fiction and fantasy role playing, but it really isn’t mandatory. He usually says that you may have other preferences, and you don’t have to do like he does, but I just don’t think it helps all that much. The rest of the book is highly dogmatic about what you should and shouldn’t do, and it kind of rubs off.

Conclusions
The bottom line is, I wouldn’t recommend “The Clean Coder” to any young inexperienced software developer, particularly not if she were a woman. There is too much of it written from a foreign culture, in a demanding tone, propounding overly extreme behaviour. The interesting stories and good pieces of advice are drowned out.

On the other hand, I would recommend “Apprenticeship Patterns”. I think it is humbly written and anchored in real experience from a range of people. I agree with them when they say you need to read it twice to understand it. The first time to get an overview, the second time to understand how the patterns connect. It’s not as easy to read as it might be. But still, I think the content is interesting, and it gives a good introduction to what being a professional software craftsman is about, and how to get there.

This is the second time I’ve attended Nordic Ruby; you can read about what I thought last year here. This year I enjoyed the conference more, for several reasons. There were some small changes in the way it was organized (on a Friday and Saturday instead of taking up a whole weekend), a better choice of speakers and topics (less technical, more inspirational), and I knew more of the people there.

One of the themes of the conference was diversity, which I was very, very happy to see. There was an inspiring talk by Joshua Wehner about this topic, taking up some depressing statistics about the IT industry in general and open source software in particular. What struck me most was that he said the statistics for women’s involvement are improving in many formerly male-dominated disciplines, like maths, physics and law, but in computing, the situation was actually better 20 years ago than it is now. The curves are pointing the wrong way in our industry.

Having said that, there were slightly more women at the conference this year than last: I think I counted 4 of 150, compared with 2 of 90 last year. There were also far fewer references to science fiction movies from the speakers this year 😉

Joshua did take up several things that we could do practically to reduce bias and positively encourage diversity. He’s written about some of them in this blog post. Another one he mentioned that I liked was the “no asshole rule”. If people engage in arrogant one-upmanship, talk down to others, and emphasize their superior programming abilities, they should be regarded as not just annoying, but actually incompetent. Developing software is a multi-faceted skill, and it takes a lot more than just writing good code to be a good software developer.

Joe O’Brien continued the diversity theme in his talk “Taking back education” by basically arguing that having a degree in computer science correlates very badly with being a good software developer, and that we should be finding ways to bring people into our industry who have non-traditional backgrounds. He advocated that companies start apprenticeship programmes, while conceding that this model of education doesn’t scale very well. He talked about getting a group of companies together to set up a “code school”. He said “forget universities when it comes to education [of software developers]. We’re better at it.”

I applaud his efforts to bring a more diverse range of people into the industry, and I think my recent experiences teaching a group like this are relevant. I think I’ll write a separate blog post about that experience, but basically I think the idea of a “code school” is a good one, and similar institutions probably already exist, and could add a course in software development to their programme of courses in practical skills. For this to happen it’s up to companies to put in time and energy setting them up, rather than just complaining that when they put out a job advert, all they get are white male applicants between the ages of 25-35, so it’s not their fault.

Another talk that deserves a mention is the one by Joseph Wilk. He spoke about “The Limited Red Society” which is an idea that Joshua Kerievsky came up with. I heard Joshua speak about it at XP2009, and I thought Joseph did a very good job of explaining what it is, and why it’s important.

Basically the idea is that although you need your tests to go red during TDD, if they stay red for any length of time, it can get you into trouble. While they are red, you can’t check in, ship your code, or change to working on a different task. This is one motivation for trying to measure, and limit, how much of the time your tests are red. It’s also about more generally improving the feedback we get for ourselves while we work. Professional sports stars spend time analysing and visualizing their performances (where balls land on a tennis court, footballers’ rates of passing etc). We programmers could benefit from that kind of thing too.

Joseph has invented a tool that helps him to track his state when doing TDD. It’s a simple monitoring program that makes a note every time he runs his tests. It’s not as elaborate as the commercial tool offered by Joshua Kerievsky’s company, but it does work with Ruby and Cucumber. Joseph also has his tool connected to his CI server so that it runs tests that have failed recently in his and others’ checkouts first in the CI test run. He also gathers statistics about individual tests, how often they fail, and whether they are fixed without the production code needing to be changed – a way of spotting fragile tests.
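To make the idea of measuring red time concrete, here is a minimal Python sketch. It is not Joseph’s tool or the commercial one; the record format (a timestamp plus a pass/fail flag for each test run) and all names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    at: float      # seconds since the start of the coding session
    passed: bool   # True for a green run, False for a red one


def red_seconds(runs: list[RunRecord]) -> float:
    """Total time between each failing run and the next run of any colour."""
    total = 0.0
    for current, following in zip(runs, runs[1:]):
        if not current.passed:
            total += following.at - current.at
    return total


def red_fraction(runs: list[RunRecord]) -> float:
    """Share of the session, from first run to last, spent with red tests."""
    if len(runs) < 2:
        return 0.0
    session_length = runs[-1].at - runs[0].at
    return red_seconds(runs) / session_length if session_length > 0 else 0.0
```

For instance, runs at 0s (green), 60s (red), 90s (red), 150s (green) and 200s (green) give 90 red seconds out of a 200 second session.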

I think this kind of statistics gathering is really interesting and I think Joseph will just have more insights to share as he gathers more data and does more analysis. I’ve been experimenting with the tool provided by codersdojo.org for measuring my performance at code katas, but Joseph seems to be taking this all to the next level.

Overall I thoroughly enjoyed Nordic Ruby. (I still think it would be improved by some actual open space sessions though). I talked to loads of really interesting people, enjoyed good food and drink in comfortable surroundings, and listened to some people give excellent talks. Thanks for organizing a great conference, Elabs.

I’ve just been appointed to the role of Industry Programme Chair for XP2011, which will be held in Madrid in May. I’ve been to 7 of the previous 11 XP conferences and I am so pleased to be asked to contribute to the success of the conference this year by doing this role.

Rachel Davies is the general chair, and I am really looking forward to working with her and the other organizers. Rachel is one of those people I have met repeatedly at conferences and always has something interesting to contribute. More recently, I read her excellent book on Agile Coaching. I can’t remember exactly when I first met her, but I do remember meeting her former colleagues from Connextra, Tim MacKinnon and Steve Freeman. I can still picture them in the small minibus that picked us up from a tiny Italian airport in 2002. It was a hot summer’s day, and we were driven at high speed along small Sardinian roads to the lovely hotel Calabona by the sea and the historic walled city of Alghero. I remember being so impressed to meet some people who were actually doing eXtreme Programming for real.

There were so many inspirational people at that conference, it was really a turning point in my career. I just found the old conference programme online here, and it brings back so many memories!

I remember sitting by the pool discussing subjects like how to test drive refactoring with Frank Westphal and Steve Freeman. There was a fiery keynote from Ken Schwaber encouraging us to start a revolution in the software development world. I remember Joshua Kerievsky asking Jutta Eckstein to explain all about how she was doing XP with a team of over 100 people. Following David Parnas’ keynote about using a formal test specification language to define requirements I remember Martin Fowler opining about its usefulness or lack of it, (do read his blog post about it).

The colourful personality of Scott Ambler demonstrated his ability to break a plank of wood in two with his bare hands, as some kind of lesson to do with dedication and focus. The conference dinner at Poco Loco really was a little crazy, with a bunch of uncoordinated geeks going for it on the dancefloor while the local band played very loudly. The morning after everyone was rather subdued when listening to Enrico Zaninotto reading his keynote in halting English, relating XP to the history of manufacturing and modern lean ideals. Half the audience was having trouble staying awake which in no way reflected the quality of what he was saying. It was truly inspiring, and Mary and Tom Poppendieck in particular were listening in rapt attention.

Michael Feathers wore a T-shirt saying “Save the LSF”, and Geoff and I asked him why he was so interested in Platform Computing’s Load Sharing Facility. It turned out Alan Francis had recently become unemployed and Mike was helping in the campaign to “Save the Lightly Scottish Fellow”!

Laurent Bossavit was going round trying to attract people to his Birds Of a Feather session on the writings of Gerald Weinberg. Erik Lundh was talking about his team in Sweden who had done a complete XP iteration in 2 days when faced with an unexpected deadline. Steven Fraser seemed to be videoing everything and anything, including someone demonstrating the correct way to twirl Italian Spaghetti on a fork. Mike Hill was (as ever) being loud but friendly. Charlie Poole seemed to be full of insightful analogies and comments. Dave Hussman was really friendly too.

It was just fantastic the way the XP community welcomed us in, and particularly Kent Beck’s attitude was instrumental in that. My husband Geoff and I presented a poster at the conference with the title “One suite of automated tests” based mainly on Geoff’s experiences with the tool that was to become TextTest. We turned up on the first day for a workshop about “testing in XP”, and Geoff was immediately controversial by saying that he didn’t do any unit testing, only this weird text-based testing thing using log comparison. He said he found it so successful that he used it instead of both the XP practices of functional and unit testing. I remember several people being quite dismissive of his ideas.

Later in the conference, Kent Beck made a particular effort to talk to us and we took our picture standing by our poster. Apparently he had been asking people to try to be inclusive and friendly to us after the somewhat negative reaction to our ideas. I think he wanted the newly-forming XP community to be welcoming and to embrace diversity of opinion.

So now I should turn this around and look instead to the future. I’d love it if even half the people I’ve mentioned in this post found time in their diaries to come to Madrid in May for XP2011. I wouldn’t want them to come alone though, there are so many fantastic and inspirational people who have joined the ever-expanding agile community since 2002.

I am every bit as keen now as Kent was then to see that the agile community embraces newcomers, and that the XP conference should provide a space where researchers and practitioners can freely discuss the state of the art. I hope we’ll make new friends and business contacts, learn loads, and have fun. Would you like to join us there?

I recently read this post in Brian Marick’s blog, and it set me thinking. He’s talking about a test whose intention in some way survived three major GUI revisions. The test code had to be rewritten each time, but the essence of it was retained. He says:

I changed the UI of an app and so I had to rewrite the UI code and the tests/checks of the UI code. It might seem the checks were worthless during the rewrite (and in 1997, I would have thought so). But it turns out that all (or almost all) of those checks contained an idea that needed to be preserved in the new interface. They were reminders of important things to think about, and that (it seemed to me) repaid the cost of rewriting them.

That was a big surprise to me.

I’m not sure why Brian is so surprised about this. If the user intentions and business rules are the same, then some aspects of the tests should also be preserved. A change in UI layout or technology should mean superficial changes only. In fact, one of the main claims for PyUseCase is that by having the tests written in a domain language decoupled from the specifics of the UI, it enables you to write tests that survive major UI changes. In practice this means when you rewrite the UI, you are saved the trouble of also rewriting the tests. So Geoff and I decided to write some code and see if this was true for the example Brian outlines.

In the blog post, there is only one small screenshot and some vague descriptions of the GUIs these tests are for, so we did some interpolation. I hope we have written an application that gets to the gist of the problem, although it is undoubtedly less beautiful and sophisticated than the one Brian was working on. All the code and tests are on Launchpad here.

We started by writing an application which I hope is like his current GUI. You select animals in a list, click “book” and they appear in a new list below. You select procedures from another list, and unsuitable animals disappear.

In my app, I had to make up some procedures, in this case “milking”, which is unsuitable for Guicho (no udders on a gelding!), and “abdominocentesis” which is suitable for all animals, (no idea what that is, but it was in Brian’s example :-). Brian describes a test where an animal that is booked should not stay booked if you choose a procedure that is unsuitable for it, then change your mind and instead choose a procedure that it is suitable for.

select animals Guicho
book selected animals
choose procedure milking
choose procedure abdominocentesis
quit

This is a list of the actions the user must take in the GUI. So Guicho should disappear when you select “milking”, and reappear as available, but not as booked, when you select “abdominocentesis”. This information is not in the use case file, since it only documents user actions.

The other part of the test is the UI log, which documents what the application actually does in response to the user actions. This log is auto generated by PyUseCase. For this test, I won’t repeat the whole file, (you can view it here), but I will go through the important parts:

‘select animals’ event created with arguments ‘Guicho’

‘book selected animals’ event created with arguments ‘’

Updated : booked animals with columns: booked animals ,
-> Guicho | gelding

This part of the log shows that Guicho is listed as booked.

‘choose procedure’ event created with arguments ‘milking’

Updated : available animals with columns: available animals , animal type
-> Good Morning Sunshine | mare
-> Goat 3 | goat
-> Goat 4 | goat
-> Misty | mare

Updated : booked animals with columns: booked animals ,

So you see that after we select “milking” the lists of available and booked animals are updated, Guicho disappears, and the “booked animals” section is now blank. The log goes on to show what happens when we select “abdominocentesis”:

‘choose procedure’ event created with arguments ‘abdominocentesis’

Updated : available animals with columns: available animals , animal type
-> Good Morning Sunshine | mare
-> Goat 3 | goat
-> Goat 4 | goat
-> Guicho | gelding
-> Misty | mare

‘quit’ event created with arguments ‘’

That is, the “available animals” list is updated and Guicho reappears, but the booked animals list is not updated. This tells us the application behaves as desired: booked animals that are not suitable for a procedure do not reappear as booked if another procedure is selected.
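To make the rule concrete, here is a minimal sketch in Python of a model that would produce this behaviour. It is not the code from my app: the class name BookingModel, its methods and the “unsuitable” table are all made up for illustration, and I am assuming that geldings are the only animals excluded from milking.

```python
class BookingModel:
    """Hypothetical model, not the real app: tracks available and booked animals."""

    def __init__(self, animals, unsuitable):
        self.animals = dict(animals)   # name -> animal type, kept in display order
        self.unsuitable = unsuitable   # procedure -> set of unsuitable animal types
        self.procedure = None
        self.booked = set()

    def available(self):
        """Animals that are suitable for the currently chosen procedure."""
        banned = self.unsuitable.get(self.procedure, set())
        return [name for name, kind in self.animals.items() if kind not in banned]

    def book(self, name):
        """Book an animal, but only if it is currently on the available list."""
        if name in self.available():
            self.booked.add(name)

    def choose_procedure(self, procedure):
        self.procedure = procedure
        # Drop bookings for animals that just disappeared; they are never restored.
        self.booked &= set(self.available())


model = BookingModel(
    animals=[("Good Morning Sunshine", "mare"), ("Goat 3", "goat"),
             ("Goat 4", "goat"), ("Guicho", "gelding"), ("Misty", "mare")],
    unsuitable={"milking": {"gelding"}},  # assumption: only geldings are excluded
)
model.book("Guicho")
model.choose_procedure("milking")           # Guicho disappears and loses his booking
model.choose_procedure("abdominocentesis")  # Guicho is available again
print(model.available())
print(sorted(model.booked))
```

After the last step Guicho is back on the available list and the set of booked animals is empty, which is the same story the UI log tells.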

OK, so far so good. What happens to the test when we completely re-jig the UI and it instead looks like this?

Now there is no book button, and you book animals by ticking a checkbox. Selecting a procedure will remove unsuitable animals from the list in the same way as before. So now if you change your mind about the procedure, animals that reappear on the list should not be marked as booked, even if they were before they disappeared. There is no separate list of booked animals.

What we did was take a copy of the tests and the code, update the code, and see what we needed to do to the tests to make them work again. In the end it was reasonably straightforward. We didn’t re-record or rewrite any tests. We just had to modify the use cases to remove the reference to the book button, and save new versions of the UI log to reflect the new UI layout. The use case part of the test looks like this now:

book animal Guicho
choose procedure milking
choose procedure abdominocentesis
quit

which is one line shorter than before, since we no longer have separate user actions for selecting and booking an animal.

So updating the tests to work with the changed UI consisted of:

  1. Remove the reference to the “book” button in the UI map file, since the button no longer exists.
  2. In the use case files for all tests, replace “select animals x, y” with a line for each animal: “book animal x” and “book animal y”.
  3. Run the tests. All fail in an identical manner. Check the changes in the UI log file using a graphical diff tool, once. (There is no need to look at every test, since TextTest groups together tests that fail identically.)
  4. Save the updated use cases and UI logs. (The spurious line “book selected animals” is removed from the use case files, since the button no longer exists.)
  5. Run the tests again. All pass.
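With only a handful of tests you can do the use case edits by hand, but the rewrite is mechanical enough to script. The following is a rough sketch in Python, not anything that ships with PyUseCase. The function name migrate_use_case is made up, and it assumes that several animals on one “select animals” line are separated by commas, as in “select animals x, y” above. It also drops the “book selected animals” line straight away, rather than waiting for it to be removed when the tests are saved.

```python
def migrate_use_case(lines):
    """Rewrite old-UI use case lines so they fit the checkbox UI (hypothetical helper)."""
    prefix = "select animals "
    migrated = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith(prefix):
            # One "book animal" line per animal that used to be selected.
            for name in stripped[len(prefix):].split(","):
                if name.strip():
                    migrated.append(f"book animal {name.strip()}")
        elif stripped == "book selected animals":
            # The book button no longer exists, so this action is dropped.
            continue
        else:
            migrated.append(stripped)
    return migrated


old_use_case = [
    "select animals Guicho",
    "book selected animals",
    "choose procedure milking",
    "choose procedure abdominocentesis",
    "quit",
]
print("\n".join(migrate_use_case(old_use_case)))
```

Run on the old use case for this test, it prints the four-line use case shown above.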

The new UI log file looks like this:

‘book animal’ event created with arguments ‘Guicho’

Updated : available animals with columns: is booked , available animals , animal type
-> Check box | Good Morning Sunshine | mare
-> Check box | Goat 3 | goat
-> Check box | Goat 4 | goat
-> Check box (checked) | Guicho | gelding
-> Check box | Misty | mare

‘choose procedure’ event created with arguments ‘milking’

Updated : available animals with columns: is booked , available animals , animal type
-> Check box | Good Morning Sunshine | mare
-> Check box | Goat 3 | goat
-> Check box | Goat 4 | goat
-> Check box | Misty | mare

‘choose procedure’ event created with arguments ‘abdominocentesis’

Updated : available animals with columns: is booked , available animals , animal type
-> Check box | Good Morning Sunshine | mare
-> Check box | Goat 3 | goat
-> Check box | Goat 4 | goat
-> Check box | Guicho | gelding
-> Check box | Misty | mare

‘quit’ event created with arguments ‘’

It is quite explicit that Guicho is marked as booked before he disappears, and not checked when he comes back. Updating the UI log was very easy: we viewed the changes in a graphical diff tool, noted that the new checkbox column and the missing list of booked animals were as expected, and clicked “save” in TextTest.

I only had about 5 tests, but updating them to cope with the changed UI was relatively straightforward, and it would still have been straightforward even if I had had 600 of them.
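The reason the number of tests matters so little is that failures which differ from their golden copies in the same way only need to be looked at once. As an illustration of that idea, and not of how TextTest actually implements it, here is a small Python sketch that groups failing tests by the lines they add or remove. The names diff_signature and group_failures, and the shape of the “results” dictionary, are my own assumptions.

```python
import difflib
from collections import defaultdict


def diff_signature(golden, actual):
    """Return the added and removed lines only, ignoring where they occur."""
    diff = difflib.unified_diff(
        golden.splitlines(), actual.splitlines(), lineterm="", n=0
    )
    # Skip the two file header lines, then drop the "@@" hunk headers.
    body = list(diff)[2:]
    return tuple(line for line in body if not line.startswith("@@"))


def group_failures(results):
    """Map each distinct diff to the tests showing it; passing tests are left out.

    `results` maps a test name to a (golden_text, actual_text) pair.
    """
    groups = defaultdict(list)
    for name, (golden, actual) in results.items():
        signature = diff_signature(golden, actual)
        if signature:
            groups[signature].append(name)
    return dict(groups)


results = {
    "test_one": ("a\nUpdated : booked animals\nb", "a\nb"),
    "test_two": ("x\ny\nUpdated : booked animals\nz", "x\ny\nz"),
    "test_three": ("same", "same"),
}
for signature, names in group_failures(results).items():
    print(names, signature)
```

Here test_one and test_two end up in the same group because they both lose the same line, so one look at the diff covers both, and test_three passes and is not listed.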

I’m quite pleased with the way PyUseCase coped in this case. I really believe that with this tool you will be able to write your tests once, and they will be able to survive many generations of your UI. I think this toy example goes some way to showing how.