
I recently read this post in Brian Marick’s blog, and it set me thinking. He’s talking about a test whose intention in some way survived three major GUI revisions. The test code had to be rewritten each time, but the essence of it was retained. He says:

I changed the UI of an app and so I had to rewrite the UI code and the tests/checks of the UI code. It might seem the checks were worthless during the rewrite (and in 1997, I would have thought so). But it turns out that all (or almost all) of those checks contained an idea that needed to be preserved in the new interface. They were reminders of important things to think about, and that (it seemed to me) repaid the cost of rewriting them.

That was a big surprise to me.

I’m not sure why Brian is so surprised about this. If the user intentions and business rules are the same, then some aspects of the tests should also be preserved. A change in UI layout or technology should mean superficial changes only. In fact, one of the main claims for PyUseCase is that, because the tests are written in a domain language decoupled from the specifics of the UI, they can survive major UI changes. In practice this means that when you rewrite the UI, you are saved the trouble of also rewriting the tests. So Geoff and I decided to write some code and see if this was true for the example Brian outlines.

In the blog post there is only one small screenshot and some vague descriptions of the GUIs these tests are for, so we did some interpolation. I hope we have written an application that gets to the gist of the problem, although it is undoubtedly less beautiful and sophisticated than the one Brian was working on. All the code and tests are on Launchpad here.

We started by writing an application which I hope is like his current GUI. You select animals in a list, click “book” and they appear in a new list below. You select procedures from another list, and unsuitable animals disappear.

In my app I had to make up some procedures: in this case “milking”, which is unsuitable for Guicho (no udders on a gelding!), and “abdominocentesis”, which is suitable for all animals (no idea what that is, but it was in Brian’s example :-). Brian describes a test where an animal that is booked should not stay booked if you choose a procedure that is unsuitable for it, then change your mind and instead choose a procedure that it is suitable for. The use case file for this test looks like this:

select animals Guicho
book selected animals
choose procedure milking
choose procedure abdominocentesis
quit

This is a list of the actions the user must take in the GUI. So Guicho should disappear when you select “milking”, and reappear as available, but not as booked, when you select “abdominocentesis”. This information is not in the use case file, since it only documents user actions.
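To make that rule concrete, here is a minimal sketch of the kind of domain logic involved. The class and method names are my own invention for illustration; this is not the actual code from the Launchpad repository.

class Clinic:
    def __init__(self, animals, suitability):
        # animals: name -> animal type, e.g. {"Guicho": "gelding"}
        # suitability: procedure -> predicate on the animal type
        self.animals = animals
        self.suitability = suitability
        self.booked = set()
        self.procedure = None

    def book(self, name):
        self.booked.add(name)

    def choose_procedure(self, procedure):
        self.procedure = procedure
        # Unsuitable animals disappear from the lists. Crucially, they
        # also lose their booking, so they cannot reappear as booked
        # when you change your mind about the procedure.
        self.booked -= {name for name in self.animals
                        if not self.is_suitable(name)}

    def is_suitable(self, name):
        if self.procedure is None:
            return True
        return self.suitability[self.procedure](self.animals[name])

    def available(self):
        return sorted(name for name in self.animals if self.is_suitable(name))

clinic = Clinic({"Guicho": "gelding", "Misty": "mare", "Goat 3": "goat"},
                {"milking": lambda kind: kind in ("mare", "goat"),
                 "abdominocentesis": lambda kind: True})
clinic.book("Guicho")
clinic.choose_procedure("milking")           # Guicho disappears, booking cleared
clinic.choose_procedure("abdominocentesis")  # Guicho reappears...
assert "Guicho" in clinic.available()
assert "Guicho" not in clinic.booked         # ...but he is no longer booked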

The other part of the test is the UI log, which documents what the application actually does in response to the user actions. This log is auto-generated by pyUseCase. For this test I won’t repeat the whole file (you can view it here), but I will go through the important parts:

‘select animals’ event created with arguments ‘Guicho’

‘book selected animals’ event created with arguments ''

Updated : booked animals with columns: booked animals ,
-> Guicho | gelding

This part of the log shows that Guicho is listed as booked.

‘choose procedure’ event created with arguments ‘milking’

Updated : available animals with columns: available animals , animal type
-> Good Morning Sunshine | mare
-> Goat 3 | goat
-> Goat 4 | goat
-> Misty | mare

Updated : booked animals with columns: booked animals ,

So you see that after we select “milking” the lists of available and booked animals are updated, Guicho disappears, and the “booked animals” section is now blank. The log goes on to show what happens when we select “abdominocentesis”:

‘choose procedure’ event created with arguments ‘abdominocentesis’

Updated : available animals with columns: available animals , animal type
-> Good Morning Sunshine | mare
-> Goat 3 | goat
-> Goat 4 | goat
-> Guicho | gelding
-> Misty | mare

‘quit’ event created with arguments ''

i.e. the “available animals” list is updated and Guicho reappears, but the “booked animals” list is not updated. This means we know the application behaves as desired – booked animals that are not suitable for a procedure do not reappear as booked if another procedure is selected.

Ok, so far so good. What happens to the test when we completely re-jig the UI and it instead looks like this?

Now there is no book button, and you book animals by ticking a checkbox. Selecting a procedure will remove unsuitable animals from the list in the same way as before. So now if you change your mind about the procedure, animals that reappear on the list should not be marked as booked, even if they were before they disappeared. There is no separate list of booked animals.
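The reason the tests have a chance of surviving this change is that both generations of the UI drive the same underlying model; only the thin event-handling layer differs. Here is a hypothetical sketch of that layer, reusing the Clinic class sketched above:

# Old UI: a selection plus a "book" button.
def on_book_button_clicked(clinic, selected_names):
    for name in selected_names:
        clinic.book(name)

# New UI: a checkbox per row, no button.
def on_booked_checkbox_toggled(clinic, name, checked):
    if checked:
        clinic.book(name)
    else:
        clinic.booked.discard(name)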

What we did was take a copy of the tests and the code, update the code, and see what we needed to do to the tests to make them work again. In the end it was reasonably straightforward. We didn’t re-record or rewrite any tests. We just had to modify the use cases to remove the reference to the book button, and save new versions of the UI log to reflect the new UI layout. The use case part of the test now looks like this:

book animal Guicho
choose procedure milking
choose procedure abdominocentesis
quit

which is one line shorter than before, since we no longer have separate user actions for selecting and booking an animal.

So updating the tests to work with the changed UI consisted of:

  1. Remove the reference to the “book” button in the UI map file, since the button no longer exists.
  2. In the use case files for all tests, replace “select animals x, y” with a line for each animal: “book animal x” and “book animal y”.
  3. Run the tests. All fail in an identical manner. Check the changes in the UI log file using a graphical diff tool, once (no need to look at every test, since TextTest groups identical failures together).
  4. Save the updated use cases and UI logs (the spurious line “book selected animals” is removed from the use case files, since the button no longer exists).
  5. Run the tests again. All pass.

The new UI log file looks like this:

‘book animal’ event created with arguments ‘Guicho’

Updated : available animals with columns: is booked , available animals , animal type
-> Check box | Good Morning Sunshine | mare
-> Check box | Goat 3 | goat
-> Check box | Goat 4 | goat
-> Check box (checked) | Guicho | gelding
-> Check box | Misty | mare

‘choose procedure’ event created with arguments ‘milking’

Updated : available animals with columns: is booked , available animals , animal type
-> Check box | Good Morning Sunshine | mare
-> Check box | Goat 3 | goat
-> Check box | Goat 4 | goat
-> Check box | Misty | mare

‘choose procedure’ event created with arguments ‘abdominocentesis’

Updated : available animals with columns: is booked , available animals , animal type
-> Check box | Good Morning Sunshine | mare
-> Check box | Goat 3 | goat
-> Check box | Goat 4 | goat
-> Check box | Guicho | gelding
-> Check box | Misty | mare

‘quit’ event created with arguments ''

It is quite explicit that Guicho is marked as booked before he disappears, and not checked when he comes back. Updating the UI log was very easy – we viewed it in a graphical diff tool, checked that the new checkbox column and the missing list of booked animals were as expected, and clicked “save” in TextTest.

I only had five tests, but updating them to cope with the changed UI was relatively straightforward, and it would still have been straightforward even if I had had 600 of them.

I’m quite pleased with the way PyUseCase coped in this case. I really believe that with this tool you will be able to write your tests once, and they will be able to survive many generations of your UI. I think this toy example goes some way towards showing how.

Geoff has been working really hard for the past few months, writing pyUseCase 3.0. It has some very substantial improvements over previous versions, and I am very excited about it. He’s written about how it works here.

It’s a tool for testing GUIs with a record-replay paradigm that actually works. Seriously, you can do agile development with these tests: they don’t break the minute you change your GUI. The reason for this is that the tests are written in a high-level domain language, decoupled from the actual current layout of your GUI. The tool lets you create a mapping file from the current widgets to the domain language, and helps you to keep it up to date.
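I won’t reproduce the actual file format here, but conceptually the UI map is just a lookup from widget events to names in the domain language of your tests. A sketch of the idea, with invented widget and event names:

# Conceptual sketch only: the real UI map is a text file that
# PyUseCase creates and maintains for you.
ui_map = {
    ("available animals list", "checkbox toggled"): "book animal",
    ("procedure list", "selection changed"): "choose procedure",
    ("main window", "closed"): "quit",
}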

In a way it’s a bit like Robot, Twist or Cucumber, in that your tests end up being very human-readable. The main difference is the record-replay capability. Anyone who can use the application GUI can create a test, which they can run straight away. With those other tools, a programmer typically has to go away and map the user domain language of the test onto something that actually executes.

The other main way in which pyUseCase differs from other tools is the way it checks that your application did the right thing. Instead of the test writer having to choose some aspects of the GUI and make assertions about what they should look like, pyUseCase just records what the whole GUI looks like, in a plain text log. Then you can use TextTest to compare the log you get today with the one you originally recorded when you created the test. The test writer can concentrate on just normal interaction with the GUI, and still have very comprehensive assertions built into the tests they create.
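Conceptually, that check is nothing more than a plain-text diff between the recorded log and the log produced by today’s run. A minimal sketch using Python’s standard difflib (TextTest itself does far more, such as filtering out run-dependent details like timestamps):

import difflib

def log_differences(expected_path, actual_path):
    # An empty result means the GUI behaved exactly as it did when
    # the test was recorded; any difference is a potential regression.
    with open(expected_path) as f:
        expected = f.readlines()
    with open(actual_path) as f:
        actual = f.readlines()
    return list(difflib.unified_diff(expected, actual,
                                     fromfile=expected_path,
                                     tofile=actual_path))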

pyUseCase, together with TextTest, makes it really easy to create automated tests, without writing code, that are straightforward to maintain, and readable by application users. Geoff has been developing his approach to testing for nearly a decade, and I think it is mature enough now, and sufficiently far ahead of the competition, that it is going to transform the way we do agile testing.

😀

It’s Java Forum next week, here in Göteborg. I’m giving a short talk about TestNG, a tool I’ve been using lately.

My basic conclusion is that TestNG is a very easy step from JUnit, and one you don’t need to take if all your tests are true unit tests (i.e. fast and independent). TestNG has some nice features which help when your tests are slow and/or have external dependencies, especially if they are mixed together in the same test classes as true unit tests. I think it’s pretty useful for unit and integration tests (aka quadrant 1, technology facing).

Having said that, what bothers me about TestNG is that it means your test code is written in Java. For me, that makes it unsuitable for system tests (aka quadrant 2, business facing). If you have anything resembling an involved customer, you’re going to at least want to encourage them to read the system tests to verify they are correct, and to gain confidence that the system is working. Truly agile teams have these people helping to write tests. Many customer types won’t be happy working with Java. You might be able to get by, though, if you have descriptive test names, good javadoc, and test data in separate files that they can read.

Rather than spending time learning TestNG, I think you may get more payback from tools such as FitNesse, Robot or TextTest, which all allow you to get customers involved in reading and even writing tests. I think it could be a perfectly sensible choice to stick with JUnit for unit tests, and use one of these tools for both integration and system tests. What you choose will of course depend on the situation, for example the size of the system, the nature of the test data, and how many tools your team is willing to learn.

I just wrote a report about europython on my company blog.

At the XP2007 conference, Geoff and I presented a workshop entitled “The coder’s dojo: Acceptance Test Driven Development in python”. (Geoff also presented the same workshop at Agile2007.) We had three aims with this workshop: the first was to use the meeting format of a coder’s dojo, the second was to do some coding in Python, and the third was to demonstrate how you can do Acceptance Test Driven Development using TextTest. We felt the workshop went well; we had around 30 participants and we were able to do a little of everything we had set out to do.

Perhaps the most important thing was what we learnt from the experience. The workshop participants gave us some very useful feedback. One thing people said was that there were too many new ideas presented to expect as much audience participation as we did, and that instead of trying to do a Randori-style kata, we should have done it as a Prepared kata. There also seemed to be a view that the kata we had chosen (KataTexasHoldEm) was quite a difficult one. Another very valuable piece of feedback was that we were doing Test Driven Development (TDD) with much larger steps than people were used to.

What I have done is to create a screencast in an attempt to address this feedback, and to open up our workshop material to a larger audience. In it, I perform a Prepared Kata of KataMinesweeper, doing TDD with TextTest and Python. Geoff and I have been developing this approach to testing for a few years now, and we think it deserves consideration by the wider agile testing community. There are important advantages and disadvantages compared to classic TDD with xUnit.
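For readers unfamiliar with the approach: instead of asserting inside test code, the program under test writes plain text, and TextTest diffs that output against a recorded “golden” version. To give a flavour of it, here is a hypothetical core for the Minesweeper kata; a TextTest test for it would consist of an input file plus the saved expected output, with no hand-written assertions:

import sys

def annotate(grid):
    # Replace each safe square with the number of adjacent mines ("*").
    rows, cols = len(grid), len(grid[0])
    result = []
    for r in range(rows):
        line = ""
        for c in range(cols):
            if grid[r][c] == "*":
                line += "*"
            else:
                line += str(sum(grid[i][j] == "*"
                                for i in range(max(0, r - 1), min(rows, r + 2))
                                for j in range(max(0, c - 1), min(cols, c + 2))))
        result.append(line)
    return result

if __name__ == "__main__":
    # TextTest runs this with a minefield on stdin and compares the
    # printed board against the stored baseline.
    grid = [line.strip() for line in sys.stdin if line.strip()]
    print("\n".join(annotate(grid)))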

I will write an article about this approach and provide links to the screencast in my next post on this blog.