Posts tagged ‘ATDD’

In my last post, I started talking about London School TDD, and the two features of it that I think distinguish it from Classic TDD. The first was Outside-In development with Double Loop TDD, which I’d like to talk more about in this post. The second was “Tell, Don’t Ask” Object Oriented Design. I’ll take that topic up in my next post.

Double Loop TDD

london_school_001

When you’re doing double loop TDD, you go around the inner loop on the timescale of minutes, and the outer loop on the timescale of hours to days. The outer loop tests are written from the perspective of the user of the system. They generally cover thick slices of functionality, deployed in a realistic environment, or something close to it. In my book I’ve called this kind of test a “Guiding Test”, but Freeman & Pryce call them “Acceptance Tests”. These tests should fail if something the customer cares about stops working – in other words they provide good regression protection. They also help document what the system does. (See also my article “Principles for Agile Automated Test Design“).

I don’t think Double Loop TDD is unique to the London School of TDD, I think Classic TDDers do it too. The idea is right there in Kent Beck’s first book about eXtreme Programming. What I think is different in London School, is designing Outside-In, and the use of mocks to enable this.

Designing Outside-In

If you’re doing double loop TDD, you’ll begin with a Guiding Test that expresses something about how a user wants to interact with your system. That test helps you identify the top level function or class that is the entry point to the desired functionality, that will be called first. Often it’s a widget in a GUI, a link on a webpage, or a command line flag.

With London School TDD, you’ll often start your inner loop TDD by designing the class or method that gets called by that widget in the GUI, that link on the webpage, or that command line flag. You should quickly discover that this new piece of code can’t implement the whole function by itself, but will need collaborating classes to get stuff done.

london_school_003

The user looks at the system, and wants some functionality. This implies a new class is needed at the boundary of the system. This class in turn needs collaborating classes that don’t yet exist.

The collaborating classes don’t exist yet, or at least don’t provide all the functionality you need. Instead of going away and developing these collaborating classes straight away, you can just replace them with mocks in your test. It’s very cheap to change mocks and experiment until you get the the interface and the protocol just the way you want it. While you’re designing a test case, you’re also designing the production code.

london_school_004

You replace collaborating objects with mocks so you can design the interface and protocol between them.

When you’re happy with your design, and your test passes, you can move down the stack and start working on developing the implementation of one of the collaborating classes. Of course, if this class in turn has other collaborators, you can replace them with mocks and design these interactions too. This approach continues all the way through the system, moving through architectural layers and levels of abstraction.

london_school_005

You’ve designed the class at the boundary of the system, and now you design one of the collaborating classes, replacing its collaborators with mocks.

This way of working lets you break a problem down into manageable pieces, and get each part specified and tested before you move onto the next part. You start with a focus on what the user needs, and build the system from the “outside-in”, following the user interaction through all the parts of the system until the guiding test passes. The Guiding Test will not usually replace parts of the system with mocks, so when it passes you should be confident you’ve remembered to actually implement all the needed collaborating classes.

Outside-In with Classic TDD

A Classic TDD approach may work outside-in too, but using an approach largely without mocks. There are various strategies to cope with the fact that collaborators don’t exist yet. One is to start with the degenerate case, where nothing much actually happens from the user’s point of view. It’s some kind of edge case where the output is much simpler than in the normal or happy-path case. It lets you build up the structure of the classes and methods needed for a simple version of the functionality, but with basically empty implementations, or simple faked return values. Once the whole structure is there, you flesh it out, perhaps working inside-out.

Another way to do this in Classic TDD is to start writing the tests from the outside-in, but when you discover you need a collaborating class to be implemented before the test will pass, comment out that test and move down to work on the collaborator instead. Eventually you find something you can implement with collaborators that already exist, then work your way up again.

A Classic TDD approach will often just not work outside-in at all. You start with one of the classes nearer the heart of the system. You’ll pick something that can be fully implemented and tested using collaborating classes that already exist.  Often it’s a class in the central domain model of the application. When that is done, you continue to develop the system from the heart towards the outside, adding new classes that build on one another. Because you’re using classes that already exist, there is little need for using mocks. Eventually you find you’ve built all the functionality needed to get the Guiding Test to pass.

Pros and Cons

I think there’s a definite advantage to working outside-in, it keeps your focus on what the user really needs, and helps you to build something useful, without too much gold-plating. I think it takes skill and discipline to work this way with either Classic or London School. It’s not easy to see how to break down a piece of functionality into incremental pieces that you can develop and design step-by-step. If you work from the heart outwards, there is a danger you’ll build more than you need for what the user wants, or that you’ll get to the outside, discover it doesn’t “fit”, and have to refactor your work.

Assuming you are working outside-in, though, one difference seems to me to be in whether you write faked implementations in the actual production code, or in mocks. If you start with fakes in the production code, you’ll gradually replace them with real functionality. If you put all the faked functionality into mocks, they’ll live with the test code, and remain there when the real functionality is implemented. This could be useful for documentation, and will make your tests continue to execute really fast.

Having said that, there some debate about the maintainability of tests that use a lot of mocks. When the design changes, it can be prohibitive to update all the mocks as well as the production code. Once the real implementations are done, maybe the inner-loop tests should just be deleted? The Guiding Test could provide all the regression protection you need, and maybe the tests that helped you with your original design aren’t useful to keep? I don’t think it’s clear-cut actually. From talking to London School proponents, they don’t seem to delete all the tests that use mocks. They do delete some though.

I’m still trying to understand these issues and work out in what contexts London School TDD brings the most advantage. I hope I’ve outlined what I see as the differences in way of working with outside-in development. In my next post I look at how London School TDD promotes “Tell, Don’t Ask” Object Oriented Design.

I’ve previously written about Agile test automation principles, and since then I’ve had some interesting discussions with people that have led me to revise them in this article. In particular, Seb Rose wrote about his 6 principles of unit testing and pointed out some issues with mine. So this article is an update on the previous one, and I’m hoping this will spark further interesting discussions!

I feel like I’ve spent most of my career learning how to write good automated tests in an agile environment. When I downloaded JUnit in the year 2000 it didn’t take long before I was hooked – unit tests for everything in sight. That gratifying green bar is near-instant feedback that everthing is as expected, my code does what I intended, and I can continue developing from a firm foundation.

Later, starting in about 2002, I began writing larger granularity tests, for whole subsystems; functional tests if you like. The feedback that my code does what I intended, and that it has working functionality has given me confidence time and again to release updated versions to end-users.

I was not the first to discover that developers design automated functional tests for two main purposes. Initially we design them to help clarify our understanding of what to build. In fact, at that point they’re not really tests, we usually call them scenarios, or examples. Later, the main purpose of the tests becomes to detect regression errors, although we continue use them to document what the system does.

When you’re designing a functional test suite, you’re trying to support both aims, and sometimes you have to make tradeoffs between them. You’re also trying to keep the cost of writing and maintaining the tests as low as possible, and as with most software, it’s the maintenance cost that dominates. Over the years I’ve begun to think in terms of four principles that help me to design functional test suites that make good tradeoffs and identify when a particular test case is fit for purpose.

Book-128Readability

When you look at the test case, you can read it through and understand what the test is for. You can see what the expected behaviour is, and what aspects of it are covered by the test. When the test fails, you can quickly see what is broken.

If your test case is not readable, it will not be useful, neither for understanding what the system does, or identifying regression errors. When it fails you will have to dig though other sources outside of the test case to find out what is wrong. Quite likely you will not understand what is wrong and you will rewrite the test to check for something else, or simply delete it.

internet_128Robustness

When a test fails, it means there is a regression error, (functionality is broken), or the system has changed and the tests no longer document the correct behaviour. You need to take action to correct the system or update the test, and this is as it should be. If however, the test has failed for no good reason, you have a problem: a fragile test.

There are many causes of fragile tests. For example tests that are not isolated from one another, duplication between test cases, and dependencies on random or threaded code. If you run a test by itself and it passes, but fails in a suite together with other tests, then you have an isolation problem. If you have one broken feature and it causes a large number of test failures, you have duplication between test cases. If you have a test that fails in one test run, then passes in the next when nothing changed, you have a flickering test.

If your tests often fail for no good reason, you will start to ignore them. Quite likely there will be real failures hiding amongst all the false ones, and the danger is you will not see them.

Speed-128Speed

As an agile developer you run your test suite frequently. Both (a) every time you build the system, (b) before you check in changes, and (c) after check-in in an automated Continuous Integration environment. I recommend time limits of 2 minutes for (a), 10 minutes for (b), and 60 minutes for (c). This fast feedback gives you the best chance of actually being willing to run the tests, and to find defects when they’re cheapest to fix, soon after insertion.

If your test suite is slow, it will not be used. When you’re feeling stressed, you’ll skip running them, and problem code will enter the system. In the worst case the test suite will never become green. You’ll fix the one or two problems in a given run and kick off a new test run, but in the meantime you’ll continue developing and making other changes. The diagnose-and-fix loop gets longer and the tests become less likely to ever all pass at the same time. This can become pretty demoralizing.

updatability.001Updatability

When the needs of the users change, and the system is updated, your tests also need to be updated in tandem. It should be straightforward to identify which tests are affected by a given change, and quick to update them all.

If your tests are not easy to update, they will likely get left behind as the system moves on. Faced with a small change that causes thousands of failures and hours of work to update them all, you’ll likely delete most of the tests.

Following these four principles implies Maintainability

Taken all together, I think how well your tests adhere to these principles will determine how maintainable they are, or in other words, how much they will cost. That cost needs to be in proportion to the benefits you get: helping you understand what the system does, and regression protection.

As your test suite grows, it becomes ever more challenging to adhere to all the principles. Readability suffers when there are so many test cases you can’t see the forest for the trees. The more details of your system that you cover with tests, the more likely you are to have Robustness problems – tests that fail when these details change.  Speed obviously also suffers – the time to run the test suite usually scales linearly with the number of test cases. Updatability doesn’t necessarily get worse as the number of test cases increases, but it will if you don’t adhere to good design principles in your test code, or lack tools for bulk update of test data for example.

I think the principles are largely the same whether you’re writing skinny little unit tests or fatter functional tests that touch more of the codebase. My experience tells me that it’s a lot easier to be successful with unit tests. As the testing thickness increases, the feedback cycle gets slower, and your mistakes are amplified. That’s why I concentrate on teaching these principles through unit testing exercises. Once you understand what you’re aiming for, you can transfer your skills to functional tests.

How can you use these principles?

I find it useful to remember these principles when designing test cases. I may need to make tradeoffs between them, and it helps just to step back and assess how I’m doing on each principle from time to time as I develop. If I’m reviewing someone else’s test cases, I can point to code and say which principles it’s not following, and give them concrete advice about how to make improvements. We can have a discussion for example about whether to add more test cases in order to improve regression protection, and how to do that without reducing overall readability.

I also find these principles useful when I’m trying to diagnose why a test suite is not being useful to a development team, especially if things have got so bad they have stopped maintaining it. I can often identify which principle(s) the team has missed, and advise how to refactor the test suite to compensate.

For example, if the problem is lack of Speed you have some options and tradeoffs to make:

  • Replace some of the thicker, slower end-to-end tests with lots of skinny fast unit tests, (may reduce regression protection)
  • Invest in hardware and run tests in parallel (costs $)
  • Use a profiler to optimize the tests for speed the same as you would production code (may affect Readability)
  • Use more fakes to replace slow parts of the system (may reduce regression protection)
  • Identify key test cases for essential functionality and remove the other test cases. (sacrifice regression protection to get Speed)

Strategic Decisions

The principles also help me when I’m discussing automated testing strategy, and choosing testing tools. Some tools have better support for updating test cases and test data. Some allow very Readable test cases. It’s worth noting that automated tests in agile are quite different from in a traditional process, since they are run continually throughout the process, not just at the end. I’ve found many traditional automation tools don’t lead to enough Speed and Robustness to support agile development.

I hope you will find these principles help you to reason about your strategy and tools for functional automated testing, and to design more maintainable, useful test cases.

Images Attribution: DaPino Webdesign, Lebreton, Asher Abbasi, Woothemes, Iconshock, Andy Gongea, FatCow from www.iconspedia.com

Please note – As of March 2013, I have rewritten this post in the light of further experience and discussions. The updated post is available here.

I feel like I’ve spent most of my career learning how to write good automated tests in an agile environment. When I downloaded JUnit in the year 2000 it didn’t take long before I was hooked – unit tests for everything in sight. That gratifying green bar is near-instant feedback that everthing is as expected, my code does what I intended, and I can continue developing from a firm foundation.

Later, starting in about 2002, I began writing larger granularity tests, for whole subsystems; functional tests if you like. The feedback that my code does what I intended, and that it has working functionality has given me confidence time and again to release updated versions to end-users.

Often, I’ve written functional tests as regression tests, after the functionality is supposed to work. In other situations, I’ve been able to write these kinds of tests in advance, as part of an ATDD, or BDD process. In either case, I’ve found the regression tests you end up with need to have certain properties if they’re going to be useful in an agile environment moving forward. I think the same properties are needed for good agile functional tests as for good unit tests, but it’s much harder. Your mistakes are amplified as the scope of the test increases.

I’d like to outline four principles of agile test automation that I’ve derived from my experience.

Coverage

If you have a test for a feature, and there is a bug in that feature, the test should fail. Note I’m talking about coverage of functionality, not code coverage, although these concepts are related. If your code coverage is poor, your functionality coverage is likely also to be poor.

If your tests have poor coverage, they will continue to pass even when your system is broken and functionality unusable. This can happen if you have missed out needed test cases, or when your test cases don’t check properly what the system actually did. The consequences of poor coverage is that you can’t refactor with confidence, and need to do additional (manual) testing before release.

The aim for automated regression tests is good Coverage: If you break something important and no tests fail, your test coverage is not good enough. All the other principles are in tension with this one – improving Coverage will often impair the others.

Readability

When you look at the test case, you can read it through and understand what the test is for. You can see what the expected behaviour is, and what aspects of it are covered by the test. When the test fails, you can quickly see what is broken.

If your test case is not readable, it will not be useful. When it fails you will have to dig though other sources outside of the test case to find out what is wrong. Quite likely you will not understand what is wrong and you will rewrite the test to check for something else, or simply delete it.

As you improve Coverage, you will likely add more and more test cases. Each one may be fairly readable on its own, but taken all together it can become hard to navigate and get an overview.

Robustness

When a test fails, it means the functionality it tests is broken, or at least is behaving significantly differently from before. You need to take action to correct the system or update the test to account for the new behaviour. Fragile tests are the opposite of Robust: they fail often for no good reason.

Aspects of Robustness you often run into are tests that are not isolated from one another, duplication between test cases, and flickering tests. If you run a test by itself and it passes, but fails in a suite together with other tests, then you have an isolation problem. If you have one broken feature and it causes a large number of test failures, you have duplication between test cases. If you have a test that fails in one test run, then passes in the next when nothing changed, you have a flickering test.

If your tests often fail for no good reason, you will start to ignore them. Quite likely there will be real failures hiding amongst all the false ones, and the danger is you will not see them.

As you improve Coverage you’ll want to add more checks for details of your system. This will give your tests more and more reasons to fail.

Speed

As an agile developer you run the tests frequently. Both (a) every time you build the system, and (b) before you check in changes. I recommend time limits of 2 minutes for (a) and 10 minutes for (b). This fast feedback gives you the best chance of actually being willing to run the tests, and to find defects when they’re cheapest to fix.

If your test suite is slow, it will not be used. When you’re feeling stressed, you’ll skip running them, and problem code will enter the system. In the worst case the test suite will never become green. You’ll fix the one or two problems in a given run and kick off a new test run, but in the meantime someone else has checked in other changes, and the new run is not green either. You’re developing all the while the tests are running, and they never quite catch up. This can become pretty demoralizing.

As you improve Coverage, you add more test cases, and this will naturally increase the execution time for the whole test suite.

How are these principles useful?

I find it useful to remember these principles when designing test cases. I may need to make tradeoffs between them, and it helps just to step back and assess how I’m doing on each principle from time to time as I develop.

I also find these principles useful when I’m trying to diagnose why a test suite is not being useful to a development team, especially if things have got so bad they have stopped maintaining it. I can often identify which principle(s) the team has missed, and advise how to refactor the test suite to compensate.

For example, if the problem is lack of Speed you have some options and tradeoffs to make:

  • Invest in hardware and run tests in parallel (costs $)
  • Use a profiler to optimize the tests for speed the same as you would production code (may affect Readability)
  • push down tests to a lower level of granularity where they can execute faster. (may reduce Coverage and/or increase Readability)
  • Identify key test cases for essential functionality and remove the other test cases. (sacrifice Coverage to get Speed)

Explaining these principles can promote useful discussions with people new to agile, particularly testers. The test suite is a resource used by many agile teamembers – developers, analysts, managers etc, in its role as “Living Documentation” for the system, (See Gojko Adzic‘s writings on this). This emphasizes the need for both Readability and Coverage. Automated tests in agile are quite different from in a traditional process, since they are run continually throughout the process, not just at the end. I’ve found many traditional automation approaches don’t lead to enough Speed and Robustness to support agile development.

I hope you will find these principles will help you to reason about the automated tests in your suite.

Programmers have a vested interest in making sure the software they create does what they think it does. When I’m coding I prefer to work in the context of feedback from automated tests, that help me to keep track of what works and how far I’ve got. I’ve written before about Test Driven Development, (TDD). In this article I’d like to explain some of the main features of Text-Based Testing. It’s a variant on TDD, perhaps more suited to the functional level than unit tests, and which I’ve found powerful and productive to use.


The basic idea
You get your program to produce a plain text file that documents all the important things that it does. A log, if you will. You run the program and store this text as a “golden copy” of the output. You create from this a Text-Based Test with a descriptive name, any inputs you gave to the program, and the golden copy of the textual output.

You make some changes to your program, and you run it again, gathering the new text produced. You compare the text with the golden copy, and if they are identical, the test passes. If the there is a difference, the test fails. If you look at the diff, and you like the new text better than the old text, you update your golden copy, and the test is passing once again.

Tool Support
Text-Based Testing is a simple idea, and in fact many people do it already in their unit tests. AssertEquals(String expected, String actual) is actually a form of it. You often create the “expected” string based on the actual output of the program, (although purists will write the whole assert before they execute the test).

Most unit test tools these days give you a nice diff even on multi-line strings. For example:

download
download (1)

Which is a failing text-based test using JUnit.

Once your strings get very long, to the scale of whole log files, even multi-line diffs aren’t really enough. You get datestamps, process ids and other stuff that changes every run, hashmaps with indeterminate order, etc. It gets tedious to deal with all this on a test-by-test basis.

My husband, Geoff Bache, has created a tool called “TextTest” to support Text-Based testing. Amongst other things, it helps you organize and run your text-based tests, and filter the text before you compare it. It’s free, open source, and of course used to test itself. (Eats own dog food!) TextTest is used extensively within Jeppesen Systems, (Geoff works for them, and they support development), and I’ve used it too on various projects in other organizations.

In the rest of this article I’ll look at some of the main implications of using a Text-Based Testing approach, and some of my experiences.

Little code per test
The biggest advantage of the approach, is that you tend to write very little unique code for each test. You generally access the application through a public interface as a user would, often a command line interface or (web)service call. You then create many tests by for example varying the command line options or request contents. This reduces test maintenance work, since you have less test code to worry about, and the public API of your program should change relatively infrequently.

Legacy code
Text-Based Testing is obviously a regression testing technique. You’re checking the code still does what it did before, by checking the log is the same. So these tests are perfect for refactoring. As you move around the code, the log statements move too, and your tests stay green, (so long as you don’t make any mistakes!) In most systems, it’s cheap and risk-free to add log statements, no matter how horribly gnarly the design is. So text-based testing is an easy way to get some initial tests in place to lean on while refactoring. I’ve used it this way fairly successfully to get legacy code under control, particularly if the code already produces a meaningful log or textual output.


No help with your design
I just told you how good Text-Based Testing is with Legacy code. But actually these tests give you very little help with the internal design of your program. With normal TDD, the activity of creating unit tests at least forces you to decompose your design into units, and if you do it well, you’ll find these tests giving you all sorts of feedback about your design. Text-Based tests don’t. Log statements don’t care if they’re in the middle of a long horrible method or if they’re spread around several smaller ones. So you have to get feedback on your design some other way.

I usually work with TDD at the unit level in combination with Text-Based tests at the functional level. I think it gives me the best of both worlds.

Log statements and readability
Some people complain that log statements reduce the readability of their code and don’t like to add any at all. They seem to be out of fashion, just like comments. The idea is that all the important ideas should be expressed in the class and method names, and logs and comments just clutter up the important stuff. I agree to an extent, you can definitely over-use logs and comments. I think a few well placed ones can make all the difference though. For Text-Based Testing purposes, you don’t want a log that is megabytes and megabytes of junk, listing every time you enter and leave every method, and the values of every variable. That’s going to seriously hinder your refactoring, apart from being a nightmare to store and update.

What we’re talking about here is targeted log statements at the points when something important happens, that we want to make sure should continue happening. You can think about it like the asserts you make in unit tests. You don’t assert everything, just what’s important. In my experience less than two percent of the lines of code end up being log statements, and if anything, they increase readability.

Text-Based tests are completed after the code
In normal TDD you write the test first, and thereby set up a mini pull system for the functionality you need. It’s lean, it forces you to focus on the problem you’re trying to solve before you solve it, and starts giving you feedback before you commit to an implementation. With Text-Based Testing, you often find it’s too much work the specify the log up front. It’s much easier to wait until you’ve implemented the feature, run the test, and save the log afterwards.

So your tests usually aren’t completed until after the code they test, unlike in normal TDD. Having said that, I would argue that you can still do a form of TDD with Text-Based Tests. I’d normally create the half the test before the code. I name the test, and find suitable inputs that should provoke the behaviour I need to implement in the system. The test will fail the first time I run it. In this way I think I get many of the benefits of TDD, but only actually pin down the exact assertion once the functionality is working.

“Expert Reads Output” Antipattern
If you’re relying on a diff in the logs to tell you when your program is broken, you had better have good logs! But who decides what to log? Who checks the “golden copy”? Usually it is the person creating the test, who should look through the log and check everything is in order the first time. Of course, after a test is created, every time it fails you have to make a decision whether to update the golden copy of the log. You might make a mistake. There’s a well known antipattern called “Expert Reads Output” which basically says that you shouldn’t rely on having someone check the results of your tests by eye.

This is actually a problem with any automated testing approach – someone has to make a judgement about what to do when a test fails – whether the test is wrong or there’s a bug in the application. With Text-Based Testing you might have a larger quantity of text to read through compared with other approaches, or maybe not. If you have human-readable, concise, targeted log statements and good tools for working with them, it goes a long way. You need a good diff tool, version control, and some way of grouping similar changes. It’s also useful to have some sanity checks. For example TextTest can easily search for regular expressions in the log and warn you if you try to save a golden copy containing a stack trace for example.

In my experience, you do need to update the golden copy quite often. I think this is one of the key skills with a Text-Based Testing approach. You have to learn to write good logs, and to be disciplined about either doing refactoring or adding functionality, not both at the same time. If you’re refactoring and the logs change, you need to be able to quickly recognize if it’s ok, or if you made a mistake. Similarly, if you add new functionality and no logs change, that could be a problem.

Agile Tests Manage Behaviour
When you create a unit test, you end with an Assert statement. This is supposed to be some kind of universal truth that should always be valid, or else there is a big problem. Particularly for functional level tests, it can be hard to find these kinds of invariants. What is correct today might be updated next week when the market moves or the product owner changes their mind. With Text-Based Testing you have an opportunity to quickly and easily update the golden copy every time the test “fails”. This makes your tests much more about keeping control of what your app does over time, and less about rewriting assert statements.

Text-Based Testing grew up in the domain of optimizing logistics planning. In this domain there is no “correct” answer you can predict in advance and assert. Planning problems that are interesting to solve are far too complex for a complete mathematical analysis, and the code relies on heuristics and fancy algorithms to come up with better and better solutions. So Text-Based Testing makes it easy to spot when the test produces a different plan from before, and use it as the new baseline if it’s an improvement.

I think generally it leads to more “agile” tests. They can easily respond to changes in the business requirements.

Conclusions
There is undoubtedly a lot more to be said about Text-Based Testing. I havn’t mentioned text-based mocking, data-driven vs workflow testing, or how to handle databases and GUIs – all relevant topics. I hope this article has given you a flavour of how it’s different from ordinary TDD, though. I’ve found that good tool support is pretty essential to making Text-Based Testing work well, and that it’s a particularly good technique for handling legacy code, although not exclusively. I like the approach because it minimizes the amount of code per test, and makes it easy to keep the tests in sync with the current behaviour of the system.

 

Yesterday, Geoff gave his talk about agile GUI testing. For anyone who missed it, here is a video of him giving roughly the same talk earlier this year at Europython. Gojko Adzic has also blogged about what he said, which is exactly the kind of feedback we came to this conference for. In his post Gojko explains Geoff’s testing approach, and seems quietly positive about it, at least compared to the “sine of death” you get with other UI test automation tools. His conclusion that it looked more suitable for legacy code than greenfield development is a little uncomfortable though. We think it works there too 🙂

 

Today I’m giving my talk about teaching Test Driven Development via Coding Dojos. I’m looking forward to some feedback from the community about that. In the meantime I’ve written a bit about the three keynote talks we listened to yesterday.

 

Lisa Crispin

The day started with a keynote from Lisa Crispin titled “Agile Defect Management”. The overall message was to “lower the bar!” and aim to reduce defect count as far as possible, even to the point where a defect tracking software becomes superfluous. There was a lot of talk about whether such a tool was needed, and in what situations, and she gave a good overview of the state of the art of thinking in this area.

 

This was a good talk, with audience involvement, by someone who knows what they are talking about. The thing is I have higher expectations from a keynote. I expect to come away inspired and challenged, with some new insight to take back to my daily life. For me, this talk didn’t really deliver that. Lisa concentrated too much on the specific question of whether to use an electronic defect tracking tool or not, and didn’t sufficiently put that question in to the wider context given in the title of the talk “agile defect management”. I was disappointed to find nothing really original or surprising in what Lisa said.

 

Linda Rising

This was a good talk to hold straight after lunch when everyone is a bit sleepy. Linda spoke very amusingly on the subject “Deception and Estimation: How we Fool Ourselves”. She began by inviting us to see this as “the weird talk” of the conference, and that she was going to go through some of the latest scientific research in the area of cognition and psychology and hoped to relate that to why we have such trouble making good estimates in the context of a software project.

 

A self-confessed scientific amateur, she stated up front that she wouldn’t provide references to the research she mentioned, although she could give them to you if you emailed her and asked. Linda then proceeded to relate a series of amusing anecdotes designed to illustrate how irrational and over-optimistic people can be. (Markus Gärtner has blogged about what they all were). Towards the end of the talk, Linda began to relate all this to the subject of estimation, and told a story about some conference where she met people who were trying to apply scientific methods, statistics and data mining to the problem of improving estimates in software projects. In my mind, a seemingly a rational response to the problem of irrational, over-optimistic people.

 

Linda then did what I saw as a complete about-turn in her argument. She quoted one proponent of this “scientific” approach as saying, “well we can’t just make up a number, can we?”. Well no, we can’t, Linda just spent the last half hour convincing us humans are over-optimistic and irrational, and you can’t trust them to make up numbers. Yet that seemed to be exactly what Linda then proposed we do in agile. The points about the way agile overcomes this natural human over-optimism by for example breaking down problems into thin slices, using the wisdom of the crowd and enforcing tight feedback cycles, all kind of got crammed in at the end with little or no explaination.

 

Quite apart from those specific criticisms of her argument, as someone with a scientific training I didn’t like the way Linda protrayed science and scientists. They initially appeared in Linda’s talk as white-coated oracles who make pronouncements of the truth. “80% of drivers think they are above average”. “You eat more at an all-you-eat buffet”. “Online daters lie about their age and weight”. She then attempted to shatter this illusion of scientific infalliability by quoting Planck, who pointed out that the scientific process doesn’t always proceed in an orderly manner, and sometimes new and better theories only really catch on when the older generation who invented the previous ones atually die off.

 

Yes, scientists are human too, and you do them a disservice when you only present their results as received truths, without references, and without explaining either the methods they used to reach the conclusions, or what they themselves think of the wider applicability of the results.

 

This could have been a far more interesting talk about actual recent research studies – what’s coming out of the latest brain imaging techniques, for exaple. Linda could also have spent more time explaining how agile works with human nature to provide better estimates and plans. For all that it made me laugh, all this talk left me with was a bunch of amusing anecdotes and an uneasy feeling that agile was anti-science.

 

Elisabeth Hendrickson

The last session of the day saw Elisabeth Hendrickson presenting “Lessons Learned from 100+ Simulated Agile Transitions”. With a huge amount of energy and panache, Elisabeth strode around the stage, explaining what happens in an exercise which she usually does with a group of 8-20 people over the course of a whole day. Within the framework of this simulation, she drew out stories and anecdotes to illustrate such diverse subjects as the Satir change model, how physical layout affects communication, the difference between status meetings and communication meetings, and tests as alignment tools.

 

This talk was definitely the highlight of my day. Elisabeth took some things I kind of knew about, and made me think about them in a different light, from a different angle, and in a new context. I was challenged to go back to my standup meetings and make sure they really are about communication, and that my task board really does make status visible. I have a additional way of explaining ATDD to people – in terms of aligning developers and other stakeholders and getting them moving in the same direction.

 

Having said all that, I do have a criticism (are you surprised?!). Elisabeth released her slides under the creative commons license, but she does not release the details of her simulation. This would effectively prevent anyone else from running it. I think this is rather like a tool vendor who presents a new testing approach, which by the way, you can’t use without either buying their tool, or spending a large amount of time and money developing your own version. Those kinds of talks don’t tend to get accepted at this kind of conference.

 

I was disappointed that Elisabeth didn’t release her simulation materials, and I’m not sure why she doesn’t want to. She is obviously a fantastic agile coach and facilitator, and has more invitations to speak than she has time or inclination to accept. It would surely only enhance her reputation to make the agile transition simulation game materials available.

 

Update: I talked to Elisabeth afterwards about her materials, and she related a story about another independent consultant she knows, who arrived at a client site ready to do a simulation exercise that he had designed, only to discover the participants had done the exact same exercise the week before with a different consultant! The other consultant had just taken the material without permission or acknowledgement. Elisabeth doesn’t want to end up in the same situation, and I can understand that. She did say she could release more information about the simulation though, enough that you could understand how it is designed, and perhaps build your own similar one. I think that would be a reasonable compromise. I just felt slightly cheated after her keynote – I wanted to look at this simulation, poke at it, see how it works and understand why she could use it to generate so many great insights into agile transitions.