Archive for the ‘Uncategorized’ Category

A write-up on learning away from work

Note: this post originally appeared on Praqma’s blog

Do you work on any hobby coding projects in your free time? Practice code katas? We all wish we could, but making time for learning away from work isn’t possible for everyone. So, who should pay for learning time?

Recently, I took part in a panel discussion at the Software Craftsmanship Conference in London. One of the questions from the audience was about finding time to learn new skills. Bob Martin, sitting to my right in the picture below, talked about the importance of spending your free time on reading, practicing code katas, and generally sharpening your skills. He said that we need to be able to call upon those skills when there’s a deadline and production software needs to be built. He also said that it’s unrealistic to think these skills can be acquired on the job. That practice needs to be done beforehand, at home, on your own time.

Picture from@cyriux

I agree strongly that software developers need to learn throughout their career. As professional knowledge workers it’s crucial for us to keep up to date and continually expand our toolboxes. Skills like Test-Driven Development (TDD) take time and practice before you become effective with them. Our industry moves very fast and new tools and techniques come along frequently.

I also agree that you need to practice in safe situations before applying a technique in production code. When I see badly designed code it’s often due to developers not fully understanding the libraries and frameworks they use. This causes them to skip testing or do it badly. What’s more, they often also lack techniques for refactoring and can’t improve the design once they’ve learnt more.

The problem is only getting bigger

Software is taking an ever larger role in our society. Bob Martin pointed out in his keynote speech that there is very little you can do in this life without interacting with computers and the software they run. You can’t buy anything, watch tv, do your washing, or probably arrange to meet your friends and go out for a drink without some form of interaction with information technology. Software is everywhere and skilled software developers are needed now in unprecedented numbers.

Not everyone has free time for this

I agree with Bob Martin about the problem: developers need to improve their skills. However, I just don’t think it’s realistic to expect people to spend their free time on this. At least, not in any sustained way and over a whole career. I actually think it’s pretty unhelpful to suggest this should be the main way skill acquisition happens. Several people at the conference agreed with me – not people in the panel, I should say – but regular conference attendees. At least half a dozen people approached me the following day and thanked me for saying this.

At times in my career I have had the luxury of enough free time to spend some of it on learning, reading, practicing and improving my software development skills. But at other times I’ve had family commitments, caring responsibilities, and other priorities that have completely prevented this. Expecting people to learn skills like TDD solely in their free time will cut off a large section of software developers from progressing and becoming more productive. We need those developers too!

Let’s look at better solutions.

One consequence of the current shortage of skilled software developers is our ability to vote with our feet. If your employer does not provide paid on-the-job time for learning, you can probably find another one that will. I am very happy to work for Praqma where our policy is that everyone should spend a good proportion of their time studying, going to conferences, reading, giving internal seminars and generally keeping themselves updated on what’s going on.

Giving developers time to learn makes good business sense in a competitive world where attracting and retaining skilled people is an advantage in the marketplace. You have the power to choose your employer. I don’t think Praqma is the only company that has realized providing time for learning is attractive to smart people and actually a win/win proposition. There may be a similar company close to you too. Have a look around!

What if I can’t change jobs?

Just as having free time for learning can be an unattainable luxury, there may be times in a career where changing jobs is not an option either. In this case you may need to talk persuasively with your manager and affect change from within. I really believe that a developer who improves their skills will work more effectively and be a bigger asset to their team than one who works only on production code and never does any practice. I think you should be able to persuade your boss that a small investment in training time now will reap dividends later.

I’ve also seen people spending a lot of time at work on code that never makes it to production. Have you seen that? Can you bring your boss examples of time you’ve wasted on proof-of-concept projects that never got to production, or long-lived feature branches that were impossible to merge and had to be thrown away? Perhaps you could spend some of that time practicing instead, and avoid some of those costly mistakes.

We need industry-wide skill acquisition

I think Bob Martin is right about the problem but wrong about the solution. When asked how an individual can improve their skills the answer can’t rely on them using their free time. It’s elitist, only those with that luxury of having free time will be able to do it. We need training and skills acquisition on a broad front that will work for the majority of developers. I think employers must put up some time for this as part of the working week. As individual developers we can argue for this, and vote with our feet if necessary. The learning of new skills shouldn’t be a luxury that’s only available to those with free time on their hands. Software, and its role in our world, is too important to rely on such a tenuous and poorly distributed resource.

I wrote a blog post about these tools on my company blog. In swedish.

Bob Martin has just written a post in his blog where he tells the story of a test manager who has 80 000 manual tests, and wishes they were automated instead. Bob writes:

“One common strategy to get your tests automated is to outsource the problem. You hire some team of test writers to transform your manual tests into automated tests using some automation tool. These folks execute the manual test plan while setting up the automation tool to record their actions. Then the tool can simply play the actions back for each new release of the system; and make sure the screens don’t change.”

Bob then goes on to explain why this is such a terrible idea – and blames it all on coupling. That the tests and the GUI are coupled to the extent that when you change the GUI, loads of tests break. Wheras humans can handle a fair amount of GUI changes and still correctly determine whether a manual test should pass or fail, machines fall over all too easily and just fail as soon as something unexpected happens. So you end up re-recording them, which can cost as much as just doing the tests manually in the first place.

These problems are of course bigger or smaller depending on the GUI automation tool you choose. Anything that records pixel positions will fall over when you simply change the screen resolution, let alone when you add new buttons and features in your GUI. More modern tools record the names or ids of the widgets, so they don’t break if the widget simply moves to another part of the screen. In other words, you reduce your coupling.

Geoff has been working on PyUseCase which takes this to another level. Instead of coupling the tests to widget names, you couple them to “domain actions”. This makes your tests even more robust in the face of gui changes. A drop down list can turn into a set of radio buttons and your tests won’t mind, since they just say something like “select airport SFO”. This doesn’t isolate you from the big changes, like moving the order of the screens in a wizard around, but since the tests are written in plain text, in a language any domain expert can read, they are relatively cheap to update.

There is another respect in which machines under-perform compared to manual testers. An intelligent human will usually do a certain amount of exploration beyond the scripted test steps they have infront of them. They try to understand the purpose of the test, click around a bit and ask questions when parts of the system peripheral to the test in hand start to look odd. Machines don’t do any exploration, and in fact often don’t even notice errors on parts of the screen they havn’t been told to look at.

Geoff’s PyUseCase can partly address this kind of a problem. Used together with TextTest, it will continually scan the log the System Under Test produces, and fail the test for example if any stack traces appear. PyUseCase also automatically produces a low fidelity ascii-art-esque log of how the current screen looks, and can compare it against what it looked like last time the test ran. Changes are flagged as test failures, which will bring to your attention the change in an unrelated corner of the screen which says “32nd December” instead of “1st January”.

I know that sounds like we just introduced a huge amount of coupling between the tests and the way the GUI looks, and yes, we have. The difference is that this coupling is very easy to manage. If 1000 tests all fail saying “expected: 1st January, found: January 1st”, TextTest handily groups all the test failures and lets you accept or reject the change en-masse. So it is very little work to update a lot of tests when the GUI just looks different, but you don’t care.

There is still a problem though, that the machine will not explore outside of the scripted steps you tell it to perform. So you will have to do some manual exploratory testing too, not everything can be automated.

So a simplistic lets-just-automate-our-manual-tests is a bad idea because machines can’t handle GUI changes as well as humans can, and because machines don’t look around and explore. Potentially your automated tests will cost more than your manual tests, and find fewer bugs.

So should we stick with our manual test suite then? No, of course not. The value of automated tests is not simply that you can run them more cheaply than manual tests, it is that you can run them more often – at every build, constantly supplying developers with valuable feedback rather than just at the end of the release cycle. It is this kind of feedback that enables refactoring, and lets developers build quality code from the start. That is their real gain over manual tests.

Bob Martin’s suggestion is that you shouldn’t rely on expensive GUI tests for this kind of feedback – only perhaps 15% of your tests should be GUI reliant. The rest run against some kind of api, which is less volatile and hence cheaper to maintain. With the kinds of tools Bob I suspect has been using for GUI testing I’m not surprised he says this. I just think that with tools like PyUseCase and TextTest the costs are much reduced, and call for reconsideration of this ratio. Looking at Geoff’s self tests for TextTest (a GUI intensive tool), around half are testing through the GUI, using pyUseCase. Basically I don’t think GUI tests have to be as bad and expensive as Bob makes out.

Geoff has just put up a couple of new pages on the texttest website, with some coverage statistics for his self tests. He uses coverage.py to produce this report which shows all the python modules in texttest, and marks covered statements in green. I think it’s pretty impressive – he’s claiming over 98% statement coverage for the over 17 000 lines of python code in texttest.

I had a poke around looking for some numbers to compare this to, and found on this page someone claiming Fitnesse has 94% statement coverage from its unit tests, and the Java Spring framework has 75% coverage. It’s probably unwise to compare figures for different programming languages directly, but it gives you an idea.

Geoff also publishes the results of his nightly run of self tests here. It looks a bit complicated, but Geoff explained it to me. 🙂 He’s got nearly 2000 tests testing texttest on unix, and about 900 testing it on windows. As you can see, the tests don’t always pass, some are annoying ones that fail sporadically, some are due to actual bugs, which then get fixed. So even though he rarely has a totally green build, the project looks healthy overall, with new tests and fixes being added all the time.

Out of those 3000 odd tests that get run every night, Geoff has a core of about 1000 that he will run before every significant check-in. Since they run in parallel on a grid, they usually take about 2 minutes to execute. (When he has to run them at home in series on our fairly low spec linux laptop they take about half an hour.)

Note that we aren’t talking about unit tests here, these are high level acceptance tests, running the whole texttest system. About half of them use PyUseCase to simulate user actions in the texttest GUI, the rest interact with the command line interface. Many of the tests use automatically generated test doubles to simulate interaction with 3rd party systems like version control, grid engines, diff programs etc.

Pretty impressive, don’t you think? Well I’m impressed. But then I am married to him so I’m not entirely unbiased 🙂

I’ve been doing some work lately creating automated functional test suites using Selenium RC to simulate user interaction with a web GUI. I discovered quickly that the tests you record directly from selenium are rather brittle, and hard to read. In order to make the tests more robust and readable, I have been extracting reusable chunks of script that make sense from the user perspective, into separate methods. For example when testing a page for registering a new provider, you might have a ProviderPage domain class, with method “createNewProvider”. This method encapsulates all the selenium calls that interact with the page, and lets your test be written in terms of the domain.

I just saw this article from Patrick Wilson Welsh basically saying the same thing, only his DSL has three layers of indirection instead of just two. As well as encapsulating page operations in a Page class, he encapsulates operations on widgets within a page. I hadn’t thought of doing that. It makes the code in the Page class more readable. I might try that, and see if it improves my code.