I blogged a while back about “Text-Based testing”, which is a variant of Test-Driven Development that I’ve used quite a bit. My husband, Geoff Bache, is developing several tools to support this style of development.

Recently, we met Llewellyn Falco and discovered the work he’s been doing with Approval Tests. We were all really excited to realize we’ve independently been working on something very similar. Llewellyn’s Approval Tests library is in some ways a unit-test version of Geoff’s tool TextTest, which is probably more suited to integration or functional tests. What we’ve been calling “Text-Based Testing” I think is better described as “Text-Based Approval Testing”. I think it’s a particularly powerful technique for characterization tests of legacy code, and regression testing in general. Geoff’s latest tools also make it a viable approach for GUI testing, traditionally an area where people have difficulty doing pure TDD. (We’ll be talking about this at Eurostar in November.)

I’ve written a fuller description of Approval Testing in a chapter of my work-in-progress book “Mocks, Fakes and Stubs”, but I’ll summarize here. In classic Test-Driven Development, you begin by defining a test case comprising three parts – “Arrange”, “Act”, “Assert”. The assertion generally takes the form assertEqual(expected, actual), and you calculate the expected value when you define the test case. Then you go away and implement the functionality, until the “actual” value matches “expected”, and the test passes.
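
As a small illustration of the classic style, here is a sketch in Python using the standard `unittest` module. The function `format_name` and its behaviour are invented for this example; the point is only that the expected value is written down in the test before the code exists.

```python
import unittest


def format_name(first: str, last: str) -> str:
    """Hypothetical function under test: 'Last, First' formatting."""
    return f"{last}, {first}"


class FormatNameTest(unittest.TestCase):
    def test_last_name_comes_first(self):
        # Arrange: set up the inputs and work out the expected value up front
        first, last = "Ada", "Lovelace"
        expected = "Lovelace, Ada"

        # Act: call the code under test
        actual = format_name(first, last)

        # Assert: the test passes once actual matches expected
        self.assertEqual(expected, actual)
```

Running `python -m unittest` on a file containing this code would execute the test.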

With Approval Testing, you design the test case to the point of having “Arrange” and “Act”, but defer defining the “expected” value for the “Assert”. You take the approach of “I’ll know it when I see it”, and get on with implementing the code. When the actual value the code produces looks right, you “Approve” it – store the actual value in the test case. So the assert statement becomes “assertEqual(approved, actual)”.
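
To make the approve step concrete, here is a minimal sketch in Python. It is not how TextTest or the Approval Tests library work internally. The helper names `verify` and `approve`, the `.approved.txt` / `.received.txt` file naming and the `greet` function are all assumptions made up for this illustration.

```python
import tempfile
from pathlib import Path


def verify(actual: str, directory: Path, name: str) -> bool:
    """Return True if actual matches the stored approved text."""
    approved = directory / f"{name}.approved.txt"
    received = directory / f"{name}.received.txt"
    if approved.exists() and approved.read_text() == actual:
        return True
    # No match (or nothing approved yet): keep the output for a human to inspect
    received.write_text(actual)
    return False


def approve(directory: Path, name: str) -> None:
    """Promote the received output to approved, once it looks right."""
    received = directory / f"{name}.received.txt"
    received.replace(directory / f"{name}.approved.txt")


def greet(who: str) -> str:
    """Hypothetical code under test."""
    return f"Hello, {who}!\n"


def demo() -> list[bool]:
    directory = Path(tempfile.mkdtemp())
    results = [verify(greet("World"), directory, "greet")]      # nothing approved yet
    approve(directory, "greet")                                  # the output looked right
    results.append(verify(greet("World"), directory, "greet"))  # now matches approved
    results.append(verify(greet("Mars"), directory, "greet"))   # behaviour has changed
    return results
```

In this sketch the first run always fails, because nothing has been approved yet. A person looks at the received file, decides it is correct, and approves it. From then on the test is an ordinary regression test.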

Text-Based Approval Testing
The value you approve on could be anything you can automatically diff the actual program output against – a file, a string, a screenshot, some json, contents of a database table… you name it. The thing is, plain text is wonderfully simple to diff, version control, merge, store, manipulate… and there’s a wealth of existing, well understood tools to do that. I guess that’s why Geoff’s tools only support plain text so far. His approach has always been that if your program produces output in a different format, you write a test fixture to convert it to plain text before you diff. Llewellyn’s tools have branched out more into diffing images and suchlike.
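
A test fixture of that kind can be quite small. The sketch below assumes the program's output is a list of records, renders it as sorted `key=value` lines, and diffs it with Python's standard `difflib` module. The function names and the record format are invented for this example and are not taken from Geoff's or Llewellyn's tools.

```python
import difflib


def to_plain_text(rows: list[dict]) -> str:
    """Render each record as one line of sorted 'key=value' pairs."""
    lines = []
    for row in rows:
        fields = [f"{key}={row[key]}" for key in sorted(row)]
        lines.append(", ".join(fields))
    return "\n".join(lines) + "\n"


def text_diff(approved: str, actual: str) -> list[str]:
    """Return unified diff lines; an empty list means the texts match."""
    return list(difflib.unified_diff(
        approved.splitlines(), actual.splitlines(),
        fromfile="approved", tofile="actual", lineterm=""))


approved = to_plain_text([{"name": "Brie", "quality": 1}])
actual = to_plain_text([{"quality": 2, "name": "Brie"}])
for line in text_diff(approved, actual):
    print(line)
```

Sorting the keys keeps the text stable when the program happens to produce fields in a different order, so the diff only shows changes that matter.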

I think “Approval Testing” is a good name for the style of testing both Geoff and Llewellyn’s tools support. I like the implication that you explicitly approve the output from your program as correct, and use that as the basis for your test.

Other Approval Testers

Geoff and Llewellyn aren’t the only people using an Approval testing approach, either. Recently I led a workshop where we compared writing tests for the Gilded Rose Kata using both Cucumber and Approval Tests. Nat Pryce was there, and he later blogged about it. He speculates that Approval testing might solve some problems he’s seen with other kinds of testing. Nat has subsequently started developing a new approval testing tool, Pearlfish, so he can test his ideas.

There is also this recent screencast by Brett Slatkin from Google, who explains how he’s using an approval testing technique with image diffs to regression test his webapp. He says he finds this technique essential in a continuous delivery environment – these tests find bugs his other tests (both manual and automatic) miss entirely.

I have also found Approval testing to be a really useful technique, and I hope that simply having a good name for it will help people understand what it is. Perhaps you’ll realize it’s an approach you’ve already used, just without having a name for it. Or maybe you’ll be inspired to try out one of the tools I’ve mentioned.

Last week I was in Oxford at “Iverson College”, which is a conference on the topic of Array Language Programming. There were about 25 programmers there, most of whom are expert in one or more of APL, J, K, or Q. It’s not my usual comfort zone, put it that way! I’m fairly competent with a number of programming languages, notably Python and Java, but nothing I know is really much like these array languages. It’s been a huge culture shock, but in a good way, I think.

My main discoveries are that Array Programming is different again from Object Oriented Programming and Functional Programming, (although it has a lot in common with functional programming), and that this community contains some exceptional programmers. The total number of array language programmers is however extremely small and their work seems to be pretty much unknown to the wider programming community.

Array Programming Languages
I mentioned before four languages, APL, J, K and Q. They are similar to each other, kind of like Ruby and Python are similar to each other. I’ve gone through an introductory training in each language this week, largely given by the language designers themselves. I’d like to relate a little of what I’ve discovered about them.

APL
This is the oldest of the array languages, invented by Ken Iverson in the 1960s. It’s notorious for using an alphabet of funny-looking symbols to represent the built-in functions. You can try it out at http://tryapl.org – an interactive REPL (Read-Evaluate-Print-Loop) where you can put in snippets of code and see what the symbols do.

I thought at first that APL looked really intimidating and unnecessarily weird. Now having got to know it a little, I can see the benefits to the little symbols. They make the code really concise and unambiguous, and it doesn’t take long to learn their names. Once you can pronounce each symbol in your head as you read the code, it’s not much different from writing out the names in full in the editor.

The variant of APL that most of the conference attendees use is produced by the company Dyalog. I first met the CTO, Morten Kromberg, at an XP conference in 2006. He’s shown me some APL before, but this time I really got a chance to sit down with him and look at how he writes code. Dyalog APL has a powerful IDE including a REPL, where Morten showed me how he plays around with data and code, in order to come up with some useful APL expressions. When he’s got something working, he transfers code from the REPL into a file, to make it re-usable and shareable. It’s a familiar way of working to me: many Python programmers code this way, flipping between the REPL and a script file. It was a real pleasure to code with Morten – he is an extremely skilled programmer. Dyalog APL looks nice too: along with its fully-fledged IDE, it interfaces with .Net, Excel spreadsheets, ASP.net and more. Basically, it would fit nicely into the technology stack of many IT departments.

J
This was Ken Iverson’s next language, created together with Roger Hui, who now continues development of it. J is similar to APL in many ways, but is open source, and uses only ASCII characters. They’ve made an effort to make it open and less intimidating to newcomers, and probably for that reason, it’s the one I chose to download and try to learn before the conference.

I met Roger at breakfast on the first day of the conference, knowing nothing about who he was; he just said he was a programmer. I confessed that I’d downloaded J and made some joke about hoping I’d get on better with it than Ron Jeffries. (Ron wrote articles on his blog about his efforts to learn J, and later gave up, finding it too hard!) Roger genuinely didn’t know who Ron Jeffries is, although he did know of the agile manifesto. He was very kind and concerned to help me to understand J though (and Ron, if he wants!).

Despite my head start with J, by the end of the conference I found APL code easier to grasp – J seems more extreme to me. Roger calls J “executable mathematical notation”, and I’ve always been a bit more of an engineer than a mathematician.

K
K was invented by Arthur Whitney, who was also at the conference. I didn’t really get a feel for how the language works, more than that it’s extremely terse.

Arthur gave a talk at the conference, about his new project, KOS. He and two other guys are writing an operating system pretty much from scratch, using K, C, and bits of the linux kernel, (although they’d like to remove those). He showed us how you write applications for this new OS in K, by demonstrating building a text editor. He began from the alpha version of the OS with just a window manager, and a plain new window canvas that didn’t respond to any keyboard or mouse events.

Arthur added a line of K code to let you enter text into the window – a listener to key presses. Then a line of code to move the caret around with the arrow keys. Then a line of code for changing the font size. Then scrolling. Code to handle Ctrl-C and Ctrl-V to copy and paste text…. in the space of less than half an hour, he had an equivalent to notepad working. No compilation, no reloading. And the code was…. phenomenal. You can see a version of it here. I can’t read it really, it looks mostly like line noise to me. All his variable and function names are one or two characters, and K just seems pathologically terse.

I raised my hand and asked Arthur if he thought his code would be more readable if he used longer variable names? He thought for a moment, looking surprised and a little bewildered by the question. Then shook his head and said slowly “No. no. I don’t think I need that. I want to see all my code on the screen at once”. Needless to say, that was a big culture shock moment for me!

The size of the codebase is something all the array language programmers seem really concerned about, even if Arthur’s code is considered extreme even in that community. One thing I did later that week was take a piece of code (Args.java) that appears in Robert C. Martin’s book “Clean Code” as an example of clean Java code, and show it to the group. There were general exclamations of “aargghh! that hurts my eyes!” but after a little while, as I explained the structure of the code, they seemed to appreciate it a little better. What they did say that intrigued me, though, was that they automatically scanned the page looking for the symbols – the >, !, = signs – the parts that do something, as they put it. The other text, they said, obscures the structure; it distracts the eye. Yes, that’s right. Having names for the functions and variables makes the code less readable.

KDB+ and Q
KDB+ is a very small and fast commercial database largely used by financial institutions, also originally created by Arthur Whitney. Q is a kind of domain specific language built on top of K, that you use to query the data in a KDB+ instance.

I sat down with Attila Vrabecz, an experienced Q and KDB+ programmer, and we coded together for a couple of hours. We tackled a problem I’d previously coded with Morten in APL, to help me see what was different. There were many similarities – the workflow was the same for example – experimenting in the REPL before transferring the code into a more permanent, reusable form. I noticed Q has many more English words in it, fewer strange symbols, and Attila made more use of library functions than Morten did. It seems Q is designed to be approachable for a former SQL programmer, although once you scratch the surface, it’s much more like APL than SQL.

Test-Driven Development
I gave a talk at the conference about TDD. My aim was to provoke discussion, so I argued that writing automated tests using TDD is the best approach. I was definitely successful at sparking a discussion! Actually, the idea that programmers should write automated tests for their code didn’t seem all that controversial, especially amongst the more seasoned developers present. We got way more hung up on how large a chunk of code counts as a “unit”, for your unit test, and what clean code looks like in an array language. To my eye, their units are large and their clean code is terse.

A challenge for the future
Dave Thomas, former lead developer for the Eclipse project, and general software visionary, is also an APL and K programmer. He flew in for just one day of the conference, and his talk functioned as the keynote address for the week – it was a clear challenge to the Array Languages community.

Dave painted a vision for the future where people will be living in a sea of big data they don’t understand, and lack adequate tools to query. He sees a great opportunity for array languages, which are generally very good at handling large amounts of data.

He ended his talk with an ambitious challenge to this community to get its act together, start being seen as a credible alternative, and grow. I could only applaud and agree – I found his advice insightful, and I hope the array languages community will do as he suggests.

I spent one very pleasant evening chatting with a woman who is about to embark on her PhD in atmospheric science. She’s hoping to use array languages to help her create software models that will execute quickly on huge arrays of multi-dimensional climate data. Her work sounds fascinating, and I hope it’s a sign of array languages starting to be used beyond their traditional niche in finance.

So I’m leaving the conference carrying a huge tome entitled “A complete introduction to Dyalog APL”, some pieces of code I’ve written, and good intentions to study further. I do find it fascinating that even with the little I know of it, APL allows me to think about and solve a problem differently than I do in Python. I anticipate I’ll find plenty of people in the Array Languages community willing to help me if I do continue to try to learn it. They’re an opinionated, quirky, mature, gentle, yet small bunch of extremely skilled programmers, and I’m glad to have met and coded with them.

A while back, Gojko Adzic published this article “Redefining Software Quality” and I think it’s pretty insightful, pointing out that we often expend a lot of effort ensuring quality at lower levels of the pyramid, when we should perhaps be investing higher up.

I wanted to work out what testing activities you’d do to ensure quality at different levels of Gojko’s Quality Hierarchy, so I began by thinking about the software testing quadrants. The testing quadrants were originally documented by Brian Marick in his blog, and later developed by Lisa Crispin and Janet Gregory in their book “Agile Testing”. (Here is a slide deck by Janet Gregory that gives a summary of the quadrants.)

[Figure: agile testing quadrants]

I like the agile testing quadrants because they help you to reason about different testing activities and why you’re doing them. They make it clear that in agile, testing has this big role in supporting the team – spreading knowledge about what’s being built and enabling the team to be agile about feature changes. In more traditional projects, testing focuses almost exclusively on the right side of the quadrants, missing this role entirely.

Anyway, if you put the agile testing quadrants alongside Gojko’s quality hierarchy, I think you get something like this:

[Figure: quality hierarchy and testing quadrants]

Clearly, Q1 and Q2 are all about ensuring functionality basically works, and usually Q2 tests will run against deployed software, (deployed in a test environment). Q4 tests cover things like performance and security, and usability falls under Q3. I think things get a little more tricky for the higher levels though. There’s a distinct danger we’ve just run out of quadrants!

Beyond the testing quadrants

To test whether a piece of software is useful or successful, you’ll need to look at ideas that are relatively new in Agile. In his article, Gojko of course points out his book on Impact Mapping, and mentions Feature Injection and Lean Startup. Lisa and Janet published their book in 2009, “Lean Startup” by Eric Ries came out in 2011, Gojko’s “Impact Mapping” book is from 2012, and correct me if I’m wrong, but I don’t think there is a book on Feature Injection yet.

Agile ideas are moving on, especially compared with where they were when methodologies like Scrum and XP were originally documented a decade or more ago. I think it’s clear testers need to embrace new ideas and practices too.

If you look at Lean Startup, one of the concrete ideas is to do A/B testing of new features – that is, you divide your users into an “A” group, who see the new feature and a “B” group who don’t. Then you measure how each group behaves, and from that draw conclusions about whether the feature was any good. If the “A” group buys more stuff, spends more time on the site, generally acts like they are happier than the “B” group, you keep the feature, otherwise it gets dropped. It’s a bit like a trial of a new medicine – you compare the patients who get it with a control group who don’t, before you decide whether to approve it.
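
The mechanics can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the hash-based split, the function names, and the use of purchase rate as the only measure. A real A/B test would also check that the difference between the groups is statistically significant before drawing any conclusion, which this sketch leaves out.

```python
import hashlib


def assign_group(user_id: str) -> str:
    """Put each user in group A or B; the same user always gets the same group."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return "A" if digest[0] % 2 == 0 else "B"


def conversion_rate(purchases: int, visitors: int) -> float:
    """Fraction of visitors who bought something (0.0 if nobody visited)."""
    return purchases / visitors if visitors else 0.0


def keep_feature(a_purchases: int, a_visitors: int,
                 b_purchases: int, b_visitors: int) -> bool:
    """Naive decision rule: keep the feature if group A converts better than B."""
    return (conversion_rate(a_purchases, a_visitors)
            > conversion_rate(b_purchases, b_visitors))


# Group A saw the new feature, group B did not (numbers invented for the example)
print(keep_feature(a_purchases=30, a_visitors=1000,
                   b_purchases=20, b_visitors=1000))
```

Assigning groups from a hash of the user id, rather than at random on each visit, means a returning user keeps seeing the same version of the site.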

It seems to me that testers should be involved in designing and performing A/B tests – they’re well positioned with critical thinking skills, knowledge of the user, and technical automation skills. The results of such tests should tell us about whether users find a new feature useful. So that should get us up another level on Gojko’s pyramid:

[Figure: quality hierarchy and testing quadrants, with A/B testing]

The last level is fairly dependent on how you define success, but for a lot of software products, success means lots of users. Lean startup has another interesting idea here – the “Net Promoter Score”. Basically you ask a small group of initial users if they’d recommend your product, and make a simple calculation to predict if your user base is going to grow when you release it more widely. It’s an idea for what to put at the top level:
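
For reference, the calculation as it is commonly defined goes like this: users rate how likely they are to recommend the product on a scale from 0 to 10, those answering 9 or 10 count as promoters, those answering 0 to 6 count as detractors, and the score is the percentage of promoters minus the percentage of detractors. I'm assuming that standard definition here; the function name and the sample ratings are made up.

```python
def net_promoter_score(ratings: list[int]) -> float:
    """Percentage of promoters (9-10) minus percentage of detractors (0-6)."""
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for rating in ratings if rating >= 9)
    detractors = sum(1 for rating in ratings if rating <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


# Three promoters, one passive (8) and one detractor (2) out of five users
print(net_promoter_score([10, 9, 9, 8, 2]))  # prints 40.0
```

The score ranges from -100 (everyone is a detractor) to 100 (everyone is a promoter).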

[Figure: quality hierarchy and testing quadrants, with Net Promoter Score]

Conclusions

Of course, for many teams, it’s enough of a struggle to test for quality at the lower levels of the pyramid, without worrying about A/B tests or Net Promoter score! James Shore and Diana Larsen have come up with a model of “Agile Fluency” which I think is relevant here. Basically it outlines how, as teams get better at agile, their practices change. The three star level of fluency seems to contain a lot of ideas from Lean Startup, and optimizing for quality at the top two levels of Gojko’s pyramid. At one and two star fluency, delivering business value on the market cadence, just the testing quadrants get you a long way.

Changing testing activities also implies changes for testers. The role of tester has already changed with the advent of agile methods, and I predict it’s going to continue changing. I see a technical tester role appearing that is pretty close to business analyst, doing things like supporting the team with test automation and data analysis of A/B tests. So testers: get a head start and find out about Lean Startup!

I’m very pleased to announce I’ve just published my first Pluralsight course – “Coding Dojo: Test Driven Development”! It’s based on the material in my book, converted to a video-friendly format along with audio commentary. If you purchase a subscription to the Pluralsight course library, you’ll get access to this video course, and hundreds of other courses aimed at software developers.

The video is another iteration of my ideas and material: since writing the book, I’ve developed some themes further, particularly around deliberate and incidental practice. I also had to focus more, and pick out the really important parts to talk about in the video. If you enjoy the video course, you might find the book contains useful extra material – especially the code kata catalogue.

I really think there is a big need in our industry for professional software developers to learn Test Driven Development and associated skills, and I can’t see it happening via the traditional method of instructor-led two day training courses. TDD is a practical coding skill that you actually have to do in order to get competent at it. It’s a lot more difficult in your average codebase than it needs to be, so a lot of people get discouraged and quickly go back to the way they wrote code before.

The Coding Dojo is a way to start a long term change in yourself and others in your team, and my hope is that the book and the video will provide you with the inspiration and means to get started.

This week I published my first book! I’ve been writing “The Coding Dojo Handbook” since last September, and publishing it as a work-in-progress on Leanpub.com. This week I decided it was time to declare it completed, since I think it hangs together as a whole book, and is useful in the role I imagined for it. In other words, I think this book has everything it needs to be a good starting point for someone setting up a new coding dojo, or for someone experienced in running one already, looking for ideas for new katas and collaborative games. I hope you’ll consider getting a copy if you’re in either of those situations!

Now the book is finished, I have to decide whether to look for a “real” publisher, or whether to just continue to sell it on leanpub. My current feeling is that my target audience, (programmers), are quite comfortable buying an ebook, and having a paper copy isn’t really a priority. The advantage of a publisher might be more sales channels, bookshops etc, and more copies sold overall. I’d also get a considerably lower proportion of the sale price. I’ve noticed that several authors I respect – people like Brian Marick and Roy Osherove – are publishing their newer titles exclusively on leanpub.com. So my current plan is to stick with leanpub and see how things develop.

I had originally planned a few more chapters, about London School TDD, and Approval Testing. When I started writing these chapters, I found I had far more to say than I had anticipated, and it didn’t seem to me that the material really fitted into this book. So what I’ve done is start a new book project, called “Mocks, Fakes and Stubs”.

Right now it’s fairly small, more a pamphlet than a book, and I’m not charging a lot of money for it. If, as I hope, there is interest, I plan to add more material over the next few months. The focus is on showing TDD techniques using some of the code katas from “The Coding Dojo Handbook”. I’m hoping the new book will have the feeling of pair programming with an experienced coder, explaining the theory of a technique at the same time as demonstrating it.

I’ve got a couple of workshops coming up at XP2013, where I’ll be doing research for my new book. Basically I’ll be using code katas to explore TDD techniques like Outside-In, Approval Testing, and Given-When-Then style BDD tests.

So my first book is finished, and I have a new book project to occupy my time!