Posts tagged ‘design’

I’ve been interested for a while in the relationship between TDD and good design for a while, and the  SOLID principles of Object Oriented Design in particular. I’ve got this set of 4 “Racing Car” exercises that I originally got from Luca Minudel, that I’ve done in coding dojos with lots of different groups. If you’ve never done them, I do recommend getting your editor out and having a go, at least at the first one. I think you get a much better understanding of the SOLID principles when you both know the theory, and have experienced them in actual code.

I find it interesting that in the starting code for each of the four Katas there are design flaws that make it awkward to write unit tests for the code. You can directly point to violations of one or more of the SOLID principles. In particular for the Dependency Inversion Principle, it seems to me there is a very direct link with testability. If you have a fixed dependency to a concrete class, that is always going to be harder to isolate for a unit test, and the Tyre Pressure exercise shows this quite clearly.

What bothers me about the 4 original exercises is that there are actually 5 SOLID principles, and none of them really has a problem with the Liskov Substitution Principle. So I have designed a new exercise! It’s called “Leaderboard” and I’ve put it in the same git repository as the other four.

I tried it out last week in a coding dojo with my colleagues at Pagero, and it seemed to work pretty well. The idea is that the Liskov principle violation means you can’t propely test the Leaderboard class with test data that only uses the base class “Driver”, you have to add tests using a “SelfDrivingCar”. (Ok, I confess, I’ve taken some liberties with what’s likely in formula 1 racing!) Liskov says that your client code (ie Leaderboard) shouldn’t need to know if it has been given a base class or a subclass, they should be totally substitutable. So again, I’m finding a link between testability and good design.

Currently the exercise is only available in Scala, Python and Java, so I’m very open to pull requests for translations into other programming languages. Do add a comment here or on github if you try my new Kata.

A little while back I was back in Stockholm facilitating another Code Retreat, (see previous post), this time as part of Global Day of Code Retreat, (GDCR). (Take a look at Corey Haines’ site for loads of information about this global event that comprised meetings in over 90 cities worldwide, with over 2000 developers attending.)

I think the coding community could really do with more people who know how to do TDD and think about good design, so I’m generally pretty encouraged by the success of this event. I do feel a bit disappointed about some aspects of the day though, and this post is my attempt to outline what I think could be improved.

I spent most of the Global Day of Code Retreat walking around looking over people’s shoulders and giving them feedback on the state of their code and tests. I noticed that very few pairs got anywhere close to solving the problem before they deleted the code at the end of each 45 minute session. Corey says that this is very much by design. You know you don’t have time to solve the problem, so you can de-stress: no-one will demand delivery of anything. You can concentrate on just writing good tests and code.

In between coding sessions, I spent quite a lot of effort reminding people about simple design, SOLID principles, and how to do TDD (in terms of states and moves). Unfortunately I found people rarely wrote enough code for any of these ideas to really be applicable, and TDD wan’t always helping people.

I saw a lot of solutions that started with the “Cell” class, that had an x and y coordinate, and a boolean to say whether it was alive or not. Then people tended to add a “void tick(int liveNeighbourCount)” method, and start to implement code that would follow the four rules of Conway’s Game of Life to work out if the cell should flip the state of its boolean or not. At some point they would create a “World” class that would hold all the cells, and “tick()” them. Or some variant of that. Then, (generally when the 45 minutes were running out), people started trying to find a data structure to hold the cells in, and some way to find out which were neighbours with each other. Not many questioned whether they actually needed a Cell class in the first place.

Everyone at the code retreat deleted their code afterwards, of course, but you can see an example of a variant on this kind of design by Ryan Bigg here (a work in progress, he has a screencast too).

Of course, as facilitator I spent a fair amount of time trying to ask the kind of questions that would push people to reevaluate their approach. I had partial success, I guess. Overall though, I came away feeling a bit disillusioned, and wanting to improve my facilitation so that people would learn more about using TDD to actually solve problems.

At the final retrospective of the day, everyone seemed to be very positive about the event, and most people said they learnt more about pair programming, TDD, and the language and tools they were working with. We all had fun. (If you read Swedish, Peter Lind wrote up the retrospective in Valtech’s blog) This is great, but could we tweak the format to encourage even more learning?

I think to solve Conway’s Game of Life adequately, you need to find a good datastructure to represent the cells, and an efficient algorithm to “tick” to the next generation. Just having well named classes and methods, although a good idea, probably won’t be enough.

For me, TDD is a good method for designing code when you already mostly know what you’re doing. I don’t think it’s a good method for discovering algorithms. I’m reminded of the debate a few years back when Peter Norvig posted an excellent Sudoku solver (here) and people thought it was much better than Ron Jeffries’ TDD solution to the same problem (here). Peter was later interviewed about whether this proved that TDD was not useful. Dr Norvig said he didn’t think that it proved much about TDD at all, since

“you can test all you want and if you don’t know how to approach the problem, you’re not going to get a solution”
(from Peter Siebel’s book “Coders At Work”, an extract of which is available in his blog)

I felt that most of the coders at the code retreat were messing around without actually knowing how to solve the problem. They played with syntax and names and styles of tests without ever writing enough code to tell whether their tests were driving a design that would ultimately solve the problem.

Following the GDCR, I held a coding dojo where I experimented with the format a little. I only had time to get the participants do two 45 minute coding sessions on the Game of Life problem. At the start, as a group we discussed the Game of Life problem, much as Corey recommends for a Code Retreat introduction. However, in addition, I explained my favoured approach to a solution – the data structure and algorithm I like to use. I immediately saw that they started their TDD sessions with tests and classes that could lead somewhere. I feel that if they had continued coding for longer, they should have ended up with a decent solution. Hopefully it would also have been possible to refactor it from one datastructure to another, and to switch elements of the algorithm without breaking most of the tests.

I think this is one way to improve the code retreat. Just give people more clues at the outset as to how to tackle the problem. In real life when you sit down to do TDD you’ll often already know how to solve similar problems. This would be a way for the facilitator to give everyone a leg-up if they havn’t worked on any rule-based games involving a 2D infinite grid before.

Rather than just explaining a good solution up front, we could spend the first session doing a “Spike Solution”. This is one of the practices of XP:

“the idea is just to drive through the entire problem in one blow, not to craft the perfect solution first time out.”
From “Extreme Programming Installed” by Jeffries, Anderson, Hendrickson, p41

Basically, you hack around without writing any tests or polishing your design, until you understand the problem well enough to know how you plan to solve it. Then you throw your spike code away, and start over with TDD.

Spending the first 45 minute session spiking would enable people to learn more about the problem space in a shorter amount of time than you generally do with TDD. By dropping the requirement to write tests or good names or avoid code smells, you could hopefully hack together more alternatives, and maybe find out for yourself what would make a good data structure for easily enumerating neighbouring cells.

So the next time I run a code retreat, I think I’ll start with a “spiking” session and encourage people to just optimize for learning about the problem, not beautiful code. Then after that maybe I’ll sketch out my favourite algorithm, in case they didn’t come up with anything by themselves that they want to pursue. Then in the second session we could start with TDD in earnest.

I always say that the code you end up with from doing a Kata is not half as interesting as the route you took to get there. To this end I’ve published a screencast of myself doing the Game of Life Kata in Python. (The code I end up with is also published, here). I’m hoping that people might want to prepare for a Code Retreat by watching it. It could be an approach they try to emulate, improve on or criticise. On the other hand, showing my best solution like this is probably breaking the neutrality of how a code retreat facilitator is supposed to behave. I don’t think Corey has ever published a solution to this kata, and there’s probably a reason for that.

I have to confess, I already broke all the rules of being a facilitator, when I spent the last session of the Global Day of Code Retreat actually coding. There was someone there who really wanted to do Python, and no-one else seemed to want to pair with him. So I succombed. In 45 minutes we very nearly had a working solution, mostly because I had hacked around and practiced how to do it several times before. It felt good. I think more people should experience the satisfaction of actually completing a useful piece of code using TDD at a Code Retreat.

I’ve been working full time in Ruby now for about a month, and I think I’m beginning to get the hang of it. In many ways it’s not so different from Python.

I took some of the code I talked about struggling with in my last post, and set about translating it into Python. I wanted to see if the bugs were more obvious in a language I was more familiar with. Unfortunately before I was finished with the translation, my machine died and refused to restart, complaining about a disk error… so I may well have lost all my work. (ARRRGH!!) Anyway, before that happened, I was getting the feeling that the bugs weren’t very obvious, even in a language I knew better.

What I did find though, was that it was quite hard to make the methods as short in Python as they were in Ruby. I could make them as short, the language has the necessary features to do it, but when I did so, they just stopped looking like good Python to me.

The other thing that I found was that RSpec lets you write really readable and well organized tests. Translating them into unittest made them just a travesty of their former selves.

I’ve just listened to this talk by Gary Bernhardt, who is experienced in both Python and Ruby, and who has clearly thought quite hard about these kinds of issues and knows both languages very well. He has even tried to write a RSpec clone for Python, called Mote. (He says himself that it isn’t as good as RSpec, and that it is because Python lacks blocks, and won’t let him monkeypatch core classes).

Anyway, about half way through the talk, Gary shows this code example, first in Python:

    for obj in (
        for id in ids)
   if obj)

Then the equivalent code in Ruby: do |id|
 repository.retrieve(id) do |obj|

Gary makes the point that the Ruby code is easier to read – you can follow it from top to bottom and see what it does. The thing is, I don’t think many people would write code like that in Python. I might write it more like this:

objs = [repository.retrieve(id) for id in ids]
objs = filter(lambda x: x, objs)
names = [ for obj in objs]

This code is four statements long, rather than one, and has two local variables (“objs” and “names”) which the other two code snippets lack. In real code, you would probably be able to come up with rather more descriptive names, drawn from the problem domain. When I compare this code with the Ruby, I don’t think it is any less readable. The filter(lambda x: x, objs) is not as nice as the Ruby call to “compact”, but on the other hand, I think the two additional local variables make it clearer what is going on.

I’m wondering whether the trouble I was having locating bugs in these small methods was because they were cramming so much into one statement, and almost completely lacking in local variables. That seems to be the Ruby way of doing things – maybe I will just get used to it and learn to read it just as well eventually? I guess I am going to find out!

Anyway, I’m really hoping the friendly support technicians manage to save the contents of my hard disk, I want to use the code in an exercise at the upcoming Gothenburg Python Conference – GothPyCon. I am hoping to run a workshop where half the room gets the buggy code with small methods, and the other half gets the same code refactored into longer methods. You get half an hour to find the bugs, then swap to the other codebase and find them again. Then I was hoping to have a discussion about which codebase was easier to debug, and/or split into pairs and re-implement the code from scratch, and see if we could come up with other designs that solved the problem more elegantly and transparently.

If issues like this interest you – design, refactoring and testing -, I really hope you will come along!

Last week at Scottish Ruby Conference I chatted with Brian Marick about software design. The week before that, he had been in Göteborg for Scandinavian Developer Conference, and had spent a morning pair programming with Geoff on TextTest. I took the chance to ask Brian what he thought about text-based testing now that he had seen it in action.

Brian’s view seems to be that text-based testing may be effective as a testing technique, but it just doesn’t offer the design benefits you get with standard TDD. It doesn’t give you guidance about small-scale design decisions, or intice you to structure your code into really small methods and classes. He thought the TextTest codebase wasn’t bad, but that the methods were larger than he would prefer, some classes were doing too much, and some ideas were not expressed clearly.

I’ve previously read about this design ethic of having really really small classes and methods in Bob Martin’s book “Clean Code”. Bob recommends one or two line methods. In contrast, I believe Steve McConnell’s “Code Complete” advocates methods small enough to fit comfortably on one screen. Geoff’s design for TextTest seems to land at about 5 or 6 lines for a typical method, which is somewhere in between.

Brian said the main drawback of code structured largely into small classes and one and two line methods is that it is harder for people who are unfamiliar with the codebase to get to grips with it. You get lost in the trees, and can’t easily get an overview of the forest. The big benefit is that people who are familiar with the code can potentially make sweeping improvements through very small, localized changes.

I’ve had the opportunity to work on this kind of codebase recently, and my experience hasn’t been entirely positive. Just as Brian predicted, I’ve found it hard to get into the code and grasp what it is doing. All the methods are one or two lines, and call each other. My pair and I ended up creating a temporary file where we pasted a whole call chain of about 6 methods so we could see them all at once, and read them in the order they called each other. There were 3 defects hidden in that 15 or so lines of code, and it took us a day or so to identify and fix them all. Yet, the code had been TDD’d, and the methods were well named. On the surface it looked very good. It actually took me some time to convince myself that we genuinely had found a defect, and that we hadn’t just misunderstood what the code was supposed to do.

The trouble seemed to stem from the fact that almost all the tests were for the “happy path” and there were several edge cases they never considered. Also, in one case the test had stubbed out the answer that one of the lower-level methods would return, and provided an answer the real code never gave. It took a long time for us to add missing tests for edge cases, and localize the defects to particular methods in the long call chain.

I’m very interested in whether we would have found it easier to find the defects if the code had been structured as two or three 5 line methods instead of 6 one and two line methods. I’m also interested if we would have found the issues more easily if the code had been built with text-based testing, with fewer, more coarse grained tests, and a few log statements printing key intermediate values. I’m considering getting the original versions of the files out of git and refactoring them to see how else they could have looked.

I don’t want you to conclude that I am against building designs with really small methods, or TDD, or using stubs or anything like that. I think there is value in all these techniques, each can be done well or badly, and you make tradeoffs when you choose your approach. You still need design skills and testing skills, whether you’re doing TDD or text-based testing. If Brian Marick had built TextTest the design might well have turned out differently. I don’t know how much of that would have been because of his use of TDD, and how much because of his skill, and views on design.

I’m actually relishing the prospect of working on more code with very small methods, and using TDD to build on it. I’ve got loads to learn about software design and testing 🙂