Last week I was in Oxford at “Iverson College”, which is a conference on the topic of Array Language Programming. There were about 25 programmers there, most of whom are expert in one or more of APL, J, K, or Q. It’s not my usual comfort zone, put it that way! I’m fairly competent with a number of programming languages, notably Python and Java, but nothing I know is really much like these array languages. It’s been a huge culture shock, but in a good way, I think.
My main discoveries are that Array Programming is different again from Object Oriented Programming and Functional Programming, (although it has a lot in common with functional programming), and that this community contains some exceptional programmers. The total number of array language programmers is however extremely small and their work seems to be pretty much unknown to the wider programming community.
Array Programming Languages
I mentioned before four languages, APL, J, K and Q. They are similar to each other, kind of like Ruby and Python are similar to each other. I’ve gone through an introductory training in each language this week, largely given by the language designers themselves. I’d like to relate a little of what I’ve discovered about them.
This is the oldest of the array languages, invented by Ken Iverson in the 1960s. It’s notorious for using an alphabet of funny-looking symbols to represent the built-in functions. You can try it out at http://tryapl.org – an interactive REPL (Read-Evaluate-Print-Loop) where you can put in snippets of code and see what the symbols do.
I thought at first that APL looked really intimidating and unnecessarily weird. Now having got to know it a little, I can see the benefits to the little symbols. They make the code really concise and unambiguous, and it doesn’t take long to learn their names. Once you can pronounce each symbol in your head as you read the code, it’s not much different from writing out the names in full in the editor.
The variant of APL that most of the conference attendees use is produced by the company Dyalog. I first met the CTO, Morten Kromberg, at an XP conference in 2006. He’s shown me some APL before, but this time I really got a chance to sit down with him and look at how he writes code. Dyalog APL has a powerful IDE including a REPL, where Morten showed me how he plays around with data and code, in order to come up with some useful APL expressions. When he’d got something working, he transfers code from the REPL into a file, to make it re-usable and shareable. It’s a familiar way of working to me, many Python programmers code this way, flipping between the REPL and a script file. It was a real pleasure to code with Morten – he is an extremely skilled programmer. Dyalog APL looks nice too, it has a fully-fledged IDE, and interfaces with .Net, Excel spreadsheets, ASP.net and more. It would fit nicely into the technology stack of many IT departments basically.
This was Ken Iverson’s next language, created together with Roger Hui, who now continues development of it. J is similar to APL in many ways, but is open source, and uses only ASCII characters. They’ve made an effort to make it open and less intimidating to newcomers, and probably for that reason, it’s the one I chose to download and try to learn before the conference.
I met Roger at breakfast on the first day of the conference, knowing nothing about who he was, he just said he was a programmer. I confessed that I’d downloaded J and made some joke about hoping I’d get on better with it than Ron Jeffries. (Ron wrote articles in his blog, about his efforts to learn J, and later gave up, finding it too hard!). Roger genuinely didn’t know who Ron Jeffries is, although he did know of the agile manifesto. He was very kind and concerned to help me to understand J though, (and Ron, if he wants!)
Despite my head start with J, by the end of the conference I found APL code easier to grasp – J seems more extreme to me. Roger calls J “executable mathematical notation”, and I’ve always been a bit more of an engineer than a mathematician.
K was invented by Arthur Whitney, who was also at the conference. I didn’t really get a feel for how the language works, more than that it’s extremely terse.
Arthur gave a talk at the conference, about his new project, KOS. He and two other guys are writing an operating system pretty much from scratch, using K, C, and bits of the linux kernel, (although they’d like to remove those). He showed us how you write applications for this new OS in K, by demonstrating building a text editor. He began from the alpha version of the OS with just a window manager, and a plain new window canvas that didn’t respond to any keyboard or mouse events.
Arthur added a line of K code to let you enter text into the window – a listener to key presses. Then a line of code to move the caret around with the arrow keys. Then a line of code for changing the font size. Then scrolling. Code to handle Ctrl-C and Ctrl-V to copy and paste text…. in the space of less than half an hour, he had an equivalent to notepad working. No compilation, no reloading. And the code was…. phenomenal. You can see a version of it here. I can’t read it really, it looks mostly like line noise to me. All his variable and function names are one or two characters, and K just seems pathologically terse.
I raised my hand and asked Arthur if he thought his code would be more readable if he used longer variable names? He thought for a moment, looking surprised and a little bewildered by the question. Then shook his head and said slowly “No. no. I don’t think I need that. I want to see all my code on the screen at once”. Needless to say, that was a big culture shock moment for me!
The size of the codebase is something all the array language programmers seem really concerned about, even if Arthur’s code is considered extreme even in that community. One thing I did later that week was take a piece of code that is in Robert C. Martin’s book “Clean Code” (Args.java), as an example of clean Java code, and showed it to the group. There were general exclamations of “aargghh! that hurts my eyes!” but after a little while as I explained the structure of the code they seemed to appreciate it a little better. What they did say that intrigued me though, was that they automatically scanned the page looking for the symbols – the >, !, = signs – the parts that do something, as they put it. The other text they said obscures the structure, it distracts the eye. Yes, that’s right. Having names for the functions and variables makes the code less readable.
KDB+ and Q
KDB+ is a very small and fast commercial database largely used by financial institutions, also originally created by Arthur Whitney. Q is a kind of domain specific language built on top of K, that you use to query the data in a KDB+ instance.
I sat down with Attila Vrabecz, an experienced Q and KDB+ programmer, and we coded together for a couple of hours. We tackled a problem I’d previously coded with Morten in APL, to help me see what was different. There were many similarities – the workflow was the same for example – experimenting in the REPL before transferring the code into a more permanent, reusable form. I noticed Q has many more English words in it, fewer strange symbols, and Attila made more use of library functions than Morten did. It seems Q is designed to be approachable for a former SQL programmer, although once you scratch the surface, it’s much more like APL than SQL.
I gave a talk at the conference about TDD. My aim was to provoke discussion, and argued that writing automated tests using TDD is the best approach. I was definitely successful at sparking a discussion! Actually, it didn’t seem the idea that programmers should write automated tests for their code was all that controversial, especially amongst the more seasoned developers present. We got way more hung up on how large a chunk of code counts as a “unit”, for your unit test, and what clean code looks like in an array language. To my eye, their units are large and their clean code is terse.
A challenge for the future
Dave Thomas, former lead developer for the Eclipse project, and general software visionary, is also an APL and K programmer. He flew in for just one day of the conference, and his talk functioned as the keynote address for the week – it was a clear challenge to the Array Languages community.
Dave painted a vision for the future where people will be living in a sea of big data they don’t understand, and lack adequate tools to query. He sees a great opportunity for array languages, which are generally very good at handling large amounts of data.
He ended his talk with an ambitious challenge to this community to get its act together, start being seen as a credible alternative, and grow. I could only applaud and agree – I found his advice insightful, and I hope the array languages community will do as he suggests.
I spent one very pleasant evening chatting with a woman who is about to embark on her PhD in atmospheric science. She’s hoping to use array languages to help her create software models that will execute quickly on huge arrays of multi-dimensional climate data. Her work sounds fascinating, and I hope it’s a sign of array languages starting to be used beyond their traditional niche in finance.
So I’m leaving the conference carrying a huge tome entitled “A complete introduction to Dyalog APL”, some pieces of code I’ve written, and good intentions to study further. I do find it fascinating that even with the little I know of it, APL allows me to think about and solve a problem differently than I do in Python. I anticipate I’ll find plenty of people in the Array Languages community willing to help me if I do continue to try to learn it. They’re an opinionated, quirky, mature, gentle, yet small bunch of extremely skilled programmers, and I’m glad to have met and coded with them.