Archive for the ‘Experience Report’ Category

What is important in your daily work as a software developer?

That was the question I posed on Twitter yesterday. I got a huge pile of answers, some of them surprised me! This article is my way of thanking everyone for sharing their ideas.

You can see the whole picture in this pdf which you are free to download.

Of course I asked this question mostly because I was curious whether anyone else would say the same things as me. I primed the discussion with my personal list, asked people to ”like” any responses they agreed with and reply with their own additions.

I was really pleased with how many people took the trouble to write something. Some of the things I hadn’t thought of at all.

I went through all the replies and summarized the main ideas on postit notes. (I didn’t try to transcribe every one). Then I grouped them by area and added some colours and headlines. The picture above is what I came up with in the end. Eight areas that emerged:

Human needs
Teamwork
Learning and professional development
Making progress and achieving a Flow state
Meaningful relationship with end users and customers
Building something important
Good code and coding environment
Shared coding rules and architecture

Some ideas would fit in more than one area of course, in which case I tried to put them near the boundary.

The main point that surprised me was the number of people who wanted peace and quiet – long stretches of uninterrupted time. I interpret that as a desire to reach the ’flow’ state that you get when you become so absorbed in what you’re doing that you are no longer conscious of time passing. It’s a wonderfully rewarding state to be in. I guess I never found silence to be a prerequisite for that, I get it in a pair or ensemble too.

The most popular kinds of ideas, that got the most likes, were mostly around the ’human needs’ category. We’re all people, including software developers 🙂

I’m interested in this question not only from idle curiosity. I regularly give talks at conferences and company events attended by lots of software developers. I want to understand the kinds of challenges developers have in their daily work and connect that to the things I can help them with. I don’t want to just cause additional boring meetings taking people away from fun coding!

Thanks to everyone who contributed to the discussion on Twitter. I think I learnt something from it, I hope you like my summary.

Posted by Emily Bache on 2022-01-05 at 13:41 under Experience Report.
Comments Off on What is important in your daily work as a software developer?.

Effective Technical Agile Coaching

Note: this article was first published on Praqma’s website

Experiences Pairing with Llewellyn Falco

How does a Technical Agile Coach improve work in a development team? When Llewellyn Falco asked me to pair with him at a client I jumped at the chance to see how effective mob programming is for introducing technical agile practices.

Day one: head first into the mob

My first day as a visiting Technical Agile Coach begins with coding, in a mob. All the developers in the team and I enter our names into a mob timer. It will prompt us to switch roles every 5 minutes, so we will all take a turn at the keyboard. Llewellyn takes the facilitator role, sitting at the back. I find mob programming is great way to get to know a team and their codebase. After only a few minutes, I take the Driver role, which forces to very quickly pick up what’s going on. I need to understand both the current task, the IDE and the particular code we’re working on. The Navigator prompts me and the whole team helps me to find where to click and what to type.

Llewellyn has worked with this client on and off for half a year, and he was here all the previous week. This team has quite a bit of experience mob programming with him and a couple of other visiting coaches. Llewellyn gets involved facilitating the mob from time to time, usually to draw our attention to some improvement we could make in the code, or in the way we’re working together. When the team is stuck, or going in the wrong direction, he will step in and take the Navigator role for a while. That can happen when we’re doing a tricky refactoring, or if he spots we aren’t taking full advantage of our tools.

After 90 minutes the team has to go to their stand-up meeting. Usually we’d mob for about 2 hours with a team, but today we have a longer breathing space before the Learning Hour. Llewellyn and I take the chance to discuss the challenges this team is facing and how we can coach them more effectively when we meet them again tomorrow. Later this week he’s going to get me to take the Facilitator role and he’ll be in the mob instead. Next week he’ll fly home and leave me here for a week coaching by myself, so we’re already preparing for that handover.

The Learning Hour

The Learning Hour is a fixture in the calendar of everyone in the department. It’s one hour devoted to learning new techniques in software development, every day, led by Llewellyn or a visiting coach like me. Not everyone can make it every day, so the planned topics are circulated in advance with an indication of whether it’s a coding session or not. When it’s coding not as many managers/scrum masters/product owners turn up. Actually some of them do enjoy attending these more technical sessions to get a better insight into what challenges their developers are facing.

The lesson I’ve decided to begin with uses the Tennis Refactoring Kata, an exercise I’ve done many times with teams. I am hoping it will be fun and not too difficult, especially compared with the production code they are used to. The exercise has comprehensive unit tests that quickly fail if you make a mistake. The developers aren’t used to having that, and they soon find they like having fast feedback on the accuracy of their work.

Lunch dating

Lunch is next up, and it turns out I’m not eating with Llewellyn. He’s set me up to go out with one of the developers from a team I won’t otherwise be working with. It’s a very deliberate policy Llewellyn has to help me to get to know the wider organization outside of the teams I mob with. He’s found that daily one-on-one chats with key people preferably in a social setting, is an effective way of becoming well-connected in an organization. Today it also gives me an opportunity to offer some career advice to an ambitious developer in a similar position to where I was ten or so years ago.

After lunch Llewellyn introduces me to a second team that I’ll be mob programming with, then leaves me to it. He’s confident this team is working well together and it will be straightforward for me to facilitate without him. We get stuck straight in to some front-end development work, improving a new account creation form. My javascript is a little rusty, but I find that’s not really a problem – they know their tools. I just need to keep an eye on how the mob is running, and think a little outside the box. I spot that what they’re doing will likely break some of the automated GUI tests, so we have a chat about how to handle that.

In the meantime, Llewellyn is working with a different team who are new to him. He begins teaching them the basics of mob programming, and getting to know their specific challenges. A future visiting coach might get to work with them once they’re up to speed.

Managing up and down

At the end of the session, we take a short break together and discuss how things are going. We have a little slack in our schedule, and Llewellyn spots one of the senior managers having a coffee. He takes the chance to greet him and book a short meeting the following day. Llewellyn’s heard there are plans afoot to break apart some of the teams, and he wants to ask the boss to protect a particular team from that reorganization because they are really starting to gel and mob well together.

For the third mobbing session of the day Llewellyn and I are pair-coaching again. It’s similar to the first session. All the teams we’re working with are really struggling with code quality and lack of automated tests. Even if ostensibly we’re working on adding a feature, most of the time we’re addressing code smells and adding unit tests.

The last thing Llewellyn and I do before we leave for the day is send a very short email to the department managers – the people authorizing our invoices. We write one sentence about each mobbing session and the learning hour, summarizing what we’ve done. It makes our work more visible to the decision makers.

Moving in the right direction

I’m impressed with how much Llewellyn and the other visiting coaches have already achieved with these developers. Most people have a positive, curious outlook, and consider learning new skills to be a normal part of work. Llewellyn will often pause the mob timer, pull out his laptop, and spend five minutes showing some relevant presentation slides. Some of the developers here know certain refactorings so well I find myself learning techniques from the teams not just from Llewellyn.

Optimising for when we’re not there

That’s not to say there aren’t problems. The codebase is still large, badly structured, slow to build, and lacking automated tests in many areas. Many developers in the mobbing teams are relatively new to the company, and most have little previous experience of Test-Driven Development or refactoring techniques. However, their starting point isn’t what’s important, the crucial thing is that they are continuing to improve and learn. What we’re doing here is creating momentum in the right direction, teaching skills and strategies so people can continue to make things better when we’re no longer present.

The wider organisation is also just beginning to adopt Agile practices and DevOps, and it’s not working smoothly yet. Many teams are still sitting in cubicles. The test, build and deployment infrastructure relies too much on manually executed steps.

There are several other agile coaches here at the same time as Llewellyn and me who are more focussed on improving process and product management. Later in the week I attend a sprint demo where there is much talk of A/B testing and hypothesis-driven development. People seem really keen to understand their customer needs and to verify they are building the right thing. The thing is, the starting point is less important than the direction of travel.

After a week of pair-coaching I feel confident I can pick up and continue Llewellyn’s work with the development teams at this client. I’ve got to know the people, the particular challenges they face, and have a structure in place that will let me continue the changes Llewellyn is initiating. I’m actually quite surprised how smoothly the handover has gone.

Visiting Technical Agile Coach

Not many coaches are generous enough to invite visitors to pair with them. Llewellyn has a whole list of people he’s inviting to visit in 2018, and I feel lucky to be one of them. It seems to me that everyone’s a winner. Of course Llewellyn wants to show off and spread his coaching methods to the visitors, but also to learn from them and get their feedback on his work. At the same time the client gets the benefit of teaching & advice from many different visitors. They do have to pay two coaches rather than one during the week we overlap, but having such a smooth handover gives them the ability to get more weeks of coaching than Llewellyn could provide by himself. So far the client seems to feel it’s worth the additional expense.

Taking what I’ve learnt home with me

I plan to start recommending this style of Technical Agile Coaching to my clients back home in Sweden. For a start I really enjoy coding, mobbing and teaching all day, but more importantly it seems to be more effective than what I’ve been doing until now. For years I’ve used Coding Dojos to teach the theory and practice of TDD, but all too often the initiative fizzles out and the dojos stop happening when I’m no longer there.

I think this combination of daily mob programming sessions with a learning hour is particularly effective at teaching both theory and practice. The lunches, the presence of other agile coaches, and sending daily summary mails connects me better with the wider organization. This has been a really valuable two weeks for me. I feel pair-coaching with Llewellyn has taught me an effective way to introduce technical agile practices and change developer behaviour for the better.

Posted by Emily Bache on 2018-10-02 at 09:11 under Experience Report.
Tags: agile, teaching programming
Comments Off on Effective Technical Agile Coaching.

Testing with Swedish Personal Numbers

Fictitious people that might get real – and sue you!

Finding realistic data for testing is often a headache, and a good strategy is often to fabricate it. But what if your randomly generated data turns out to belong to a real person? What if they complain and you get fined 4% of global turnover?!

Please note: This story was originally posted on Praqma’s site in October 2017.

I am of course referring to the new GDPR regulations which will come into force next year. The aim of them is to prevent corporations from misusing our personal data, which generally seems like a good thing. Many companies use copies of production databases for testing new versions of their software. Sometimes that means testers and developers get access to a lot of data that they probably shouldn’t be able to see. Sometimes it means you’re testing with old and out of date versions of people’s information, or even testing with customers who cancelled their contracts with you long ago.

Generally I’m in favour of tightening up the rules around this, since I think it will better protect people’s privacy. I think it will also force companies to improve their testing practices. In my experience it’s easier to get reliable, repeatable automated tests if each test case is responsible for creating all the test data it uses.

If you instead write a lot of tests assuming what data is in the database from the start, maintaining that data can get really expensive. I’ve seen this happen first hand – at one place I worked, maintaining test data became such a big job that a whole test suite had to be thrown away along with the data!

Data-Driven Testing

At my last job, part of the test strategy involved data-driven testing. In my test code I’d access an internal API to create fictitious companies in the system. Then, simply by varying the exact configuration of these companies, I could exercise many different features of the system under test. Every case test created different, unique companies. This meant you could run each case as many times as you liked. You could also run lots of tests together in parallel, since they all worked with different data. Each time you ran the tests, you had the opportunity to start over with a fresh, almost empty database – a small memory footprint that meant a fast system and speedy test results.

Fictitious People

When I came to my current client I was planning to use a similar approach. The difference here is that instead of fictitious companies, I needed to create fictitious people. So, I got out my random number generator and started creating fictitious names, addresses, and, of course, Swedish Personal Numbers. That’s when things started getting tricky… Let me introduce those of you who aren’t Swedish to the Personal Number.

It’s a bit like a social security number, unique for each person, and the government and other agencies use it as a kind of primary key when storing your data. If you know someone’s number, (and you know where to look), you can find out their official residence, how much tax they paid, credit rating, that sort of thing.

You’ll understand why, then, we started getting worried about GDPR at this point. A Swedish personal number is definitely a ‘personally identifiable’ piece of data, and GDPR has a lot to say about those.

A primary key containing your age

The other thing about the personal number is that it’s not just a number. When they first came up with the idea, in the 1940s, they thought it would be a great idea to encode some information in this number. (I’m not sure database indexing theory and primary keys were very well understood then!) So, they decided that the first part should be your birthdate, then three digits specifying where in Sweden you were born, and the last figure is both a checksum, so numbers can be validated, and also specifies your gender.

These days, lots of Swedish people are born outside Sweden, and not everyone has a gender that matches the one they had at birth, so they had to relax some of these rules. Sometimes too many children are born on one day and they run out of numbers for it, in which case you might get assigned the day before or after. However, the number still always includes your birth year, and hence your age.

Never ask a woman her age, (just get her personal number)

These days it’s quite common for retail chains to ask you to join a ‘customer club’ so you can accrue points on your purchases. When you buy something the assistant will often ask for your club membership number. Almost always the membership number is the same as your personal number, so you end up telling them (and everyone in the queue behind you) exactly how old you are. It’s disconcerting, if not rude!

Fictitious Test People get real and sue you

Back to my test data problem. I’m generating fictitious people, but the system under test both validates the checksum, and uses the personal number to determine their age. I have to generate realistic numbers from the last hundred years or so. This is where things start to get tricky. Someone in the legal department objected to this. He told me that if I was to generate a number belonging to a real person we could really get into trouble because of GDPR. If that person discovered we were using their personal number, and had changed their name to “Test Testsson” living at “Test gata 1, 234 56 Storstad”, that might not be too bad. Unfortunately, if we were using them for a tricky scenario, they might also discover our test database linked their personal number to a dreadful credit rating and a history of insolvency. At that point they might start to get insulted. In the worst case, they could sue us and GDPR would mean we’d have to pay them an awful lot of money!

Keeping fictitious test people imaginary

However unlikely that scenario might be my client is understandably unwilling to take the risk, and I really need to restrict the test system to genuinely fictitious personal numbers.The authorities do provide a list of these exactly for testing purposes, but the trouble is there are only about 1000 numbers on the official list. With the kind of automated testing I’m planning to do that is not enough. Not nearly enough! Assuming my system has about 500 features, I want to write 5 scenarios per feature, I have 25 developers running all the tests 25 times a day… I need something like half a million unique numbers. Less if I can wipe the db more often than once a day, but still. A thousand doesn’t go very far.

So, here’s a startup idea for you – I’m pretty sure my client isn’t the only company out there who would pay good money for a list of a million personal numbers, for people aged less than 100, that are guaranteed not to exist in reality. If you have such a list, and would be willing to maintain it for me such that it remains guaranteed fictitious, please get in touch!

Playing with time

Right, so I’m back on generating personal numbers again, not having found that list yet. I have a couple of ideas about how to keep them fictitious though. The first idea is to use really old numbers. While I’m testing the system, I can set the clock back to 1900, and only use generated numbers from 1800-1890. Since everyone over 125 can be assumed to have passed on, they are unable to sue us. For some scenarios I need under 18 year olds, and by changing the date that the system clock thinks is ‘today’, I can generate children born in the 1870s. Fictitious children complete with bank accounts, mobile phone contracts, student debt, acne… and everything will work just fine!

Hack the checksum

The other idea I had was to use personal numbers with invalid checksums. After the date of birth and the three random numbers, the last figure in the personal number is calculated from the others, using the Luhn algorithm. While I’m testing the system, I can modify the ‘PersonalNumber.validateChecksum()’ function to instead accept all invalid numbers, and reject all valid ones. Then all my tests can generate perfectly invalid personal numbers, and the test database will get filled with people that definitely can’t exist in reality.

As with any well-designed system, there is only one place in the code where I calculate the checksum, so it’s actually a really small change just for testing. It also comes with insurance – if I ever accidently deploy this code change in production, absolutely everything will stop working, since all those valid numbers already in the production database will suddenly get rejected. I think we’d notice pretty quickly if that happened!

Is there a better way?

I haven’t actually implemented any of these strategies yet – we’re still at the planning stage. I’d be very interested to hear if anyone else has solved this problem in a better way. In the meantime, I’ve written some code that can generate personal numbers with and without valid checksums, and also lets you specify an age range – it’s available on my github page.

We can’t be the only people worried about GDPR and personal numbers. That threat of a fine of 4% of global turnover is a pretty mighty incentive to focus some effort on cleaning up test data.

Posted by Emily Bache on 2018-01-02 at 15:19 under Experience Report.
Tags: clean code, testing
Comments Off on Testing with Swedish Personal Numbers.

Continuous Delivery at Pagero

Note: This post first appeared on Pagero’s blog

One of the questions that Kent Beck asked when he was developing the eXtreme Programming development methodology, was what happens if we turn the dials up all the way to 10? Take a practice we know is good, and do more of it? Practices like Test-Driven Development and Pair Programming are what he came up with, starting from manual testing and code review.

In the same way, Continuous Delivery is what you get if you turn the dials to 10 on your annual release cycle. You get to the point that you are pushing out new code to users, many times a day.

“Shortening the release cycle like this has a lot of advantages, especially around risk and quality.”

LOWER RISK AND HIGHER QUALITY WITH SHORTER RELEASE CYCLES

Shortening the release cycle like this has a lot of advantages, especially around risk and quality. Basically, you’re decreasing the batch size, a well-known tenet of lean manufacturing. If each new release contains fewer changes, then you have fewer places to look when things go wrong, so finding bugs is easier. You also lower the risk that any individual batch has a defect in the first place. By having an engineering setup that allows you to make code changes at the drop of a hat and push them out to production easily, you facilitate getting fixes out quickly.

So the upshot is quality problems surface sporadically instead of all at once, and are more easily dealt with. It’s an attractive prospect for us, especially with the growth in traffic we’re experiencing. Every time we have a defect in production, it affects a proportion of our customers, and the number of customers is increasing all the time. If we had a small bug a year ago that affected one or two customers, today the same bug might affect tens or even hundreds.

FROM MONOLITH TO MICROSERVICES FOR GREATER FLEXIBILITY

At Pagero, historically we’ve been pushing out a new version of our product “Pagero Online”, about once a month. We’ve been able to sustain that since about 2007. So when we began looking at Continuous Delivery, about three years ago, we were starting from a fairly good position. We’ve experienced steady growth in transactions through our cloud platform since the start, and it was in early 2014 we started switching over our architecture from a clustered monolithic JEE instance, to distributed microservices (see my previous article).

We needed to do this, in order to scale out our system horizontally, and handle the increasing traffic. One of the other benefits of microservices though, is you can deploy services independently of one another, and if you do it right, you can deploy new code without stopping traffic to the site.

“One of the other benefits of microservices, is you can deploy services independently of one another.”

FROM MONTHLY SERVICE WINDOWS ON SUNDAYS…

Our old monthly release cycle was based on having a ‘service window’, usually on a Sunday morning, where we could stop all the traffic, take a backup of the database, roll out the new version of the monolith, then bring everything back up again. You’ve got the database backup to fall back on, if something goes wrong with the update. You can easily roll everything back to the state it had before the service window.

…TO SEVERAL ROLLOUTS A WEEK

So of course, initially the microservices we had were fairly peripheral to the main function of our platform, and it wasn’t a huge risk to roll out new code without the safety of a service window. So we built deployment tools that allowed us to do that. All our microservices run with at least two instances, so an update consisted of taking each instance down in turn, replacing it with the new version. If something goes wrong, it’s not hard to roll back to a previous version. It’s a little more problematic to restore previous state, but generally we have good mechanisms to re-submit failed transactions once the service is working again.

So these days we roll out new versions of our microservices several times a week, when new features are ready, and rarely have any difficulties with this. The need to roll back does occur occasionally, but more often we can ‘roll-forward’ and deploy a newer version with a fix.

“These days we roll out new versions of our microservices several times a week, when new features are ready.”

MANY REASONS TO CONTINUE ON THIS PATH

With our former monolith, the situation is a little different though. Any changes that touch the database are deemed too risky to deploy without first taking a backup, and that currently requires a service window. We’ve got so used to frequently pushing out new versions of the microservices, and seen the benefits of that, that we’d like to do the same with the former monolith.

We also have good business reasons for wanting to release without having a service window – for a start our traffic is growing at such a rate, we can ill afford any downtime. Perhaps more importantly, as we get customers in more parts of the world, a Sunday morning is no longer a ‘quiet’ time of the week when it’s relatively ok to suspend our service. In some Arab countries where we do business, Sunday is the first day of the working week.

THE SHIFT TO CONTINUOUS DELIVERY HAS STARTED

Now we’ve gained some experience with Continuous Delivery of our microservices, it’s time to do the same with the whole Pagero Online platform, including our old monolith. So I look forward to being able to soon report that we’ve got the dials going all the way up to 10 and we are deploying any part of our system at any time.

Posted by Emily Bache on 2017-02-11 at 09:50 under Experience Report.
Tags: agile, CD, microservices
Comments Off on Continuous Delivery at Pagero.

From Monolith to Microservices

Please note: this article was originally published on Pagero’s site.

At Pagero we are very proud of the technical architecture of our flagship product, Pagero Online. We’re successfully handling more document transactions than ever, as we see an ever increasing demand for e-document services. In this article I’d like to tell you a little about the journey we’ve taken, from humble beginnings almost ten years ago, to the present day and beyond. I’ll be talking a little about the technology stack we’ve chosen, including the business and technical reasoning behind our choices. If you’ve ever worked on a high availability, cloud based platform handling millions of events, or aspire to do so, you could be interested in our story.

This spring I was at the Craft conference in Budapest, which I thoroughly recommend by the way. There was a full program, with lot of great sessions, and interesting speakers. I did notice, browsing the program beforehand, that there were a lot of talks about Microservices, and Docker. Everyone seemed to have an opinion on the best deployment options, how to manage distributed data, building, testing, logging… This is clearly the hip and trendy way to build systems these days. I found this quite gratifying, since at Pagero we’ve been using a Microservices architecture for some time now, and have been using Docker in production since early 2014. It’s become our everyday life, not some hyped trend that we just heard about. Our reasons for going with Docker and Microservices are firmly rooted in the needs of our business.

Let me explain. Pagero Online is a cloud-based platform for exchange of electronic documents between businesses, for example invoices and orders. The point is, our customers can send their documents to us in whatever format their internal system produces, and we will deliver it in the format the recipient finds easiest to process in their internal systems. It’s clearly a valuable service, since we have an impressive year on year growth in document transactions.

The growth illustrates the challenge we’ve been meeting successfully for several years now – to scale our cloud system to handle ever increasing traffic. It’s of course a great problem to have, and we in the R&D department have worked hard to keep everything running smoothly throughout. The architecture we had when we started, is not the architecture we have now.

IN THE BEGINNING

Back in 2007, Enterprise Java Beans were the thing to do, and we felt confident we were building a future-proof, scalable system, using a JBoss container talking to a PostgreSQL database. Moore’s law meant that we could initially scale just by buying a bigger machine now and then. As time went by, we needed more, and started using the clustering capabilities built into the J2EE platform – i.e. several instances of the same code, receiving requests via a load balancer. At some point in about 2012 we realized this approach could no longer handle the increase in traffic that we were experiencing. We could no longer just add new instances of the same code, the slowdown from the communication overhead between them would be greater than the speedup from the increased CPU power. We needed to give more CPU power to just a few parts of the code that were doing the most intensive processing, without also hitting the communication bottlenecks.

ENTER MICROSERVICES AND DOCKER

Everything was pointing to a need to break apart our monolith into more manageable pieces. Microservices and Docker seemed the perfect match to our problems, so we spent the next year or so building the infrastructure needed. In February 2014 we deployed our monolith, packaged in a Docker container, together with some essential services for monitoring, service discovery, and message passing, (with protobuf over Rabbit MQ). Over the following months, the whole of our R&D department completed a course in the Scala programming language, and we built and deployed several more services for new features in the system. It worked! Since 2014 we have been able to quickly grow to about twenty services a year later, and sixty today.

We’ve realized the Microservices architecture enabled organizational streamlining too. Over the years our development team has grown from a handful of developers in the same room, to about 30 people split across three time zones. By breaking up the codebase, we can also divide up the development work more efficiently. We now have half a dozen ‘devops’ teams each responsible for a handful of Microservices. Both new and seasoned developers are more productive when working in these smaller codebases.

SCALING THE DATABASE

It was in around mid-2015, however, we started to see where the bottleneck had moved to, now the application code was performing better. Our trusty PostgreSQL database was handling a good many more gigabytes of data than ever before, and some transactions were getting a little slow. We concocted a plan to split it up too, just like we were doing with the monolith of code. We settled upon Cassandra and worked out how we were going to safely migrate all the document data out of Postgres, and into this distributed data store. The rest of the data will remain where it is, but just taking out the documents should free up a good deal of space, and release the main bottleneck. We of course need to do this without disrupting our service in any way, so one way to reduce the risk is to run the new Cassandra database in parallel with the existing Postgres, duplicating all the data. Only once we’ve done extensive testing, and we can see it’s working ok, will we remove the redundant copy

That’s kind of where we are now, we have just started this parallel running, and initial results are looking good.

THE BIG BREAK-UP

The next challenge is to continue to break apart our monolith of code, and create new services out of the pieces. Although all our new features are being built in Microservices, we still have the heart of the system in the former monolith. We’ve seen so many benefits to having Microservices, we’d like all our code to look like that. In some ways it’s a more daunting prospect than breaking up the database. This is a large quantity of tried and tested code that has been running in production for many years – breaking it up is not something you can do over a weekend!

We have to make this big change without any interruption to our production service, and we’ve thought carefully about what our strategy should be. One way to do a big risky change is to split it into a series of less-risky, smaller changes. The idea is that after every step in the break up, to run a battery of automated regression tests. The shorter the time the tests take to run, the smaller increments we can work with, and less risk of breaking anything. I’m personally pretty excited by this prospect. We’ve spent several years now building and improving our automated tests for Pagero Online, to the point where we feel pretty confident in taking on this challenge.

The other part of the strategy is to do the same as we have with the database migration. We’ll run both the old and new versions of the service in production for a while before we cut over to the new one. This should find any issues missed by the automated tests, without affecting any of our production traffic.

It’s going to be a real proof of how good our testing and deployment routines are. What kind of tests and deployment tools we’ve built, now that’s a topic for another blog post. If I’m lucky, I might even be telling you about the hip and trendy hot technologies that will be all over the agenda of the next Craft conference :-).

Posted by Emily Bache on 2016-11-16 at 07:48 under Experience Report.
Tags: continuous delivery, microservices
Comments Off on From Monolith to Microservices.

Coding Is Like Cooking