I’m jogging around in the park when my music suddenly stops and my phone vibrates.
– “Hi, I’m a bit… out of breath, how… can I help?”.
— “We have an issue in production. Our new release is eating away all the server resources of the server, interfering with another product.”.
– “Ok. What can we do about that?”
— “We need to shut it down.”
– “Fine, shut it down.”
There was a bit more to this conversation, as I wanted to know whether anyone had noticed our product’s malfunction and whether they’d notice if we would shut it down. Plus I wanted to know what would happen with the sending part of the system once we shut down the receiving part. It was a very short conversation though, where I made a quick and straightforward decision.
If this isn’t DevOps…
Why was this so easy? Why wasn’t I concerned/mad/fearful/…?
I was fine, because I understood the situation.
I continued my run and after some time sent an email to the stakeholders explaining the situation. I started with an apology for the inconvenience, a congratulation for the great teamwork and communication and ended with the sentence
‘if this isn’t DevOps, I don’t know what is.’
The team had told me that the architecture had some possible malfunctions, that it might not scale well. But we had to learn exactly how badly it would be.
We decided to test this in Production.
We’re using the strangling pattern on already existing functionality. This means that we’re making a quasi-duplication of already existing systems, have them run in parallel and monitor the outcomes. When the outcomes are the same, or better, over a period of time, we know we have successfully replaced the old code. Then we can shut down the old legacy system and have our new one running as a cheaper, faster, better, more maintainable and future-proof system.
The first several releases are all about learning. ‘How do people react to a new, albeit badly, UI?’, ‘How much load does a production-like environment produce?’, ‘Can we integrate with the security partner’s software?’, ‘Can we display our new UI within the old UI without issues?’,…
Minimal Viable Experiments
We’ve learned a ton by experimenting with adding minimal changes and small valuable updates. We engineered the situation where this is possible. We have a management structure that supports this, yet is currently still a bit suspicious. We have the people who can make this happen.
We have the conversations, the vision, the strategy and the shared understanding.
I draw ugly pictures like this:
These ugly drawings don’t make sense when you look at them, but if you were part of the conversation, this is like a vacation picture where you fondly remember the moment and the outcome.
We don’t just talk about finishing User Stories. We talk about making an impact. Saving costs, making money, learning something new that will enable us to make the next move.
During our Blameless Post-Mortem we figured out that the architecture needs a drastic change. This change has been done now, which raises the question: ‘How shall we test this?’.
Remember the ‘sending part of our system’, that didn’t have a receiving end anymore? It amassed a big set of data over this past time and… well,… we could just point that to our new system and see what happens.
This is one of the parts of my job I really love. Strategising, discussing and thinking outside of the box to do a good job, doing it stress-free and doing it with people you enjoy working with.
We have principles such as:
Bulkhead: Where you have different systems run on separate resources, so that they don’t interfere with each other. The only other impacted product was also an experimenting one.
Feature toggles: Virtually, this is a switch you flip to turn a system, or piece of a system on and off, much like a light switch. *click*, now you see it. *click*, now you don’t.
Dark Launches: We use pilot users, whom we can monitor, talk to, gather feedback from and learn, before we launch to the rest of Production.
Strangler Pattern: Explained above, this helps mitigate much of the risks of developing something new into the old.
Shadowing: Running our new software in Production, but completely invisible for our users. It helps us analyse load, investigate the outputs and evaluating its value.
But also: Eventual consistency, Distributed systems, Logging, Monitoring, Alerting,…
Releasing doesn’t have to be a big deal,…
- If you carry them as a team.
- If you’re smart about them.
- If you create a shared understanding.