Releasing? No big deal.

Experience Reports

I’m jogging around in the park when my music suddenly stops and my phone vibrates.

– “Hi, I’m a bit… out of breath, how… can I help?”

— “We have an issue in production. Our new release is eating up all the server’s resources and interfering with another product.”

– “Ok. What can we do about that?”

— “We need to shut it down.”

– “Fine, shut it down.”


Biking around Mont Blanc

There was a bit more to this conversation, as I wanted to know whether anyone had noticed our product’s malfunction and whether they’d notice if we shut it down. I also wanted to know what would happen with the sending part of the system once we shut down the receiving part. Still, it was a very short conversation, and I made a quick and straightforward decision.

If this isn’t DevOps…

Why was this so easy? Why wasn’t I concerned/mad/fearful/…?
I was fine, because I understood the situation. 

I continued my run and after some time sent an email to the stakeholders explaining the situation. I started with an apology for the inconvenience, congratulations on the great teamwork and communication, and ended with the sentence:
‘if this isn’t DevOps, I don’t know what is.’

The team had told me that the architecture had some potential weaknesses and that it might not scale well, but we had to learn exactly how bad it would be.
We decided to test this in Production.


Cat is not impressed.

Some context

We’re using the strangler pattern on existing functionality. This means we build a quasi-duplicate of an existing system, run both in parallel and monitor the outcomes. When the outcomes are the same, or better, over a period of time, we know we have successfully replaced the old code. Then we can shut down the old legacy system and keep our new one running as a cheaper, faster, better, more maintainable and future-proof system.
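A minimal sketch of what such a parallel run can look like, with hypothetical names; the real routing, comparison and monitoring obviously depend on the systems involved:

```python
import logging

log = logging.getLogger("strangler")

def handle_request(request_id, payload, legacy_system, new_system):
    """Answer from the legacy system, shadow the new system and compare outcomes."""
    legacy_result = legacy_system.handle(payload)
    try:
        new_result = new_system.handle(payload)
        if new_result == legacy_result:
            log.info("outcomes match for request %s", request_id)
        else:
            log.warning("mismatch for request %s: legacy=%r new=%r",
                        request_id, legacy_result, new_result)
    except Exception:
        # The new system is invisible to users; its failures are only logged.
        log.exception("new system failed for request %s", request_id)
    return legacy_result  # users always get the legacy outcome
```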

The first several releases are all about learning. ‘How do people react to a new, albeit bad, UI?’, ‘How much load does a production-like environment produce?’, ‘Can we integrate with the security partner’s software?’, ‘Can we display our new UI within the old UI without issues?’,…

Minimal Viable Experiments

We’ve learned a ton by experimenting with adding minimal changes and small valuable updates. We engineered the situation where this is possible. We have a management structure that supports this, yet is currently still a bit suspicious. We have the people who can make this happen.

We have the conversations, the vision, the strategy and the shared understanding.
I draw ugly pictures like this:


What our next release strategy looks like. Simplified.

These ugly drawings don’t make much sense on their own, but if you were part of the conversation, they’re like a vacation picture: you fondly remember the moment and the outcome.

We don’t just talk about finishing User Stories. We talk about making an impact. Saving costs, making money, learning something new that will enable us to make the next move.

During our Blameless Post-Mortem we figured out that the architecture needed a drastic change. That change has now been made, which raises the question: ‘How shall we test this?’
Remember the ‘sending part of our system’ that no longer had a receiving end? It amassed a big set of data in the meantime and… well… we could just point it at our new system and see what happens.
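One way such a replay could look, sketched with made-up names and thresholds (the batch size, error budget and pacing are illustrative, not what we actually use):

```python
import time

BATCH_SIZE = 100        # messages per batch (illustrative)
MAX_ERROR_RATE = 0.05   # stop early if more than 5% of messages fail

def replay_backlog(backlog, new_system):
    """Replay the amassed messages against the new system, watching the error rate."""
    errors = 0
    for i, message in enumerate(backlog, start=1):
        try:
            new_system.receive(message)
        except Exception:
            errors += 1
        if i % BATCH_SIZE == 0:
            if errors / i > MAX_ERROR_RATE:
                print(f"stopping after {i} messages: error rate too high")
                return
            time.sleep(1)   # give the new system breathing room between batches
    print(f"replayed {len(backlog)} messages with {errors} errors")
```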

This is one of the parts of my job I really love: strategising, discussing and thinking outside the box to do a good job, doing it stress-free, and doing it with people you enjoy working with.


Complex structures: Monorail in Wuppertal

We have principles such as:

  • Bulkhead: different systems run on separate resources so that they don’t interfere with each other. (The only other impacted product was also an experimental one.)
  • Feature toggles: essentially a switch you flip to turn a system, or a piece of one, on and off, much like a light switch. *click*, now you see it. *click*, now you don’t. (There’s a small sketch after this list.)
  • Dark launches: we use pilot users, whom we can monitor, talk to, gather feedback from and learn from, before we launch to the rest of Production.
  • Strangler pattern: explained above; it mitigates many of the risks of building something new into the old.
  • Shadowing: running our new software in Production, completely invisible to our users. It helps us analyse load, investigate the outputs and evaluate their value.

But also: Eventual consistency, Distributed systems, Logging, Monitoring, Alerting,…
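To make the feature-toggle and dark-launch ideas concrete, here is a minimal sketch; all names (PILOT_USERS, render_new_screen,…) are made up for illustration, not our actual implementation:

```python
# Hypothetical sketch of a feature toggle combined with a dark launch:
# the new screen is only rendered when the toggle is on, and even then
# only for a small group of pilot users. Everyone else keeps the old UI.
PILOT_USERS = {"alice", "bob"}        # dark-launch audience
FEATURE_TOGGLES = {"new_ui": True}    # *click*, now you see it

def show_new_ui(user_name: str) -> bool:
    return FEATURE_TOGGLES.get("new_ui", False) and user_name in PILOT_USERS

def render_new_screen(user_name: str) -> str:
    return f"new (strangled) UI for {user_name}"

def render_legacy_screen(user_name: str) -> str:
    return f"legacy UI for {user_name}"

def render_screen(user_name: str) -> str:
    if show_new_ui(user_name):
        return render_new_screen(user_name)
    return render_legacy_screen(user_name)   # *click*, now you don't
```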

Releasing doesn’t have to be a big deal,…

  1. If you carry releases as a team.
  2. If you’re smart about them.
  3. If you create a shared understanding.

 

 

Why my User Stories suck

Experience Reports, Software Testing

I keep hearing how testers must make excellent POs because they’re so awesome at specifying acceptance criteria. People tell me how testers-turned-POs can mold their user stories in such a way that they’re perfectly understandable and unambiguous.

I’m sad to break it to you, but my user stories are nothing of the sort. I know full well that when I feel ‘done’ with them, they still suck.

Possibly equally surprising, testers with a lot more product knowledge than me were unable to lay bare where my stories, acceptance criteria,… were lacking.


The hills of Brighton – Devil’s Dyke

How can you live with yourself?

I accept various degrees of uncertainty. I don’t know everything, and neither does anyone else. Typically, at least in my experience, a concept has to go through several phases and through different hands to improve, to grow, to become something valuable.

This is also why the distinction between a User Story and a Requirement is made:

  • A Requirement is supposed to be unambiguous, 100% clear and unchanging. (Reality often proves otherwise.)
  • A User Story is a placeholder for a lower level of detail to be filled in later, often through a discussion between people in different roles.


The examples you often read in blog posts or books, such as ‘As a user, I want to be able to filter on “name” so that I can easily find the product I need’, are relatively easy and make up very little of the actual work. In real projects, user stories are seldom as straightforward.

Most of my backlog is filled with stories that cover learning, investigating, building a proof of concept,… for quality aspects outside of the purely ‘functional’.
Before you can start building a filter, you must first have a minimum amount of security, stability, performance, reliability, observability, user experience… especially if you want to go live frequently and get feedback as soon as possible.

 


My hands are usually not the first ones our concepts go through. I get them by discussing problems and ideas with stakeholders. This could be a customer, a team member, a different team, anyone who might need our help to achieve something.

The concepts arrive in a variety of states: from so detailed and precisely described that it’s hard to think about them differently, to so vague that they hardly make sense.

 

My hands are also definitely not the last ones our concepts go through. As I pass a concept on to the team, an eight-headed monster tears it apart with questions, remarks, ideas, alternatives, examples, negative uses,… which usually results in a stronger, clearer and all-round better view of the goal. Several additional tasks, and potentially stories, are created with this new information.

Then it gets built, tested, released, used, …

At any of these stages a concept can fall apart or need to be rethought and redefined; holes are exposed and unexpected dependencies identified, but also opportunities, ideas for innovation, strengthening, or additional business value are found.

You have to start over so many times?

It can be frustrating, but this is exactly the strength a User Story has over a requirement: they get better over time, they invite discussion and controversy, and they result in better software.
From a PO perspective, it can be quite frustrating to feel like you have to ‘start over’. I’ve frequently thought ‘Why didn’t they think of this SOONER?’ and ‘Why didn’t I think of this?!’.

We all learn new things over time. This is a core principle in agile software development. If we want to build better, we need to allow ourselves and the people around us to gain more and better insights over time.

Stories seldom end. Sometimes they are the beginning of something new, other times they have several spin-offs. The interesting ones, those that bring the most value, often sprout from several stories coming together.

They are not straightforward and that’s perfectly OK.


Painting of a shipwreck in Church of Piedigrotta

What advantages does a former tester have?

 

As a PO, I’m grateful for my previous experiences as a tester. Not because of my awesome acceptance criteria specification, nor my powers of unambiguously writing down what a feature should do.
Instead, I’m grateful that during my tester years I learned:

  • Good software and software creators need time and several feedback loops to be great.
  • Value is created by intellectual and creative people working together on a problem, not by following instructions to the letter.
  • My input should trigger discussion and controversy, not shut them down.
  • How to hear solutions and problems from different people and ask the right questions.
  • How to gather and communicate information in a structured and understandable way to people with different backgrounds and expectations.

Combining these skills and knowledge with a vision & a strategy is how I enjoy being valuable as a PO.


Monastery in Swietokrzyskie, Poland

RiskStorming Experience Report: A Language Shift

Experience Reports, Software Testing, TestSphere

RiskStorming @AgileActors; Athens, Greece

RiskStorming is a format that helps a whole development team figure out the answer to three important questions:

  • What is most important to our system?
  • What could negatively impact its success?
  • How do we deal with these possible risks?

At TestBash Brighton, Martin Hynie told me about another benefit:

It changes the language

The closer we can bring post-mortem language into planning language, the closer we get to having healthier discussions around what testing can offer and where other teams play a role in that ability.

Martin was so kind as to write down his experiences for me and to allow me to publish them:


Riskstorming as a tool to change the language around testing

One of the most fascinating things to observe is how contextual the language is when others speak of testing. Most specifically:

While a project is ongoing, AND a deadline approaches:

  • The general question around testing tends to be about reducing effort
  • “How much more testing is left?”
  • “When will testing be done?”
  • “How can we decrease the time needed for remaining testing?”
  • “Testing effort is a risk to the project deadline”

Once a project is done, or a feature is released… AND a bug is found in production:

  • “Why was this not caught by testing?”
  • “Isn’t testing supposed to cover the whole product footprint?”
  • “Why did our test plan not include this scenario?”
  • “Who tested this? How did they decide what to test?”

There are two entirely different discussions going on here.

  • One views testing as overhead and process to be overcome… because risk is somehow a discrete thing that can be mitigated away. This is false, but accepting uncertainty is a hard leap for project planning.
  • The second is retrospective, viewing any failure as a missed step. Suddenly the expectation is that any bug should have been tested for and caught, and the previous expectations and concerns around timelines feel insignificant now that the team is facing the reality of the bug in production and its impact on customers and brand.

Riskstorming Experiment

By involving product, engineers and engineering management in RiskStorming questions, we were able to reframe planning in the following manner:

  • Where are the areas of uncertainty and risk?
  • What are the ranges of types of issues and bugs that might come from these areas?
  • How likely are these sorts of issues? Given the code we are touching… given our dependencies… given our history… given how long it has been since we last truly explored this area of the code base…
  • How bad could such an issue be? Which customers might be impacted? How hard could it be to recover? How likely are we to detect it?
  • Engineers get highly involved in this discussion… If such an issue did exist, what might we need to do to explore and discover the sorts of bugs we are discussing? How much effort might be needed to safely isolate and fix such issues without impacting the release? What about after the release?

Then we get to the magic question…

Now that we accept that these risks are in fact real, because of trade-offs being made between schedule pressure and testing (and are not magically mitigated…):

If THIS issue happened in production, do we feel we can defend:

  • Our current schedule,
  • Our strategy for implementation,
  • Our data and environments for inspecting our solution,
  • Our decision on what is enough exploration and testing
    when our customers ask: “How did testing miss this?”

What was interesting is that suddenly we were using the same language around testing before the release that we previously only ever used after a release, once we knew a bug had actually happened in production. We used language around uncertainty. We started using language around the reality that bugs will emerge. We started speaking about implementation approaches that might help us make better use of testing, in order to prioritize our time around the sorts of issues that we could not easily detect or recover from.

We started speaking a language that really felt inclusive around shared responsibility, quality and outcomes.

I only have one data point involving RiskStorming… but I took a similar approach with another team, simply by interviewing engineers, reporting on uncertainty, building a better sense of reality around the trade-offs regarding these uncertainties, and offering options to reduce uncertainty. It had similarly positive outcomes to RiskStorming, but required MUCH more explaining and convincing.

 


Martin Hynie

With over fifteen years of specialization in software testing and development, Martin Hynie’s attention has gradually focused on embracing uncertainty and redefining testing as a critical research activity. The greatest gains in quality can be found when we emphasize communication, team development, business alignment and organizational learning.

A self-confessed conference junkie, Martin travels the world incorporating ideas introduced by various sources of inspiration (including Cynefin, complexity theory, context-driven testing, the Satir Model, Pragmatic Marketing, trading zones, agile principles, and progressive movement training) to help teams iteratively learn, to embrace failures as opportunities and to simply enjoy working together.

A TestSphere Expansion

Software Testing, TestSphere

Let’s begin with a special thanks to Benny & Marcel. Where would we ever be without the good help of smart people?


Benny & Marcel making a case for Testing in Production


It’s been two years since we launched TestSphere: a card deck that helps testers and non-testers think and talk about testing.
People keep coming up with wonderful ideas on how to further improve the card deck: expansions, translations, errata,…

A Security deck! A Performance deck! A Usability deck! An Automation deck!
Well… yes. The possibilities are huge, but it needs to make sense too: value-wise and business-wise.
The thing TestSphere does extremely well is twofold: it sparks ideas and it sparks conversation – thinking and talking.


Maja being ‘Business Manager’ for a RiskStorming workshop for DevBridge, Kaunas

RiskStorming is developing into an incredibly valuable format. It combines the two aspects of TestSphere perfectly.
In essence, it makes your whole team have a structured conversation about quality, loaded with new ideas and strategies. To be blunt: it helps testers figure out what they are being paid for, and it helps non-testers find out why they have testers in the first place.

It’s the learnings and insights from running RiskStorming workshops for many different businesses, in many different contexts, that drive the new TestSphere expansion.

The creation of an expansion is driven not by novelty, but by a clear need.

I present here the first iteration of all the new concepts on the cards. No explanations or examples yet; we’ll keep the iterations lean. If you have feedback, you can find me on ‘all the channels’.

Five New Cards Per Dimension

In the first version we had 20 cards per dimension. We noticed that some important cards were missing. The new expansion will cover these.

  • Heuristics: Possible ways of tackling a problem.
    • Dogfooding
    • Stress Testing
    • Chaos Engineering
    • Three Amigos
    • Dark Launch

+

  • Techniques: Clever activities we use in our testing to find possible problems.
    • OWASP Top Ten
    • Peer Reviews
    • Mob Testing
    • Feature Toggles
    • Test Driven Development

+

  • Feelings: Every feeling that was triggered by your testing should be handled as a fact.
    • Informed
    • Fear
    • Overwhelmed
    • Excited
    • Unqualified

+

  • Quality Aspects: Possible aspects of your application that may be of interest.
    • Observability
    • Measurability
    • Business Value Capability
    • Scalability
    • Availability

+

  • Patterns: Patterns in our testing, but also patterns that work against us while testing, such as biases.
    • Single Responsibility Principle
    • Story Slicing
    • Mutation Testing
    • Strangler Pattern
    • Long Term Load testing

+

Two New Dimensions

Dimensions are the aspects of the cards, each represented by a color. We felt some important dimensions were missing. Both new dimensions are mainly operations-related, a part of testing that should not be underestimated.

Hardening: (working title) Concepts that improve the underlying structures of your software. Compare this dimension to muscle building: you strain your muscles until the weak parts get small tears; the tissue can then regenerate and build a stronger, more robust muscle. We test, exercise and strain the product so that we can fill the cracks with smarter ideas, better code and stronger software.

  1. Blameless Post Mortem
  2. Service Level Objectives/Agreements
  3. Anti-Corruption Layer
  4. Circuit Breaker
  5. Bulkhead
  6. Caching
  7. Distributed systems
  8. Federated Identity
  9. Eventual Consistency
  10. API Gateway
  11. Container Security Scanning
  12. Static Code Analysis
  13. Infrastructure as Code
  14. Config as Code
  15. Separation of Concerns
  16. Command Query Responsibility Segregation
  17. Continuous Integration
  18. Continuous Delivery
  19. Consumer Driven Contract Testing
  20. Pre Mortem

Post-Release: (working title) Tactics, approaches, techniques,… that improve your ability to see what’s going on, and to orchestrate safe changes, in your application’s production environment. When something goes wrong, goes well, brings in money, throws an error, becomes slow,… you can see it and its results.

  1. Fault Injection
  2. Logging
  3. Distributed Tracing
  4. Alerting
  5. Anomaly Detection
  6. Business Metrics
  7. Blackbox Monitoring
  8. Whitebox Monitoring
  9. Event Sourcing
  10. Real User Monitoring
  11. Tap Compare
  12. Profiling
  13. Dynamic Instrumentation
  14. Traffic Shaping
  15. Teeing
  16. On-Call Experience
  17. Shadowing
  18. Zero Downtime
  19. Load Balancing
  20. Config Change Testing

Wrapping up

I’m out of my depth here. There’s so much I need to investigate, learn and put into words for myself before I can make it into a valuable tool for you. I welcome any feedback.
Thank you for being such an amazing part of this journey already.


The winning team of the RiskStorming workshop at TestIT in Malmö

A PO’s View on Releasing

Uncategorized

… and what that means for the tester.

I’m still in the Healthcare project where we’re strangling small pieces of functionality out of a big monolith of software that was built over the course of 20+ years.
Next month, we’ll put our first real strangled functionality into production. In a Docker container. With a horrible UI. Yes, releasing is a matter of months for us. That’s not what I’m worried about right now.

This past week, I noticed myself saying “I don’t care about quality” and “We don’t need to fix that logical gap” – things I wouldn’t have been able to reconcile myself with when I was a tester. My view on releasing, and on the quality of said release, has changed as well.

That ‘I don’t care about quality’ was mainly to make a point. It wasn’t correct: I just care less about certain aspects. The UI is not well structured, it’s black and white, and it’s far from appealing. But everything we need is there. Looking at our new screen, any user will curl their upper lip.

Additionally, when you open a (closed) file from last year, you’ll get a blank screen. No user-friendly message, apology or workaround. It’s not optimal, but it’s just not important to me, and I’ll gladly defend that to anyone.

What does interest me are two very different aspects: Security and Performance.
(This, by the way, is why RiskStorming is such a great workshop to do at the beginning of a project like this: this information shouldn’t come as a surprise to anybody.)

Positive user feedback is currently a low priority for me. We’re making a very minor change as far as our users are concerned. However, the change touches upon some very important data, data you don’t want to have stolen. That’s my first priority.
The second one is working more closely with our Operations department. The better we make that collaboration, the easier we’ll have it in the future.
A third priority is making it run smoothly in production at all. If it’s a flop, reverting is merely the flip of a parameter.

These kinds of decisions don’t make me particularly popular with the testers on the team.
I apologise for that, yet they are the product of careful listening and thinking.


Skiing with Team members

What does this mean for Testers?

In the aforementioned context, it can be daunting to be a tester. Testers might feel out of their depth concerning the expected expertise, and they might feel discouraged because many issues are dismissed.

Consider this: I’m currently laying tracks for a train to cross America from East to West, around the 1860s. I expect it to be bumpy. I expect trouble. The passengers should have a good chance of arriving on the other side, but in the face of danger, we should be able to turn back. The testers can’t safeguard this project from bandits, explosions, landslides,… They also can’t simulate half of these things, as they lack expertise, resources,…
However, they can gather information on as many potential points of failure as possible and provide possible alternatives.

They might not know Security Testing or Performance Testing, but they know risks and have the skills to identify weaknesses.
I expect them to use their words and arguments to do everything in their power, up to convincing me to abandon the project as a whole if that’s what the risks warrant…
… and to have the patience, respect and open mind to accept unpopular decisions made by the PO.
Not that this has been an issue as of late, but I clearly remember my own frustration when faced with what my testers are facing now. I guess it’s never too late for some lessons in humility.
