Releasing? No big deal.

Experience Reports

I’m jogging around in the park when my music suddenly stops and my phone vibrates.

– “Hi, I’m a bit… out of breath, how… can I help?”

— “We have an issue in production. Our new release is eating up all the server’s resources, interfering with another product.”

– “Ok. What can we do about that?”

— “We need to shut it down.”

– “Fine, shut it down.”

Biking around the Mt. Blanc

There was a bit more to this conversation, as I wanted to know whether anyone had noticed our product’s malfunction and whether they’d notice if we shut it down. Plus I wanted to know what would happen with the sending part of the system once we shut down the receiving part. It was a very short conversation though, in which I made a quick and straightforward decision.

If this isn’t DevOps…

Why was this so easy? Why wasn’t I concerned/mad/fearful/…?
I was fine, because I understood the situation. 

I continued my run and after some time sent an email to the stakeholders explaining the situation. I started with an apology for the inconvenience and congratulations on the great teamwork and communication, and ended with the sentence
‘if this isn’t DevOps, I don’t know what is.’

The team had told me that the architecture had some possible malfunctions, that it might not scale well. But we had to learn exactly how bad it would be.
We decided to test this in Production. 

Cat is not impressed.

Some context

We’re using the strangling pattern on already existing functionality. This means that we’re making a quasi-duplication of already existing systems, have them run in parallel and monitor the outcomes. When the outcomes are the same, or better, over a period of time, we know we have successfully replaced the old code. Then we can shut down the old legacy system and have our new one running as a cheaper, faster, better, more maintainable and future-proof system.
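The dual-run core of this pattern is simple to sketch. Here’s a hypothetical Python version (all names are invented for illustration; the real systems are far more involved): users keep getting the legacy answer while the new system’s answer is compared and logged.

```python
import logging

logger = logging.getLogger("strangler")

def legacy_handler(request):
    # The existing system: still the source of truth.
    return {"total": sum(request["items"])}

def new_handler(request):
    # The replacement system, running in parallel.
    return {"total": sum(request["items"])}

def handle(request):
    """Serve from legacy, shadow-call the new system, compare the outcomes."""
    old_result = legacy_handler(request)
    try:
        new_result = new_handler(request)
        if new_result != old_result:
            logger.warning("Mismatch: legacy=%s new=%s", old_result, new_result)
    except Exception:
        # The new system must never break the user-facing path.
        logger.exception("New handler failed")
    return old_result  # Users only ever see the legacy answer.
```

Once the mismatch log stays quiet for long enough, the return statement can be flipped to the new system’s result and the legacy code retired.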

The first several releases are all about learning. ‘How do people react to a new, albeit bad, UI?’, ‘How much load does a production-like environment produce?’, ‘Can we integrate with the security partner’s software?’, ‘Can we display our new UI within the old UI without issues?’,…

Minimal Viable Experiments

We’ve learned a ton by experimenting with adding minimal changes and small valuable updates. We engineered the situation where this is possible. We have a management structure that supports this, yet is currently still a bit suspicious. We have the people who can make this happen.

We have the conversations, the vision, the strategy and the shared understanding.
I draw ugly pictures like this:

What our next release strategy looks like. Simplified.

These ugly drawings don’t make sense when you just look at them, but if you were part of the conversation, they’re like a vacation picture: you fondly remember the moment and the outcome.

We don’t just talk about finishing User Stories. We talk about making an impact. Saving costs, making money, learning something new that will enable us to make the next move.

During our Blameless Post-Mortem we figured out that the architecture needed a drastic change. That change has been made now, which raises the question: ‘How shall we test this?’
Remember the ‘sending part of our system’ that didn’t have a receiving end anymore? It amassed a big set of data in the meantime and… well,… we could just point it to our new system and see what happens.

This is one of the parts of my job I really love. Strategising, discussing and thinking outside of the box to do a good job, doing it stress-free and doing it with people you enjoy working with.

Complex structures: Monorail in Wuppertal

We have principles such as:

  • Bulkhead: different systems run on separate resources, so that they don’t interfere with each other. The only other impacted product was also an experimental one.
  • Feature toggles: virtually, a switch you flip to turn a system, or a piece of one, on and off, much like a light switch. *click*, now you see it. *click*, now you don’t.
  • Dark Launches: we use pilot users, whom we can monitor, talk to, gather feedback from and learn from, before we launch to the rest of Production.
  • Strangler Pattern: explained above, this helps mitigate many of the risks of developing something new into the old.
  • Shadowing: running our new software in Production, but completely invisible to our users. It helps us analyse load, investigate the outputs and evaluate their value.

But also: Eventual consistency, Distributed systems, Logging, Monitoring, Alerting,…
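Of these, feature toggles are the easiest to show in code. A minimal, hypothetical sketch in Python (in reality the toggle store would live in config, a database or a remote service, not a dict):

```python
# The toggle store: in reality this would be config, a DB row or a service.
TOGGLES = {"new_ui": False}

def is_enabled(name: str) -> bool:
    # Unknown toggles default to off: safest for unfinished features.
    return TOGGLES.get(name, False)

def render_page() -> str:
    if is_enabled("new_ui"):
        return "new shiny page"   # *click*, now you see it
    return "old trusty page"      # *click*, now you don't

TOGGLES["new_ui"] = True  # flip the switch, no redeploy needed
```

The point is that turning something off is a one-line change of state, not a release, which is exactly what made the ‘fine, shut it down’ phone call so relaxed.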

Releasing doesn’t have to be a big deal,…

  1. If you carry them as a team.
  2. If you’re smart about them.
  3. If you create a shared understanding.


Why my User Stories suck

Experience Reports, Software Testing

I keep on hearing how Testers must make excellent POs, because they’re so awesome at specifying acceptance criteria. People tell me how tester-now-POs can mold their user stories in such a way that they are perfectly understandable and unambiguous.

I’m sad to break it to you, but my user stories are nothing of the sort. I know full well that when I feel ‘done’ with them, they still suck.

Possibly equally surprising, testers with a lot more product knowledge than me were unable to lay bare where my stories, acceptance criteria,… were lacking.

The hills of Brighton – Devil’s Dyke

How can you live with yourself?

I accept various degrees of uncertainty. I don’t know everything, and neither does anyone else. Typically, at least in my experience, a concept has to go through several phases and through different hands to improve, to grow, to become something valuable.

This is also why the distinction between a User Story and Requirement is made:

  • A Requirement is something unambiguous, 100% clear and never changes. (Reality often proves to be different)
  • A User Story is a placeholder for a lower level of detail, often a discussion between people of different roles.


The examples you often read in blog posts or books that say: ‘As a user, I want to be able to filter on “name” so that I can easily find the product I need’ are relatively easy and make up very little of the actual work. In real projects, user stories are seldom as straightforward.

Most of my backlog is filled with stories that cover learning, investigating, building a proof of concept,… for quality aspects outside of the purely ‘functional’.
Before you can start building a filter, you must first have a minimum amount of security, stability, performance, reliability, observability, user experience… especially if you want to go live frequently and get feedback as soon as possible.


My hands are usually not the first ones our concepts go through. I get them by discussing problems and ideas with stakeholders. This could be a customer, a team member, a different team, anyone who might need our help to achieve something.

The concepts can be in a variety of different states: from very detailed and exactly described so that it’s hard to think about them differently,… to so vague that they hardly make sense.


My hands are also definitely not the last ones our concepts go through. As I pass my concepts on to the team, an 8-headed monster tears them apart with its questions, remarks, ideas, alternatives, examples, negative uses,… which usually results in a stronger, clearer and all-round better view of the goal. Several additional tasks, and potentially stories, are created with this new information.

Then it gets built, tested, released, used, …

At any of these stages a concept can fall apart, need to be rethought or redefined, have holes exposed or unexpected dependencies identified, but also have opportunities, ideas for innovation, strengthening or additional business value found.

You have to start over so many times?

It can be frustrating, but this is exactly the strength a User Story has over requirements. They get better over time, they invite discussion and controversy and they result in better software.
From a PO perspective, it can be quite frustrating to get the feeling of having to ‘start over’. I’ve frequently thought ‘Why haven’t they thought of this SOONER?’, ‘why didn’t I think of this?!’.

We all learn new things over time. This is a core principle in agile software development. If we want to build better, we need to allow ourselves and the people around us to gain more and better insights over time.

Stories seldom end. Sometimes they are the beginning of something new, other times they have several spin-offs. The interesting ones, those that bring the most value, often sprout from several stories coming together.

They are not straightforward and that’s perfectly OK.

Painting of a shipwreck in Church of Piedigrotta

What advantages does a previous tester have?


As a PO, I’m grateful for my previous experiences as a tester. Not because of my awesome acceptance criteria specification nor my powers of unambiguously writing down what a feature should do.
Instead, I’m grateful that during my tester years, I learned that:

  • Good software & software creators need time and several feedback-loops to be great.
  • Value is created by intellectual and creative people working together on a problem, not by following instructions to the letter.
  • My input should trigger discussion and controversy, rather than shut it down.
  • How to listen to solutions and problems from different people and ask the right questions.
  • How to gather and communicate information in a structured and understandable way to people with different backgrounds and expectations.

Combining these skills and knowledge with a vision & a strategy is how I enjoy being valuable as a PO.

Monastery in Swietokrzyskie, Poland

RiskStorming Experience Report: A Language Shift

Experience Reports, Software Testing, TestSphere
RiskStorming @AgileActors; Athens, Greece

RiskStorming is a format that helps a whole development team figure out the answer to three important questions:

  • What is most important to our system?
  • What could negatively impact its success?
  • How do we deal with these possible risks?

At TestBash Brighton, Martin Hynie told me about another benefit:

It changes the language

The closer we can bring postmortem language into planning language, the closer we get to having healthier discussions around what testing can offer and where other teams play a role in that ability.

Martin was so kind as to write down his experiences for me and allow me to publish them:


Riskstorming as a tool to change the language around testing

One of the most fascinating things to observe is how contextual the language is when others speak of testing. Most specifically:

While a project is ongoing, AND a deadline approaches:

  • The general question around testing tends to be about reducing effort
  • “How much more testing is left?”
  • “When will testing be done?”
  • “How can we decrease the time needed for remaining testing?”
  • “Testing effort is a risk to the project deadline”

Once a project is done, or a feature is released… AND a bug is found in production:

  • “Why was this not caught by testing?”
  • “Testing is supposed to cover the whole product footprint?”
  • “Why did our testplan not include this scenario?”
  • “Who tested this? How did they decide what to test?”

There are two entirely different discussions going on here.

  • One views testing as overhead and process to be overcome… because risk is treated as something discrete that can somehow be mitigated away. This is false, but accepting uncertainty is a hard leap for project planning.
  • The second is retrospective and viewing any failure as a missed step. Suddenly the pressure is an expectation that any bug should have been tested/caught, and the previous expectations and concerns around timelines feel insignificant, now that the team is facing the reality of the bug in production and the impact on customers and brand.

Riskstorming Experiment

By involving product, engineers and engineering management in RiskStorming questions, we were able to reframe planning in the following manner:

  • Where are the areas of uncertainty and risk?
  • What are the ranges of types of issues and bugs that might come from these areas?
  • How likely are these sorts of issues? Given the code we are touching… given our dependencies… given our history… given how long it has been since we last truly explored this area of the code base…
  • How bad could such an issue be? Which customers might be impacted? How hard could it be to recover? How likely are we to detect it?
  • Engineers get highly involved in this discussion… If such an issue did exist, what might we need to do to explore and discover the sorts of bugs we are discussing? How much effort might be needed to safely isolate and fix such issues without impacting the release? What about after the release?

Then we get to the magic question…

Now that we accept that these risks are in fact real, because of trade-offs being made between schedule pressure and testing (and not magically mitigated…):

If THIS issue happened in production, do we feel we can defend

  • Our current schedule,
  • Our strategy for implementation,
  • Our data, and environments for inspecting our solution,
  • Our decision on what is enough exploration and testing
    when our customers ask: “How did testing miss this?”

What was interesting is that suddenly we were using the same language around testing before the release that we previously only used after a release, once we knew a bug had actually happened in production. We used language around uncertainty. We started using language around the reality that bugs will emerge. We started speaking about methods of performing the implementation that might help us make better use of testing, in order to prioritize our time around the sorts of issues that we could not easily detect or recover from.

We started speaking a language that really felt inclusive around shared responsibility, quality and outcomes.

I only have one data point involving RiskStorming… but I took a similar approach with another team, simply by interviewing engineers, reporting on uncertainty, building a better sense of reality on trade-offs regarding these uncertainties, and options to reduce uncertainty. It had similar positive outcomes to RiskStorming, but required MUCH more explaining and convincing.



Martin Hynie

With over fifteen years of specialization in software testing and development, Martin Hynie’s attention has gradually focused on embracing uncertainty and redefining testing as a critical research activity. The greatest gains in quality can be found when we emphasize communication, team development, business alignment and organizational learning.

A self-confessed conference junkie, Martin travels the world incorporating ideas introduced by various sources of inspiration (including Cynefin, complexity theory, context-driven testing, the Satir Model, Pragmatic Marketing, trading zones, agile principles, and progressive movement training) to help teams iteratively learn, to embrace failures as opportunities and to simply enjoy working together.

A TestSphere Expansion

Software Testing, TestSphere

Let’s begin with a special thanks to Benny & Marcel. Where would we ever be without the good help of smart people?

Benny & Marcel making a case for Testing in Production


It’s been 2 years since we launched TestSphere: A card deck that helps testers and non-testers think & talk about Testing.
People keep coming up with wonderful ideas on how to further improve the card deck. Expansions, translations, errata,…

A Security deck! A Performance deck! A Usability deck! An Automation deck!
Well… yes. The possibilities are huge, but it needs to make sense too: Value-wise & business-wise.
The thing TestSphere does extremely well is twofold: Spark Ideas and Spark Conversation – Thinking & Talking

Maja being ‘Business Manager’ for a RiskStorming workshop for DevBridge, Kaunas

RiskStorming is developing to become an incredibly valuable format. It combines the two aspects of TestSphere perfectly.
In its essence it makes your whole team have a structured conversation about quality that is loaded with new ideas and strategies. To be blunt: It helps testers figure out what they are being paid for and it helps non-testers find out why they have testers in the first place.

It’s the learnings and insights from continually running RiskStorming workshops for many different businesses, in many different contexts, that drive the new TestSphere expansion.

The creation of an expansion is driven not by novelty, but by a clear need.

I present you here the first iteration of all new concepts on the cards. No Explanations or Examples yet. We’ll keep the iterations lean. If you have feedback, you can find me on ‘All the channels’.

Five New Cards Per Dimension

In the first version we had 20 cards per dimension. We noticed that some important cards were missing. The new expansion will cover these.

  • Heuristics: Possible ways of tackling a problem.
    • Dogfooding
    • Stress Testing
    • Chaos Engineering
    • Three Amigos
    • Dark Launch

+

  • Techniques: Clever activities we use in our testing to find possible problems.
    • OWASP Top Ten
    • Peer Reviews
    • Mob Testing
    • Feature Toggles
    • Test Driven Development

+

  • Feelings: Every feeling that was triggered by your testing should be handled as a fact.
    • Informed
    • Fear
    • Overwhelmed
    • Excited
    • Unqualified

+

  • Quality Aspects: Possible aspects of your application that may be of interest.
    • Observability
    • Measurability
    • Business Value Capability
    • Scalability
    • Availability

+

  • Patterns: Patterns in our testing, but also patterns that work against us while testing, such as biases.
    • Single Responsibility Principle
    • Story Slicing
    • Mutation Testing
    • Strangler Pattern
    • Long Term Load Testing

+

Two New Dimensions

Dimensions are the aspects of the cards, each represented by its own color. We felt like some important dimensions were missing. Both of these are mainly operations related, a part of testing not to be underestimated.

Hardening: (working title) Concepts that improve the underlying structures of your software. Compare this dimension to muscle building – You need to strain your muscles until the weak parts get small tears, the tissue can then regenerate and build a stronger, more robust muscle. We test, exercise and strain the product so that we can fill the cracks with smarter ideas, better code and stronger software.

  1. Blameless Post Mortem
  2. Service Level Objectives/Agreements
  3. Anti-Corruption Layer
  4. Circuit Breaker
  5. Bulkhead
  6. Caching
  7. Distributed systems
  8. Federated Identity
  9. Eventual Consistency
  10. API Gateway
  11. Container Security Scanning
  12. Static Code Analysis
  13. Infrastructure as Code
  14. Config as Code
  15. Separation of Concerns
  16. Command Query Responsibility Segregation
  17. Continuous Integration
  18. Continuous Delivery
  19. Consumer Driven Contract Testing
  20. Pre Mortem
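To make one of these Hardening cards concrete: a Circuit Breaker stops hammering a failing dependency and fails fast until a cool-down has passed. A naive Python sketch (the thresholds and names are invented for illustration, not production code):

```python
import time

class CircuitBreaker:
    """Stop calling a failing dependency until a cool-down has passed."""

    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            # Half-open: cool-down over, allow one trial call through.
            self.opened_at = None
            self.failures = 0
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # any success resets the count
        return result
```

After max_failures consecutive errors the breaker ‘opens’ and callers get an immediate error instead of piling load onto a dependency that is already down; after reset_after seconds one trial call is let through to see if it recovered.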

Post-Release: (working title) Tactics, approaches, techniques,… that improve the ability to see what’s going on in your application’s production environment, and to orchestrate safe changes there. When something goes wrong, goes well, brings in money, throws an error, becomes slow,… you can see it and its results.

  1. Fault Injection
  2. Logging
  3. Distributed Tracing
  4. Alerting
  5. Anomaly Detection
  6. Business Metrics
  7. Blackbox Monitoring
  8. Whitebox Monitoring
  9. Event Sourcing
  10. Real User Monitoring
  11. Tap Compare
  12. Profiling
  13. Dynamic Instrumentation
  14. Traffic Shaping
  15. Teeing
  16. On-Call Experience
  17. Shadowing
  18. Zero Downtime
  19. Load Balancing
  20. Config Change Testing
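Several of these Post-Release cards, such as Anomaly Detection on Business Metrics, can start out very simply. A naive sketch in Python, assuming all we want is to flag values that jump out of a trailing window (real anomaly detection would also account for trends and seasonality):

```python
from statistics import mean, stdev

def detect_anomalies(values, window=5, threshold=3.0):
    """Return indices of points deviating more than `threshold`
    standard deviations from the trailing `window` of values."""
    alerts = []
    for i in range(window, len(values)):
        recent = values[i - window:i]
        mu, sigma = mean(recent), stdev(recent)
        if sigma and abs(values[i] - mu) > threshold * sigma:
            alerts.append(i)  # in real life: fire an alert here
    return alerts
```

Even a sketch like this catches the ‘orders suddenly dropped to zero’ class of problems, which is exactly the kind of signal you want before a customer phones you.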

Wrapping up

I’m out of my depth here. There’s so much I need to investigate, learn and put into words for myself before I can make it into a valuable tool for you. I welcome any feedback.
Thank you for being such an amazing part of this journey already.

The winning team of the RiskStorming workshop at TestIT in Malmö

Reflecting on the last project

Experience Reports, Software Testing

This is a post written by Geert van de Lisdonk about a project he worked on for 1.5 years as a test consultant.

My last project was in the financial sector. The product we made was used by private banks. Our users were the Investment Managers of those banks. They are called the rock stars of the banking world. Those people didn’t have time for us. We could get little information from them via their secretaries or some meticulously planned meeting. And in that meeting only 1 or 2 of our people could be present, to make sure they didn’t spend too much time. Getting specs, finding out what to build and building the right thing was not an easy task. Our business analysts had their work cut out for them but did a very good job with the resources they had. Getting feedback from the users was even more difficult, especially getting it fast so we could change the product quickly. Despite all that, we were able to deliver a working product to all clients. This blogpost is a reflection on what I think I did well, what I didn’t do well and what I would have changed if I could have done it over.


What we did well

Handovers

One of the best things we did, in my opinion, was the handovers. Every time something was developed, a handover was done. This handover consisted of the developer showing what had been created to me, the tester, and to the product owner.
This moment creates an opportunity for the PO to verify whether the correct thing has been built, or to point out possible improvements.
As a tester, this is a great source of information. With both the developer and the PO present, all possible questions can be answered. Technical, functional and everything in between can be reviewed and corrected if necessary.

Groomings

Getting the tester involved early is always a good idea. When the Business Analysts had decided on what needed to be made, a grooming session was called to discuss how we could achieve it.
Most of the time there was already some kind of solution prepared by the Product Manager that would suit the needs of several clients. This general solution would then be discussed.

For me this was a moment I could express concern and point out risks. This information would also be a base for the tests I’d be executing.

Teamwork

The team I was in is what I would describe as a distributed team. We had team members in Belgium, the UK and 2 places in Italy. Working together wasn’t always easy. In the beginning most mass communication was done using emails sent to the entire team. This didn’t prove very efficient, so we switched to Microsoft Teams.

There was one main channel which we used the most. Some side channels were also set up that would be used for specific cases. People in the team were expected to have Teams open at all times. This sometimes didn’t happen and caused problems. It took some getting used to, but after a while I felt like we were doing a decent job!



What we could have done better

Retrospectives

When I first joined the team, the stand-ups happened very ad hoc. You could get a call anywhere between 9am and 3pm, or none at all. Later, a meeting was booked with a link to a group Skype conversation. Everybody was now expected to join this conversation at 10am for the stand-up. This was a great improvement! Every sprint we would start with a planning meeting and set out the work we were supposed to do.

But there were also ceremonies missing. At no point in time was there a sprint review or a retrospective. This meant that developers didn’t know from each other what had been finished or what the application was currently capable of.

The biggest missing ritual, in my opinion, was the retrospective. There was no formal way of looking at how we did things and discussing how we could improve. Having a distributed team didn’t help here, and neither did the high pace we were trying to maintain. But if the PM had pushed harder for this, I think the team could have benefited a lot.

Unit testing

There was no incentive to write unit tests, so there were only a handful of them. Not because the developers didn’t want to; they even agreed that we should write them! There was just nobody waiting for them, so they didn’t get written.
There were multiple refactorings of code that could have been made safer with unit tests. Many bugs were discovered that wouldn’t have existed if only some unit tests had been written. But since nobody asked for them, and the pace was too high, no time was spent on them.

Less pressure

This project was run at a fast pace. Between grooming and delivery there were sometimes only 3 sprints: 1 for analysis, 1 for development, 1 for testing/fixing/deploying. This got us in trouble lots of times. When new questions arose or new requirements emerged during development, there was little time for redirection. Luckily we were able to diminish the scope most of the time, but I still feel we delivered lower quality than we would have liked.


What I would have done differently

Reporting

Looking back, it was difficult for the PM to know exactly what I was doing. We used TFS to track our work, but it wasn’t very detailed. The stand-ups did provide some clarity, but only a partial message.

My testing was documented in OneNote on SharePoint, so he technically could verify what I was doing, although admittedly that would require a lot of energy from him.
I think he would have preferred pass/fail test cases, but I didn’t deem that feasible with the high pace we were trying to maintain.
In hindsight I could have delivered weekly or sprint reports of what was done and which issues were found or resolved. This would of course take some time at the end of the sprint, which could be an issue. I did look for a decent way to report on my testing but never found a format that suited me.

Fix more bugs myself

We were working with Dynamics CRM, altered to fit the needs of our customers. Both the system and the product were built in such a way that most settings could be altered in the UI. It took me a while to learn how these things worked, but I managed to resolve bugs myself. Sometimes I didn’t know how to resolve them in the UI. I would then take the opportunity to have the developers explain how to resolve it the next time I encountered something similar.

Since the framework restricted us in some ways, we also made use of C# middleware to deal with more complex things. The middleware issues were harder for me to resolve, so I don’t think I would have been able to fix those by myself. The middleware developers being in Italy also complicated things. Pairing on the bug fixes could have taught me a lot. This did happen from time to time, but not frequently enough for me to dive in and sort things out myself.
Additionally, having more insights into the application would have been a nice luxury to have. Through tools such as ‘Dynatrace’, Application Insights,… I could have provided more information to the developers.


To summarize

Despite the high pace this project was run at, we still managed to do very good things. The people on the development team were very knowledgeable and taught me a lot. Sure, there were some things I would have liked to change, but that will always be the case. To me the biggest problem was that we didn’t reflect on ourselves. This meant we stagnated on several levels and only grew slowly as a team and a product.
I learned that I value (self-)reflection a lot, more than I previously knew. I started looking for other ways to reflect. At DEWT I was advised to start a journal for myself. This is something I plan on doing for my next project. Currently I have a notebook that I use for all testing-related events. Not yet a diary, but a start of this self-reflection habit!
I also learned how I like to conduct my testing and where problems might have arisen there: focus on people and software instead of documentation. I would add some kind of reporting to show off my work. I’ve been looking into good ways to do this reporting, but am yet to find a format that suits me.