
Saturday, October 26, 2019

The Power of Not Knowing

Recently I saw a tweet from Ben Simo (@QualityFrog) in which he mentioned that he sometimes likes to practice what he calls "intentional ignorance": he doesn't read some of the documentation or code for a new feature, to see what he can find while doing exploratory testing.  His tweet reminded me that I used to do this too!

I haven't done this in a while, because the team I work on is a great Agile team.  The testers are invited to the feature grooming sessions, each story has acceptance criteria written, and the developers do a feature handoff with the testers when each story is ready for testing.

But at previous companies, I was often given a story to test with no feature handoff and no acceptance criteria.  Sometimes the story wouldn't even have a description, and would have some cryptic title, like "Endpoint for search".  I would usually be annoyed by this, and I would ask for clarification, but I would first use it as an opportunity to do some exploratory testing while I had no preconceived notions about what the feature could or could not do.  And while testing in this fashion, I would often find a bug, show it to the developer, and have him or her say, "Oh, it never even occurred to me to test the feature in that way."

Of course I don't want to go back to the days of cryptic story titles with no description!  But testing without knowing what the feature does can have some benefits:

  • You approach the application the same way a user would.  When your users see your new feature for the first time, they don't have the benefit of instructions.  By trying out the feature without knowing how it works, you could discover that an action button is hard to find, or that it's difficult to know what to do first on a multi-part form.  
  • You might try entering data that no one was expecting.  For example, there could be a form field where the date was supposed to be entered with month and day only, but you enter in the month, day, and year, which breaks the form.  
  • Without any instructions from the developer, you might think of other features to test the new feature with, besides those the developer thought of.  Those feature combinations might yield new bugs.

So how can we add these advantages back into our testing without skipping reading the acceptance criteria and having feature handoffs?  Here are a few ways:

  • Pair test with someone on another team.  At my company we have many teams, each of which often has no idea what the other teams are building.  Four times a year, the software testers get together in pairs where the two testers are from very different teams, and they swap applications and start testing.  This is a great way to find bugs and user experience issues!
  • When you start testing, spend some time just playing around with the new feature before writing a test plan.  By exploring in this way, you might come up with some unusual testing ideas.
  • After you've tested the acceptance criteria, take some time to think about what features might be used with the new feature.  What happens when you test them together?  For example, if you were testing a new page of data, you could test it with the global sort feature that already exists in your application.
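
If you want to be a little more systematic about that last idea, you can list the features you already have and pair each one with the new feature.  Here is a minimal sketch in Python; the feature names and the two helper functions are invented for illustration and aren't from any real application:

```python
from itertools import combinations

def pairings_with_new_feature(new_feature, existing_features):
    """Return every (new, existing) pair worth a quick exploratory session."""
    return [(new_feature, existing) for existing in existing_features]

def all_pairings(features):
    """Return every unordered pair of features, for a broader sweep."""
    return list(combinations(features, 2))

# Hypothetical feature names, used only as an example.
existing = ["global sort", "search filter", "CSV export"]
for new, old in pairings_with_new_feature("new data page", existing):
    print(f"Try '{new}' together with '{old}'")
```

The list is only a prompt for exploration; which pairings are worth your time is still a judgment call.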

Of course, there are also times where not knowing all the details about a feature is detrimental.  There have been times in my testing career where I tested a feature and completely missed something that the feature could do, because no one told me about it.  That's why I'm glad that we have acceptance criteria and feature handoffs.  But there are also times when not knowing can yield some of the most interesting bugs.

Saturday, October 19, 2019

Your Flaky Tests Are Destroying Trust

Anyone who has ever written an automated test has experienced test flakiness.  There are many reasons for flaky tests, including:
  • Environmental issues, such as the application being unavailable
  • Test data issues, where an expected value has been changed
  • UI issues, such as a popup window taking too long to appear

All of these reasons are valid explanations for flaky tests.  However, they are not excuses!  It should be your mission to have all of your automated tests pass every single day, except of course when an actual bug is present.

This is important not just because you want your tests to be reliable; it's important because when you have flaky tests, trust in you and in your team is eroded.  Here's why:

Flaky tests send the message that you don't care
Let's say you are the sole automation engineer on a team, and you have a bunch of flaky tests.  It's your job to write test automation that actually checks that your product is running correctly, and because your tests are flaky, your automation doesn't do that.  Your team may assume that this is because you don't care about whether your job is done properly.

Flaky tests make your team suspect your competence
An even worse situation than the previous example is one where your team simply assumes that you haven't fixed the flaky tests because you don't know how.  This further erodes their trust in you, which may spill over into other testing.  If you find a bug when you are doing exploratory testing, your colleagues might not believe that you have a bug, because they think you are technically incompetent.

Flaky tests waste everyone's time
If you are part of a large company where each team contributes one part of an application, other teams will rely on your automation to determine whether the code they committed works with your team's code.  If your tests are failing for no reason, people on other teams will need to stop what they are doing and troubleshoot your tests.  They won't be pleased if they discover that there's nothing wrong with the app and your tests are just being flaky.

Flaky tests breed distrust between teams
If your team has a bunch of flaky tests that fail for no good reason, and you aren't actively taking steps to fix them, other teams will ignore your tests, and may also doubt whether your team can be relied upon.  In a situation like this, if Team B commits code and sees that Team A has failing tests, they may do nothing about it, and may not even ask Team A about the failures.  If there are tests that fail because there are real issues, your teams might not discover them until days later.

Flaky tests send a bad message to your company's leadership
There's nothing worse for a test team than to have test automation where only 80% (or fewer) of the tests pass on a daily basis.  This sends a message to management that either test automation is unreliable, or you are unreliable!

So, what can we do about flaky tests?  I'd like to recommend these steps:

1. Make a commitment to having 100% of your tests pass every day.  The only time a test should fail is if a legitimate bug is present.  Some might argue that this is an impossible dream, but it is one to strive for.  There is no such thing as perfect software, or perfect tests, but we can work as hard as we can to get as close as we can to that perfection.

2. Set up alerts that notify you of test failures.  Having tests that detect problems in your software doesn't help if no one is alerted when test failures happen.  Set up an alert system that will notify you via email or chat when a test is failing.  Also, make sure that you test your alert.  Don't assume that because the alert is in place it is automatically working.  Make a change that will cause a test to fail and check to see if you got the notification.

3. Investigate every test failure and find out why it failed.  If the failure wasn't due to a legitimate bug, what caused the failure?  Will the test pass if you run it again, or does it fail every time?  Will the test pass if you run it manually?  Is your test data correct?  Are there problems with the test environment?
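
One quick way to answer the "does it pass if you run it again?" question in step 3 is to re-run the failing test several times and look at the pattern.  The sketch below assumes the test is a plain Python function that raises AssertionError on failure; `classify_failure` is a hypothetical helper, not part of any test framework:

```python
def classify_failure(test_fn, reruns=5):
    """Re-run a failed test and describe how it behaves.

    Returns "consistent failure" if every rerun fails, "passes on rerun"
    if every rerun passes, and "flaky" if the results are mixed.
    """
    outcomes = []
    for _ in range(reruns):
        try:
            test_fn()
            outcomes.append(True)
        except AssertionError:
            outcomes.append(False)
    if all(outcomes):
        return "passes on rerun"
    if not any(outcomes):
        return "consistent failure"
    return "flaky"

# A stand-in test that fails on every odd-numbered call.
calls = {"n": 0}

def sometimes_fails():
    calls["n"] += 1
    assert calls["n"] % 2 == 0, "simulated intermittent failure"

print(classify_failure(sometimes_fails))  # prints "flaky"
```

A consistent failure points toward a real bug or broken test data; mixed results point toward timing or environment problems.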

4. Remove the flaky tests.  Some might argue that this is a bad idea because you are losing test coverage, and the test passes sometimes.  But this doesn't matter, because when people see that the test is flaky they won't trust it anyway.  It's better to remove the flaky tests altogether so that you demonstrate that you have a 100% passing rate, and others will begin to trust your tests.

An alternative would be to set the flaky tests to be skipped, but this might also erode trust.  People might see all the skipped tests and see them as a sign that you don't write good test automation.  Furthermore, you might forget to fix the skipped tests.

5. Fix all the flaky tests you can.  How you fix the flaky tests will depend on why they are flaky.  If you have tests that are flaky because someone keeps changing your test data, change your tests so that the test data is set up in the test itself.  If you have tests that are flaky because sometimes your test assets aren't deleted at the end of the test, do a data cleanup both before and after the test.
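
Here is a rough sketch of both ideas from step 5, using Python's built-in unittest module.  The `FakeStore` class stands in for whatever shared database or API your tests really use, and the key and value are made up:

```python
import unittest

class FakeStore:
    """Stand-in for a shared test database (hypothetical)."""
    def __init__(self):
        self.records = {}

    def add(self, key, value):
        self.records[key] = value

    def delete(self, key):
        self.records.pop(key, None)  # deleting a missing key is not an error

    def get(self, key):
        return self.records.get(key)

STORE = FakeStore()  # shared between tests, the way a real environment would be

class ContactTest(unittest.TestCase):
    KEY = "test-contact-1"

    def _cleanup(self):
        STORE.delete(self.KEY)

    def setUp(self):
        self._cleanup()                     # remove leftovers from an earlier run
        STORE.add(self.KEY, "Test Contact")  # create the data this test depends on
        self.addCleanup(self._cleanup)      # clean up afterwards, even if the test fails

    def test_contact_can_be_read(self):
        self.assertEqual(STORE.get(self.KEY), "Test Contact")

def run_suite():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ContactTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)

if __name__ == "__main__":
    run_suite()
```

Because the test creates its own data and cleans up on both sides, it no longer matters whether someone changed the shared data or whether a previous run crashed halfway through.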

6. Ask for help.  If your tests are flaky because the environment where they are running is unreliable, talk to the team that's responsible for maintaining the environment.  See if there's something they can do to solve the problem.  If they are unresponsive, find out if other teams are experiencing the issue, and lobby together to make a change.

7. Test your functionality in a different way.  If your flaky test is failing because of some element on the page that isn't loading on time, don't try to solve the issue by making your waits longer.  See if you can come up with a different way to test the feature.  For example, you might be able to switch that test to an API test.  Or you might be able to verify that a record was added in the database instead of going through the UI.  Or you might be able to verify the data on a different page, instead of the one with the slow element.

Some might say that not testing the UI on that problematic page is dangerous.  But having a flaky test on this page is even more dangerous, because people will just ignore the test.  It would be better to stick with an automated test that works, and do an occasional manual test of that page.
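
As an illustration of the database option in step 7, here is a small sketch that checks for a new record directly instead of waiting on a slow page.  It uses an in-memory SQLite database so it can run anywhere; the `contacts` table, the `create_contact` function, and the email address are all assumptions made for the example:

```python
import sqlite3

def create_contact(conn, name, email):
    """Stand-in for the application code under test (hypothetical)."""
    conn.execute("INSERT INTO contacts (name, email) VALUES (?, ?)", (name, email))
    conn.commit()

def contact_exists(conn, email):
    """Check the database directly rather than waiting for a page to render."""
    row = conn.execute(
        "SELECT COUNT(*) FROM contacts WHERE email = ?", (email,)
    ).fetchone()
    return row[0] == 1

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, email TEXT)")

create_contact(conn, "Test User", "test.user@example.com")
assert contact_exists(conn, "test.user@example.com")
```

A check like this has no element to wait for, so there is nothing in it that can time out.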

Quality Automation is Our Responsibility

We've all been in situations where we have been dismissed as irrelevant or incompetent because of the reputation of a few bad testers.  Let's create a culture of excellence for testers everywhere by making sure that EVERY test we run is reliable and provides value!

Saturday, October 12, 2019

Why You Should Be Testing in Production

This is a true story; I'm keeping the details vague to protect those involved.  Once there was a software team that was implementing new functionality.  They tested the new functionality in their QA environment, and it worked just fine.  So they scheduled a deployment: first to the Staging environment, then to Production.  They didn't have any automated tests for the new feature, because it was tricky to automate.  And they didn't bother to do any manual tests in Staging or Production, reasoning that if it worked in the QA environment, it must work everywhere.

You can probably guess what happened next: they started getting calls from customers that the new feature didn't work.  They investigated and found that this was true.  Then they tried out the feature in the Staging environment and found that it didn't work there either.  As it turned out, the team had used hard-coded configuration strings that were only valid in the QA environment.  If they had simply done ONE test in the Staging or Production environment, they would have noticed that something was wrong.  Instead, it was left to the customers to notice the problem.

There are two main reasons why things that work in a QA environment don't work in a Production environment:

1) Configuration problems: This is what happened with the team described above.  Software is complicated, and there are often multiple servers and databases that need to talk to each other in order for the software to work properly.  Keeping software secure means that each part of the application needs to be protected by passwords or other configuration strings.  If any one of those strings is incorrect, the software won't work properly.

2) Deployment problems: In this age of microservices, deploying software usually means deploying several different APIs.  In a large organization, there may be different teams responsible for different APIs.  For example, when a new feature in API A needs the new code in API B to work properly, API B will need to be deployed first.  It's possible that Team B will forget to deploy API B or not even realize that it needs to be deployed.  In cases like this, Team A might assume that API B had been deployed, and they will go ahead and deploy API A.  Without testing, Team A will have no way of knowing that the new feature isn't working.

By running tests in every environment, you can quickly discover if you have configuration or deployment problems.  It's often not necessary to go through extensive testing of a new feature in Production if you've already tested it in QA, but it is vital that you do at least SOME testing to verify that it's working!  We never want to have our customers find problems before we do.
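
To make this concrete, here is a sketch of running one and the same check against every environment.  Everything in it is hypothetical: the environment names, the hostnames, and the `new_feature_check` function, which simulates the hard-coded QA configuration from the story above:

```python
def run_in_every_environment(environments, check):
    """Run the same single check against each environment and collect the results."""
    results = {}
    for name, config in environments.items():
        try:
            check(config)
            results[name] = "pass"
        except Exception as error:
            results[name] = f"FAIL: {error}"
    return results

# Invented environments; the hostnames are placeholders.
ENVIRONMENTS = {
    "qa": {"search_service": "search.qa.example.com"},
    "staging": {"search_service": "search.staging.example.com"},
    "production": {"search_service": "search.example.com"},
}

def new_feature_check(config):
    # Simulates the bug in the story: a QA-only value is hard-coded,
    # so the feature only works where the environment happens to match it.
    hard_coded = "search.qa.example.com"
    if config["search_service"] != hard_coded:
        raise RuntimeError(f"cannot reach {hard_coded} from this environment")

for env, outcome in run_in_every_environment(ENVIRONMENTS, new_feature_check).items():
    print(f"{env}: {outcome}")
```

In this simulation only the QA run passes, which is exactly the signal the team in the story never got because they stopped testing after QA.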

Saturday, October 5, 2019

Confused? Simplify!

As testers, we are often asked to test complex systems.  Gone are the days when testers were simply asked to fill out form fields and hit the Submit button; now we are testing data stores, cloud servers, messaging services, and much more.  When so many building blocks are used in our software, it can become easy to get overwhelmed and confused.  When this happens, it's best to simplify what we are testing until our situation becomes clear.

Here's an example that happened recently on my team: we were testing that push notifications of a specific type were working on an iPhone.  One of my teammates was triggering a push notification, but it wasn't appearing on the phone.  What could be wrong?  Maybe notifications were completely broken.  Maybe they were broken on the iPhone.  Maybe only this specific notification was broken.  Maybe only notifications of this type were broken.  In a situation where there are a lot of notifications to test and we are working on a deadline, this can become very confusing. 

So, we simplified by asking a series of questions and running a test for each one.  We started with:
Is this push notification working on an Android phone?
We triggered the same notification to go to an Android phone, and the push was delivered.  So we ruled out that the notification itself was broken.

Next, we asked:
Is this push notification working on any other iPhone?
We triggered the same notification to go to a different iPhone, and the push was delivered.  So we ruled out that the notification was broken on iOS devices.

Then we asked:
Is ANY notification working on this specific iPhone? 
We triggered some different notifications to go to the iPhone, and no pushes were delivered.  So we concluded that the problem was not with the notification, or with the push service; the problem was with the phone.

In taking a step back and asking three simple questions, we were able to quickly diagnose the problem.  Let's take a look at another example, using my hypothetical feature called the Superball Sorter, which sorts small and large colored balls among four children, as described in this post.

Let's imagine that we are testing a scenario where we are sorting the balls by both size and color.  We have the children set up with the following rules:
Amy gets only large balls
Bob gets only small purple balls and large red balls
Carol gets only small balls
Doug gets only green balls

When we run the sorter, a small purple ball is next in the sorting process, and it's Bob's turn to get a ball.  We are expecting that Bob is going to get the small purple ball because his sorting rules allow it, but he doesn't get the ball; it goes to Carol instead.  What could be wrong here?  Maybe Bob isn't getting any balls.  Maybe the purple ball isn't being sorted at all.  Maybe only the small balls aren't being sorted.  How can we figure out what is going on?

Our first question will be:
Can Bob get ANY sorted balls?  
We'll set up the sorter so Amy, Carol, and Doug only get large balls, and Bob only gets small balls.  We run the sorter, and Bob gets all the small balls.  So we know this isn't the problem.

Can anyone get the small purple ball?
Next, we'll set up the sorter so that Amy will only get small purple balls, and Bob, Carol, and Doug can get any ball at all.  We'll set up our list of balls so that the small purple ball is first on the list.  When we start our sorting process with Amy, she gets the small purple ball.  So now we know that the small purple ball isn't the problem.

Can Bob get the small purple ball in some other scenario?
We saw in our initial test that Bob wasn't getting the small purple ball, but can he EVER get that ball?  We'll set up our rules so that Amy will only get large balls, and Bob will get only small purple balls.  We won't give Carol and Doug any rules. Then we'll set up our list of balls so that the small purple ball is first on the list.  Amy won't get the small purple ball, because she only gets large balls, so the small purple ball is offered to Bob.  He gets the ball, so now we know that Bob can get the small purple ball in some scenarios.

At this point, we know that the problem is not the small purple ball.  What is different between the original scenario and the one we just ran?  One difference is that in the original scenario, all four children had a rule.  So let's ask this question:

Can Bob get the small purple ball when it's his only rule, and the other children all have rules?
We'll set up the rules like this:
Amy gets only large balls
Bob gets only small purple balls
Carol gets only small balls
Doug gets only green balls
We again set up our list of balls so that the small purple ball is first on the list.  The ball skips Amy, because it doesn't meet her rule, and Bob gets the ball.  So the problem is not that all the children have rules.  The next logical question is:

What happens when Bob has TWO rules?
We'll set up the rules like this:
Amy gets only large balls
Bob gets only small purple balls and small yellow balls
Carol gets only small balls
Doug gets only green balls

Our list of balls is the same, where the small purple ball is first.  This time, the ball skips Amy AND Bob, and Carol gets the small purple ball.

AHA!  Now we have a good working theory: when Bob has two rules, the sorting is not working correctly.  We can test out this theory by giving another child two rules, while giving everyone else one rule.  Are the balls sorted correctly?  What about when a child has two rules that specify color only and not size?  Will the two rules work then?  By continuing to ask questions, we can pinpoint precisely what the bug is.
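
The last two experiments can be sketched in code.  This is only a toy model: I'm assuming that each ball is offered to the children in order until one of them accepts it, and `buggy_accepts` is a simulated defect I made up to stand in for the real one, whatever it turns out to be:

```python
def accepts(rules, ball):
    """A child with no rules accepts anything; otherwise any one rule must match.

    Each rule is a (size, color) pair, where None means "any".
    """
    if not rules:
        return True
    size, color = ball
    return any((s is None or s == size) and (c is None or c == color)
               for s, c in rules)

def buggy_accepts(rules, ball):
    """Simulated defect, for illustration only: two or more rules never match."""
    if len(rules) >= 2:
        return False
    return accepts(rules, ball)

def sort_balls(children, balls, accepts_fn=accepts):
    """Offer each ball to the children in turn, starting after the last recipient."""
    names = list(children)
    received = {name: [] for name in names}
    turn = 0
    for ball in balls:
        for offset in range(len(names)):
            name = names[(turn + offset) % len(names)]
            if accepts_fn(children[name], ball):
                received[name].append(ball)
                turn = (turn + offset + 1) % len(names)
                break
        # A ball that nobody accepts is simply dropped in this model.
    return received

ball = ("small", "purple")
one_rule = {
    "Amy": [("large", None)],
    "Bob": [("small", "purple")],
    "Carol": [("small", None)],
    "Doug": [(None, "green")],
}
two_rules = dict(one_rule, Bob=[("small", "purple"), ("small", "yellow")])

for label, children in [("one rule", one_rule), ("two rules", two_rules)]:
    result = sort_balls(children, [ball], accepts_fn=buggy_accepts)
    winner = next(name for name, got in result.items() if got)
    print(f"Bob has {label}: the ball goes to {winner}")
```

The only thing that changes between the two runs is Bob's second rule, and that is the point: when one variable changes at a time, any difference in the outcome tells you where to look next.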

By making your tests as simple as possible, you are able to narrow down the possibilities of where the bug is.  And by proceeding methodically and logically, you will be able to find that bug as quickly as possible, even in a very complex system.  

New Blog Location!

I've moved!  I've really enjoyed using Blogger for my blog, but it didn't integrate with my website in the way I wanted.  So I&#...