Fool Me Once: Winning the War on Flaky Tests

Everybody hates flaky tests. Lurking in the shadows, they result in confusion and waste your time. So what exactly are they and how can you avoid them?


Sofy x tairinara, Shutterstock

Has a friend ever flaked on you? You had an outing all planned out, reservations or arrangements made, and—right before you get in the car to meet them—here comes a long-winded story about car troubles. 

They flaked on you. And it’s the worst.

A little bit of trust is lost and you’re beginning to question their motives. Did they really have car issues, or was it something you said?

Unfortunately, it’s not just friends that can be flaky—QA tests can be flaky too.

What is a flaky test?

A flaky test is a test that fails to produce consistent results each time it’s run. Flaky tests can provide a false sense of confidence in your application or waste developer time and energy troubleshooting issues that simply do not exist.

Tests can be flaky for two main reasons: (a) internal factors or (b) external factors. Internal flakiness results from the test code itself, while external flakiness comes from a dependency outside the code, like third-party integrations, network latency, and other factors that you don’t necessarily control.

Flaky tests are unreliable because they can produce a passing result one second and a failing result the next, leaving you with no accurate feedback to base decisions on. Like your friend, you can’t count on them!
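One rough way to see that pass-then-fail behavior for yourself is to run the same test many times under identical conditions and compare the outcomes. The Python sketch below is only an illustration: `classify_test`, `stable_test`, and `make_unstable_test` are hypothetical names, and the "unstable" test fails on every third call purely to stand in for a timing or network hiccup.

```python
from typing import Callable


def classify_test(test: Callable[[], None], runs: int = 20) -> str:
    """Run `test` repeatedly and report 'pass', 'fail', or 'flaky'."""
    outcomes = set()
    for _ in range(runs):
        try:
            test()
            outcomes.add("pass")
        except AssertionError:
            outcomes.add("fail")
    # Mixed outcomes under identical conditions are the signature of flakiness.
    return "flaky" if len(outcomes) > 1 else outcomes.pop()


def stable_test() -> None:
    assert 1 + 1 == 2


def make_unstable_test() -> Callable[[], None]:
    """Build a hypothetical test that fails on every third call."""
    calls = {"n": 0}

    def unstable_test() -> None:
        calls["n"] += 1
        assert calls["n"] % 3 != 0

    return unstable_test


print(classify_test(stable_test))
print(classify_test(make_unstable_test()))
```

A test that comes back as flaky here deserves a closer look before anyone trusts its next green run.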

Here’s how to spot a flaky test and—importantly—what to do about it when you encounter one.

Examples of flaky tests

Flaky tests can take a variety of forms, all of them intensely annoying. Here are just a few examples:

  • A call to an external service returns results one day and a 404 the next. This often happens when testers don’t mock the service and instead rely on a third party that they don’t control.
  • A test that checks whether a dynamic component renders after a specific action fails because it doesn’t wait long enough for the component to appear. Sometimes it passes when the network is faster.
  • A test fails when developers make small, insignificant changes, like changing the label of a button, moving the location of a component, or making small changes to navigation.
  • A test relies on another process finishing before it runs, so it fails sporadically whenever that process is delayed.
  • Tests pass consistently on one device, like an Android phone, but fail on an iPhone, despite being built from the same codebase.
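The rendering example is a timing problem, and a common remedy is to poll for the condition with a timeout rather than checking once or sleeping for a fixed period. Here is a minimal Python sketch of that idea. `wait_until` and `FakeComponent` are hypothetical names invented for illustration, and the fake component simply becomes "rendered" after a delay; a real UI test would ask its automation tool instead.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout: float = 5.0,
               interval: float = 0.05) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return condition()  # one last check at the deadline


class FakeComponent:
    """Hypothetical stand-in for a UI component that renders after a delay."""

    def __init__(self, render_delay: float) -> None:
        self._ready_at = time.monotonic() + render_delay

    def is_rendered(self) -> bool:
        return time.monotonic() >= self._ready_at


component = FakeComponent(render_delay=0.2)
# The test waits only as long as it needs to, up to the timeout.
assert wait_until(component.is_rendered, timeout=2.0)
```

The test passes as soon as the component shows up and fails only if the timeout is genuinely exceeded, so a slow network makes it slower rather than red.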

Effects of flaky tests

Flaky tests are insidious, and their effects can go undetected for a long time. Left to work its sinister magic, an unidentified flaky test can slow development by causing tests that should pass to fail intermittently for no apparent reason, leaving developers scrambling to fix problems that don’t exist.

Flaky tests can also hide a serious bug or defect by producing false positives (passes). For example, a test may not clear the browser cache before it executes, so it runs in a different credentialed session than expected. As a result, testers lack an accurate picture of critical security behavior, and the testing team makes incorrect assumptions about the application’s security. Sinister!

In short, flaky tests can overrepresent a non-issue and underrepresent or hide a critical defect. Either situation can lead to anything from inconvenience to total disaster. It’s in your best interest to take flaky tests very seriously and ensure that they don’t manifest anywhere in your DevOps cycle.
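The leftover-session problem comes down to tests sharing state. One way to guard against it, sketched below in Python with the standard `unittest` module, is to build a fresh session before every test. The `Session` class and the admin-page check are hypothetical stand-ins for a real browser session, not part of any actual tool.

```python
import unittest


class Session:
    """Hypothetical stand-in for a browser session with cached credentials."""

    def __init__(self) -> None:
        self.user = None

    def login(self, user: str) -> None:
        self.user = user

    def can_view_admin_page(self) -> bool:
        return self.user == "admin"


class AdminPageTests(unittest.TestCase):
    def setUp(self) -> None:
        # A fresh session per test, so credentials from an earlier
        # test cannot leak into this one.
        self.session = Session()

    def test_admin_can_view(self) -> None:
        self.session.login("admin")
        self.assertTrue(self.session.can_view_admin_page())

    def test_guest_cannot_view(self) -> None:
        # No login here. With a session shared across tests, the result
        # would depend on which test happened to run first.
        self.assertFalse(self.session.can_view_admin_page())
```

Because `setUp` runs before each test method, both tests give the same answer no matter what order they run in.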

Flaky pastry good, flaky test bad. Image: Mae Mu, Unsplash.

Most common causes of flaky tests 

Each test could be flaky for its own reason, but you’d be wise to keep an eye out for the following:

  • Flaky third-party API: A test is only as strong as its weakest link. When a test relies on a third-party API that it doesn’t control, it can be difficult to maintain an environment conducive to consistent results.
  • Poorly written tests: If you haven’t defined your testing strategy and you’ve written your tests poorly, your tests are far more likely to be flaky. If a test case doesn’t feature a falsifiable assertion, meaning that it will fail in almost no situation, then the test is not doing its job. If a test class does not account for its dependency on other test classes, that is a likely source of flakiness. Tests need to be reliable and predictable, so each one must initialize to the right state in a properly prepped environment. Another sign of a poorly written test is that it references specific coordinates or relies heavily on XPath.
  • Lack of documentation: Documentation is an often overlooked aspect of testing. Are test cases documented to share what purpose they serve, what pre-conditions they need, and what assertions they make? Does the test fit the overall testing strategy? Unless you want a visit from the flaky test fairy, you should be sure that you answer yes to those questions!
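To make the first two causes concrete, here is a small Python sketch that replaces a third-party API with a mock and checks the result with a falsifiable assertion. Everything named here is an assumption made for illustration: `shipping_quote`, the `get_rate` method, the postcode, and the 2.50 handling fee do not come from any real service.

```python
from unittest.mock import Mock


def shipping_quote(client, postcode: str) -> float:
    """Code under test: adds a flat handling fee to the carrier's rate."""
    response = client.get_rate(postcode)
    return round(response["rate"] + 2.50, 2)


def test_shipping_quote_adds_handling_fee() -> None:
    # The mock replaces the real carrier API (hypothetical), so the test
    # never touches the network and cannot fail because of an outage.
    client = Mock()
    client.get_rate.return_value = {"rate": 7.25}

    # A falsifiable assertion: it fails if the fee logic ever changes.
    assert shipping_quote(client, "98101") == 9.75
    client.get_rate.assert_called_once_with("98101")


test_shipping_quote_adds_handling_fee()
```

The mock controls the environment the bullet list says is hard to control, and the exact expected value gives the test a real way to fail.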

No-code solutions

Avoiding flaky tests is one of the no-code revolution’s major strengths. For example, Sofy’s UI/UX testing capabilities allow you to focus on the logic while the platform’s no-code capabilities automatically work out the best way to meet your test case.

Sofy’s machine learning capabilities determine the purpose and context behind your test, not just the steps to execute. This means that if you want to test adding an item to a cart, Sofy is intelligent enough to know what to do instead of how to do it, so if the add to cart button name changes, Sofy will still identify it.

Using AI to analyze application code changes, Sofy lets you know which tests will need to be refactored based on your changes. Sofy creates tests to be functional across every device and runs tests on real devices, not simulations. So whether you’re writing for Android or Apple devices, your tests will function the same.

Sofy also monitors user behavior within your application and can automatically build sound, repeatable test cases based on a foundation of best practices. This ensures that you’re not only testing the right thing, but you’re doing it right.

May your croissants be flaky—not your tests

Look, it’s no secret that flaky tests are the scourge of the QA cycle. They’re hard to identify and unreliable. They can leave serious gaps in your testing strategy, go undetected for a long time, and make it a lot more painful to shift left.

Staying ahead of flaky tests involves a commitment to a quality testing strategy, documenting and following a consistent path to create test cases, including delineating what to test and when, and standardizing the assertions that are used. It also involves consistent monitoring of the current test library so updates are made as functionality changes.

No-code testing platforms like Sofy allow you to simplify and automate test case generation. Documentation is ready to go, and a foundation of best practices allows you to write meaningful tests that won’t flake on you.

Unlike your so-called friend! 🙁


