Test Metrics: The 3 That Are Critical to Your QA Health

Today we’re here to talk about test metrics. To start understanding why this topic is so crucial, consider this sentence: “If you can’t measure it, you can’t improve it.” Have you ever heard that saying? If I had to bet, I’d say yes. This is one of the most well-known quotes in business management; people often attribute it to management guru Peter Drucker.

What does that quote mean? Put shortly, you can’t know if a given process is improving if you don’t capture data about it. Think of a sprinter training for the Olympic Games. How are they supposed to know whether their time is improving if their trainer doesn’t carefully write it down?

Well, metrics aren’t just essential for professional sports—or business management, for that matter. The software industry also relies heavily on them. All software development subfields have many metrics you can track, and testing is certainly no exception. Not all metrics are created equal, however. While some are highly useful, others are less so. Some might be outright misleading. How are you to distinguish the good ones from the bad ones?

That’s exactly what this post is about. We’re going to show you a list of three test metrics you should be aware of if you want your organization’s QA strategy to remain at the top of its game. Let’s get started.

Cyclomatic Complexity

Cyclomatic complexity is, strictly speaking, less of a test metric and more of a general coding one. That is, it doesn’t apply only to tests and testing code, but to code in general. However, we felt it made sense to include it here because it’s so closely related to testing.

What Is It?

So, what is cyclomatic complexity? Put shortly, it’s a metric that indicates the number of possible paths in a block of code. Consider the following code sample:

function add(a, b) {
    return a + b;
}

As you can see, the code above presents a single possible path, so its cyclomatic complexity is 1. Now consider the following, slightly changed, version of the same function:

function add(a, b, allowNegativeNumbers) {
    if (allowNegativeNumbers) {
        return a + b;
    }

    return Math.abs(a) + Math.abs(b);
}

Here we have a third parameter, which callers of the function can use to indicate whether negative numbers should be allowed. If so, the addition is performed as normal. On the other hand, if negative numbers aren’t allowed, the function converts both a and b to their absolute values before performing the sum. This second version of the code clearly has a cyclomatic complexity of 2, since it has two possible execution paths: the if condition can evaluate to either true or false.

Why Does It Matter?

You might be wondering how all of that affects testing. Well, the higher the cyclomatic complexity of a given function, the more test cases you’ll need in order to test it. For instance, the function above needs, at the bare minimum, two test cases: one for each possible result of evaluating the if statement’s condition.
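
To make that concrete, here’s a minimal sketch of what those two test cases could look like, written with Jest-style test and expect calls (the input values are just examples chosen to exercise each branch of the function above):

// Covers the path where the if condition evaluates to true.
test('adds the raw values when negative numbers are allowed', () => {
    expect(add(-2, 5, true)).toBe(3);
});

// Covers the path where the if condition evaluates to false.
test('adds absolute values when negative numbers are not allowed', () => {
    expect(add(-2, 5, false)).toBe(7);
});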

So, keeping cyclomatic complexity low makes your testing efforts easier, since it decreases the minimum number of test cases needed. Besides, code with lower complexity is easier to understand and maintain in general, which means fewer bugs get introduced into the codebase to begin with.

How to Measure It?

It’s fairly easy to calculate the cyclomatic complexity of a single, small function. You start at 1 and increment by 1 every time you find a branching or looping instruction. The problem is that doing this manually doesn’t scale.
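
As an illustration of that counting rule, here’s a hypothetical function (made up purely for this example) annotated with the running count:

function describeOrder(order) {             // start the count at 1
    if (order.items.length === 0) {         // +1 for the if: 2
        return 'empty order';
    }

    for (const item of order.items) {       // +1 for the loop: 3
        if (item.price < 0) {               // +1 for the nested if: 4
            throw new Error('invalid price');
        }
    }

    return order.items.length + ' item(s)'; // cyclomatic complexity: 4
}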

Even though you can find standalone tools for calculating this metric, you’d probably be better off employing linters such as JSHint or ESLint. They have rules that deal not only with cyclomatic complexity but also with several other code smells, and they can help you improve the quality of your JavaScript code.
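
For instance, ESLint ships with a complexity rule that flags any function whose cyclomatic complexity exceeds a given threshold. A minimal configuration sketch might look like this (the limit of 5 is just an example value, not a recommendation):

// .eslintrc.js
module.exports = {
    rules: {
        // Warn whenever a function's cyclomatic complexity goes above 5.
        complexity: ['warn', { max: 5 }]
    }
};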

Code Coverage

The next item on our list is certainly the most well-known test metric: code coverage.

What Is It?

What is code coverage? Given a piece of code (it could be a function, a whole class or module, or even the entire codebase), code coverage shows the percentage of it that’s covered by tests. So, if your entire codebase has only four functions and you have tests covering three of them, you have 75% code coverage.

Why Does It Matter?

Why is code coverage an essential metric? To understand that, you have to take a step back and think of the role unit tests play in a testing strategy. Unit testing is a very particular type of testing, in that it can be somewhat unintuitive. Unlike other forms of testing, such as end-to-end testing, unit testing’s main purpose isn’t really to demonstrate to project stakeholders or users that the application works as intended.

Instead, unit testing is meant to give developers a confidence boost: knowing that they have a safety net—or an “alarm system,” as I like to say—developers will feel more confident in making changes to the code in order to improve it. So, a vital component of a successful unit testing strategy is trust. Developers need to trust their test suite; otherwise, the whole effort is useless.

That’s why code coverage is important. Developers won’t be able to trust their tests if they only cover a tiny percentage of the codebase.

How to Measure It?

There are plenty of tools you can use to measure code coverage. If you use, for instance, QUnit as a unit testing framework for JavaScript along with the Karma test runner, you can easily see coverage data by using the karma-coverage plugin. Another famous JavaScript tool used for code coverage is Istanbul.

Finally, some tools, like Jest, come with built-in code coverage.
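
With Jest, for example, collecting coverage is mostly a matter of configuration. Here’s a minimal sketch of a jest.config.js that turns coverage on and fails the run if line coverage drops below a threshold (the 80% figure is an arbitrary example):

// jest.config.js
module.exports = {
    collectCoverage: true,
    coverageThreshold: {
        global: {
            // Fail the test run if less than 80% of lines are covered.
            lines: 80
        }
    }
};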

Manual Testing vs. Automated Testing Ratio

As for our last metric, we have the manual vs. automated testing ratio.

What Is It?

This metric is exactly what its name suggests: the proportion of manual test cases relative to automated ones. Its inclusion might come as a surprise to some readers, given how strongly and frequently we advocate for automated testing. On this very blog, I’m guilty of saying that you should automate everything that can be automated.

And I stand by what I said. The thing is, not everything can be automated, and that’s why manual testing still has a role to play in an efficient testing approach. As it turns out, a manual approach is well suited for some specific types of testing, such as usability and exploratory testing.

Why Does It Matter?

Having said that manual testing still has its place, I must add that said place should be as small as possible. A large number of manual test cases will only get in your way, slowing down your process. You might be familiar with the concept of the testing pyramid. In short, your testing strategy should consist mostly of automated tests, with only a few manual tests scattered on top. That way, you can have your cake and eat it too: you benefit from the speed, repeatability, and reliability of automated tests, while still being able to count on manual tests in the areas where they can’t be replaced.

In other words: when it comes to the manual vs. automated testing debate, the answer lies in finding the right balance. And the right balance is heavily biased to the automated side.

How to Measure It?

To obtain this metric, you’ll need the help of the whole QA department; it’s more of a management problem than a technical one. First, obtain the number of monthly hours devoted to creating and maintaining automated tests of all kinds. You don’t need to factor in the time spent executing those tests, since that’s where the “automated” part does its magic. Then, obtain the total number of monthly hours spent creating, reviewing, and updating manual test plans, as well as executing them. After that, it’s only a matter of performing a division.

If you find out that manual tests account for more than 20% of all your testing efforts, that’s a sign you need to pursue automation more aggressively.
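
As a rough sketch of that arithmetic (the function and variable names below are made up for illustration, and the 20% threshold simply mirrors the rule of thumb above):

// Share of total testing hours spent on manual testing:
// hours creating, reviewing, updating, and executing manual test plans,
// divided by those hours plus the hours spent on automation work.
function manualTestingShare(manualHours, automationHours) {
    return manualHours / (manualHours + automationHours);
}

// Example: 30 manual hours vs. 120 automation hours -> 0.2, i.e., 20%,
// right at the threshold suggested above.
const share = manualTestingShare(30, 120);
console.log('Manual testing share: ' + (share * 100).toFixed(1) + '%');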

Test Metrics: Use Them Wisely

I started this post by claiming that metrics are vital in software development, as well as in business, sports, and so on. Don’t let this section’s title mislead you: I haven’t changed my mind while writing the post. But I couldn’t wrap up without mentioning an important caveat about metrics: although they’re useful, they can be used in ways that bring more harm than benefit.

To understand why that’s the case, consider the following quote: “When a measure becomes a target, it ceases to be a good measure.” This saying is known as Goodhart’s Law, named after the economist Charles Goodhart. What does that mean? Put simply, when you pick a metric and use it as the deciding factor for some policy, the metric inevitably becomes useless, because people will game it. Imagine the head of a support department decided to tie employees’ bonuses to the number of support tickets they closed in a given period. Well, it’s not hard to imagine what would happen.

Now, imagine a similar scenario in QA. Someone in an organization turns something like “total number of unit test cases” into a target, rather than just a metric. It’s now a goal to be reached. What will most likely happen is that unnecessary unit tests are going to be created in order to inflate the number and reach the goal.

So, in short: resist the urge to turn metrics into goals, and you’ll reap all of their benefits without the drawbacks.

Carlos Schults is the author of this post. Carlos is a .NET software developer with experience in both desktop and web development, and he’s now trying his hand at mobile. He has a passion for writing clean and concise code, and he’s interested in practices that help you improve app health, such as code review, automated testing, and continuous build.