Even if numbers don't lie, they can still be easily misinterpreted


Let's start with an obvious example (example 1):

  1. Virus A has an average fatality rate of 10% (1 death per 10 infections on average)
  2. Virus B has an average fatality rate of 1% (1 death per 100 infections on average)

Which virus is more dangerous to the majority?

If you think the answer must always be virus A, then you're probably prone to misinterpreting numbers, because in this case you're passing judgment with too little information.

What if I give you their infection rates as well?

  1. Virus A has an average infection rate of 2 per week (every infected individual infects 2 previously uninfected ones per week on average)
  2. Virus B has an average infection rate of 5 per week (every infected individual infects 5 previously uninfected ones per week on average)

First, let's do some math on the estimated death numbers after 4 weeks:

  1. Virus A death numbers = 2 ^ 4 * 0.1 = 1.6
  2. Virus B death numbers = 5 ^ 4 * 0.01 = 6.25

The counterparts after 8 weeks:

  1. Virus A death numbers = 2 ^ 8 * 0.1 = 25.6
  2. Virus B death numbers = 5 ^ 8 * 0.01 = 3906.25

I think it's now clear enough that, as time progresses, the gap between the deaths caused by virus B and those caused by virus A will only keep growing. This case shows that the infection rate can easily outweigh the fatality rate when it comes to evaluating how dangerous a virus is to the majority.
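If you'd like to replay the arithmetic for other weeks, here's a minimal Python sketch. It only reproduces the same simplified formula used above (infection rate ^ weeks * fatality rate); the function name and its parameters are my own and purely illustrative, and this is in no way a real epidemiological model.

```python
def estimated_deaths(infection_rate: float, fatality_rate: float, weeks: int) -> float:
    """Estimated deaths after `weeks` weeks, using the simplified formula above."""
    infections = infection_rate ** weeks  # same exponential growth as in the example
    return infections * fatality_rate


for weeks in (4, 8, 12):
    deaths_a = estimated_deaths(2, 0.10, weeks)  # virus A: 10% fatality, rate 2
    deaths_b = estimated_deaths(5, 0.01, weeks)  # virus B: 1% fatality, rate 5
    print(f"week {weeks}: A = {deaths_a:.2f}, B = {deaths_b:.2f}, B / A = {deaths_b / deaths_a:.1f}")
```

The B / A ratio goes from roughly 3.9 at week 4 to roughly 152.6 at week 8, which is the widening gap described above.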

Of course, this alone doesn't mean that virus B must be more dangerous to the majority. It's just a small, simple example of how numbers can be misinterpreted, because judging from a single metric alone is usually dangerous.

Now let's move on to a more complicated and convoluted example (example 2):

  1. Country A, with 1B people, has 1k confirmed cases of virus C, 10 months after its 1st confirmed case of that virus
  2. Country B, with 100M people, has 100k confirmed cases of virus C, 1 month after its 1st confirmed case of that virus

Which country performed better in controlling the infections of virus C so far?

Now there are 3 different yet interrelated metrics for each country, so the problem of judging from a single metric is gone in this example. This time, you may therefore think it's safe to assume that country A must have performed better in controlling the infections of virus C so far.

Unfortunately, you're likely being fooled again, especially once I give you the number of tests for virus C that each country has performed on its own population:

  1. Country A: 10k tests for virus C performed
  2. Country B: 10M tests for virus C performed

This metric for both countries, combined with the other metrics, reveals 2 new facts that point to the opposite judgment:

  1. On average, country A has performed tests for virus C at a monthly rate of just 10k / 10 / 1B = 0.0001% of its population, while country B's monthly rate is 10M / 1 / 100M = 10%
  2. In country A, 1k / 10k = 1 out of every 10 tests is positive on average, while in country B it's 100k / 10M = 1 out of 100
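The 2 derived figures above can be recomputed with a small Python sketch. The class and method names here are hypothetical, made up just for this illustration, and the numbers are only the ones given in example 2.

```python
from dataclasses import dataclass


@dataclass
class CountryStats:
    name: str
    population: int
    confirmed_cases: int
    tests: int
    months: int  # months since the 1st confirmed case

    def monthly_tests_per_capita(self) -> float:
        """Average number of tests per month, relative to the population."""
        return self.tests / self.months / self.population

    def positivity_rate(self) -> float:
        """Share of performed tests that came back as confirmed cases."""
        return self.confirmed_cases / self.tests


country_a = CountryStats("A", 1_000_000_000, 1_000, 10_000, 10)
country_b = CountryStats("B", 100_000_000, 100_000, 10_000_000, 1)

for country in (country_a, country_b):
    print(
        f"Country {country.name}: "
        f"monthly tests per capita = {country.monthly_tests_per_capita():.4%}, "
        f"positivity rate = {country.positivity_rate():.0%}"
    )
```

Seen this way, country A has tested very little and finds infections in a much larger share of its tests, which is exactly what the raw case counts hide.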

So, while this still doesn't imply for certain that country B must have performed better in controlling the infections of virus C so far, this example aims to show that even a set of different yet interrelated metrics isn't always safe from misinterpretation.

So, why can numbers be misinterpreted so easily? At the very least, because numbers without context are usually ambiguous or even meaningless, and realizing that some context is missing generally demands relevant knowledge.

For instance, in example 2, if you don't know the importance of the number of tests, it'd be hard for you to realize that even the other 3 metrics combined still don't form a complete context. And if most people around the world don't know that, some countries can simply minimize the number of tests they perform for virus C, so that their numbers make them look like they've been performing incredibly well in controlling its infections so far. In other words, numbers without context can also enable cheating by being misleading rather than by outright lying.

Sometimes, the context will remain incomplete even when you have all the relevant numbers, because some contexts contain important details that are very hard to quantify. So when it comes to relevant knowledge, knowing those details is crucial as well.

Let's consider this example (example 3) of a team of 5 employees who are supposed to handle the same set of support tickets every day, none of whom receives any overtime compensation (in fact, working overtime is perceived as incompetence there):

  1. Employees A, B, C and D actually work the expected 40-hour work week every week, and each of them handles 20 support tickets (all handled properly) per day on average
  2. Employee E actually works an 80-hour work week on average instead of the expected 40, and he/she handles 10 support tickets (all handled properly) per day on average
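To make the apparent gap concrete, here's a rough Python sketch of the tickets handled per hour worked. The 5-day work week is my own assumption (the example doesn't state it), and the function name is purely illustrative.

```python
def tickets_per_hour(tickets_per_day: float, hours_per_week: float, days_per_week: int = 5) -> float:
    """Tickets handled per hour worked, assuming hours are spread evenly over the work days."""
    hours_per_day = hours_per_week / days_per_week  # days_per_week = 5 is an assumption
    return tickets_per_day / hours_per_day


rest_of_team = tickets_per_hour(20, 40)  # employees A, B, C and D
employee_e = tickets_per_hour(10, 80)    # employee E

print(f"A-D: {rest_of_team} tickets/hour, E: {employee_e} tickets/hour")
print(f"apparent gap: {rest_of_team / employee_e}x")
```

On these numbers alone, employee E looks about 4 times less productive per hour than everyone else on the team, and that ratio doesn't depend on the assumed number of work days.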

Does this mean employee E is far from being on par with the rest of the team? If you think the answer must always be yes, then I'm afraid you've yet again misused those KPIs, because in this case, the missing context at least includes the average difficulty of the support tickets handled by those employees, and such difficulty is generally very hard to quantify.

You may think that, as all 5 employees are supposed to handle the same set of support tickets, the difference in ticket difficulty alone shouldn't cause such a big gap in apparent productivity between employees A, B, C and D on one side and employee E on the other.

But what if I tell you that the former 4 employees have been taking only the easiest support tickets since day 1, and all the hardest ones are always taken by employee E, because the internal reporting mechanisms against such workplace bullying are effectively dysfunctional, and employee E is especially vulnerable to such abuse?

Again, whether that team is really that toxic is also very hard to quantify, so in this case, even if you have all the relevant KPIs on employee performance, those KPIs as a set can still be very misleading when they're used on their own to judge that performance.

Of course, example 3 is most likely an edge case that shouldn't happen, but that doesn't mean such edge cases will never appear.

Unfortunately, many of those who use KPIs to pass judgment act as if those edge cases could never exist under their management, and even when they do exist, those managers still behave as if the edge cases themselves are to blame, possibly all for the sake of illusory effectiveness and efficiency.

To be blunt, this kind of "effectiveness and efficiency" is really just pushing complexity that should be at least partially handled by those managers onto the edge cases themselves, causing the latter to suffer far more than they already do without that extra complexity being forced onto them.

While such use of KPIs does make managers and the common cases much more effective and efficient, it comes at the cost of sacrificing the edge cases, and the most dangerous part of all is that, too often, many of those managers and common cases don't even know that's what they've been doing for ages.

Of course, this world isn't yet capable of being that ideal, so sometimes misinterpreting the numbers might be a necessary or lesser evil, because occasionally the absolute minimum required effectiveness and efficiency can only be achieved by sacrificing a small number of edge cases. But at the very least, those who use KPIs that way should really know what they're doing, and make sure they make such sacrifices only when they have to.

So, on one hand, judging by numbers alone can easily lead to utterly wrong judgments without you even knowing it, while on the other hand, judging only with the full context isn't always feasible, practical or realistic. A working compromise between these 2 extremes should therefore be found on a case-by-case basis.

For instance, you can first form a set of educated hypotheses based on the numbers, then try to both prove and disprove those hypotheses (both sides must be worked on). Meanwhile, if you have to, you can act upon them as long as they haven't been proven wrong yet, while always keeping in mind that they can all be dead wrong, and planning contingencies so you can fix any problems immediately.

With such a compromise, effectiveness and efficiency can be largely preserved when those hypotheses hold, because you're not delaying your judgments too much, and the damage caused when they're wrong can be largely controlled and contained, because you'll be able to realize and correct your mistakes as quickly as possible.

For instance, in example 3, while it's reasonable to form the hypothesis that employee E is indeed far from being on par with the rest of the team, you shouldn't just act on those numbers directly. You should also have a personal meeting with that employee as soon as possible, so you can express your concerns about those metrics to him/her and hear his/her side of the story. That can be very useful for proving or disproving your hypothesis, and it lets both of you solve the problem together in a more informed manner.