# Don’t Torture Data, Ask Nicely

Today at Trusted Conf. we’re talking about data torture and why we don’t torture data but ask nicely instead, with Roberta Cardoso, Technical Data Analyst.

Watch the video below to get the full insights into why we don’t torture data.

View the slides:


First a bit about Roberta:

HI, I’M BETA…

- Brazilian
- Data Analyst
- Passionate about Project Management for Big Data
- Mother of a sweet 10yo girl
- Tutor of 3 dogs and 6 cats
- Balancing nature and technology in everyday life

## Agenda

If you want reliable confessions, don’t torture data, ask nicely.

- Data torture
- Types of data torture
- Are you forcing confessions?
- Clues to data torture

## Data Torture

If you torture the data long enough, it will confess to anything.

– Darrell Huff

- The first time I read something similar to this quote was on the website of a Data Science Institute, which used it as its motto: “*We torture data until it confesses*”.
- At first, it made total sense to me, because I’m a Data Analyst and my job is to get answers from the data.
- I Googled “*data torture*” and found the root of this quote in a 1954 book, “*How to Lie With Statistics*”, in which the author picks apart how marketers manipulate statistics and data visualization to trick the public.
- At that point, I was relieved that I hadn’t changed my LinkedIn headline from Data Analyst to Data Torturer.
- It became evident to me that data torturing is less about answering questions and more about forcing confessions of whatever the torturer wants to prove.

- Like other forms of torture, data torturing done skillfully won’t leave incriminating evidence.
- So the unfortunate result of torturing data is getting anything but the truth.
- In short, data torturing is ethically problematic because neither the reported data nor the explanations or hypotheses the data torturer offers are trustworthy.

## Types of Data Torture

In 1993, Dr. James Mills published the article “Data Torturing” in The New England Journal of Medicine, where he describes two types of data torture:

- Opportunistic
- Procrustean

### Opportunistic Data Torture

**Opportunistic:** Pores over the data until a “significant” association is found and then devises a plausible hypothesis to fit the association.

Opportunistic data torture makes it very hard for readers to tell that the positive association didn’t spring from an a priori hypothesis.

*When many independent tests are performed, the probability of a correct conclusion drops drastically.*

**Example:**

If the CvR (conversion rate) for a current ad and its creative variation differed by 5% or 10%, how would we know whether the difference was due to chance?

By a somewhat arbitrary convention, a result is considered not due to chance if the p-value is less than 0.05, which means there is a 5% chance of concluding that the two ads differ when they actually don’t, and a 95% probability of correctly inferring that there is no difference between them.

The problem is that when many independent tests are performed, that probability drops drastically: if we run 20 tests, the probability that every conclusion is correct falls to about 36% (0.95^20 ≈ 0.36).
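The drop can be checked directly. A minimal sketch, where the 20-test setup matches the talk and the simulation details are illustrative:

```python
import random

# Single test at alpha = 0.05: 95% chance of correctly reporting
# "no difference" when there truly is none.
# With 20 independent tests, *all* of them must avoid a false positive:
p_all_correct = 0.95 ** 20
print(round(p_all_correct, 2))  # → 0.36

# Monte Carlo check: simulate 20 null tests, each with a 5% chance of a
# spurious "significant" result, and count runs with at least one false alarm.
random.seed(1)
trials = 10_000
false_alarm_runs = sum(
    1 for _ in range(trials)
    if any(random.random() < 0.05 for _ in range(20))
)
print(false_alarm_runs / trials)  # roughly 1 - 0.36 = 0.64
```

In other words, at 20 comparisons the data are more likely than not to “confess” at least once even when there is nothing to find.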

### Procrustean Data Torture

**Procrustean:** Decides on the hypothesis to be proved, then makes the data fit the hypothesis.

It may produce results that are seen as definitive proof of the hypothesis.

*It’s more difficult to carry out than opportunistic data torturing because it requires selective reporting, but its results are often more believable.*

It can take several forms:

- Exposure may be redefined in a way that strengthens the association. For example, one study credited SEO improvements for a notable uplift in a website’s organic-traffic CTR, using an exposure window that began 60 days before the intervention; that inappropriately extended period produced a positive result by including unknown interventions unrelated to the tested optimization.
- Study pages whose results don’t support the hypothesis may be dropped.

There is a chance of doing this unintentionally.

- Can you see how easy it is to slip from one impression to something quite different by making different approaches to interpreting the data?
- Data torture simply reflects the fact that if you keep coming at the data from different angles, you can get a whole range of answers, and there is also a chance you are doing this unintentionally.

### Comparing a Current Value to an Average or Target Value

*Source: Torturing Data and the Misuse of Statistical Tools*

- A common method for analyzing data is to compare a current value to an average or target value.
- This form of data torturing **may lead to acting on a perceived difference when none really exists**. If one had instead recognized the statistical-thinking principle that all processes are variable, **the simple comparison to an average would not have provided a satisfactory basis for decision making.**
- Comparisons to averages, specifications, and targets **ignore common variability and treat every fluctuation as something special**.
- Another example of the dangers of this type of analysis: when a website’s CTR is plotted over time and one monthly percentage is double the overall average, a “red flag” is raised that ignores the time progression.
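A quick way to see this failure mode, with hypothetical numbers (the stable-CTR process and the naive flagging rule below are illustrative, not from the talk):

```python
import random

# Hypothetical stable process: the true CTR never changes; months differ
# only by random noise.
random.seed(0)
monthly_ctr = [2.0 + random.gauss(0, 0.3) for _ in range(24)]  # 24 months, in %
avg = sum(monthly_ctr) / len(monthly_ctr)

# Naive rule: flag any month below its own long-run average as a "problem".
flagged = [m for m in monthly_ctr if m < avg]
print(f"{len(flagged)} of 24 months flagged")  # roughly half, by construction

# Nothing changed in the process, so every flag here is common-cause
# variability being treated as something special.
```

About half of all months will always sit below the average of a stable process, so acting on each of those “differences” guarantees wasted interventions.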

### While Analysing Percentages for Small Samples

- Analysis and display methods using percentage data are also susceptible to data torturing. Because many of these methods do not incorporate a context of common variation, they may encourage improper interpretations of variability.
- Inspection of Figure 1 may lead one to conclude, from a reliability standpoint, that the end of lifetime has been found, with the critical age for increased failures at approximately twenty-six years. However, if one examines the number of available samples in each age category, a different conclusion may be reached.
- Analysis and display of percentage data need a context of common variability for proper interpretation.
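A tiny numeric sketch of the small-sample trap (the age-26 cutoff echoes the figure described above; the observed counts themselves are hypothetical):

```python
# Hypothetical failure data by age category: (units observed, failures).
observed = {
    24: (100, 5),
    25: (80, 4),
    26: (2, 1),  # tiny sample produces a wild percentage
}
for age, (n, failures) in observed.items():
    pct = 100 * failures / n
    print(f"age {age}: {pct:.0f}% failures (n={n})")
# Age 26 shows 50% versus 5% elsewhere, but the jump rests on n=2;
# percentage displays without their sample sizes invite this misreading.
```

A single failure out of two units doubles nothing about the underlying process; it only exposes how volatile percentages are at small n.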

### Performing Trend Analysis

- Trend analysis is often misused to make decisions **with either limited data points or with inadequate knowledge about the process that created the data.**
- This practice may result in data torturing **by wrongly identifying the type of trend or by leading one to conclude that a trend exists when it doesn’t.**
- Consider the data points shown in Figure #1. It is difficult, if not impossible, to formulate a meaningful interpretation from only three data points without a broader context, yet it is all too common for such data to be labeled an “upward trend”.
- In Figures #2 and #3, the last three data points in each run chart **are the same points as those given in Figure #1**.
- Making decisions and taking action on perceived trends from limited data points **will almost surely result in data torturing by either underreacting or overreacting**.
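How often does pure noise look like an “upward trend” over three points? By symmetry, a strictly increasing ordering is one of 3! = 6 equally likely orderings, so about one sixth of the time; a quick simulation (details illustrative) confirms it:

```python
import random

random.seed(42)
trials = 10_000
upward = 0
for _ in range(trials):
    # Three independent noise-only observations: no real trend at all.
    a, b, c = random.random(), random.random(), random.random()
    if a < b < c:  # strictly increasing, i.e. read as an "upward trend"
        upward += 1
print(upward / trials)  # close to 1/6 ≈ 0.167
```

So roughly one in six noise-only run charts would be mislabeled a trend by this reading, before any real process change is involved.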

## Clues to Data Torture

Data torturing can rarely be proved. There are, however, clues that should arouse the reader’s suspicion. In conclusion, here are some of Mills’ recommendations for assessing allegedly statistically significant findings:

- Did the reported findings result from testing a primary hypothesis or an a posteriori hypothesis?
- Does the hypothesis have good supporting data from previous studies?
- Does it use theoretical insights and an examination of previously reported data?
- Have data been reported for all groups in the study, or were certain groups excluded from the analysis, and if so, why?
- Is the reported finding consistent across a wide range of values, or does it only apply to a selected group?
- Are the cutoff points for laboratory studies reasonable and justifiable or are they selected because they allow the results to be significant?
- Was the effect of multiple comparisons discussed and statistically managed?
- How many significant results were reported relative to the number of comparisons made?
- Was the research outcome defined before collecting the data?

### How to Avoid Forced Confessions

- These shortcomings make evident the importance of applying statistical thinking even when using basic statistical tools.
- As repeatedly shown, failure to consider the processes, variation, and data within the mindset of statistical thinking can result in faulty decisions and actions.

In summary, because statistical thinking requires a focus on the process, the application of the associated concepts will increase the effectiveness of statistical tools and help to prevent data torturing.

Don’t torture data, ask nicely.