Most evidence in the empirical sciences is statistical in nature, and scientists rely on a variety of statistical tests to distinguish valid scientific discoveries from spurious ones. Unfortunately, there is a growing recognition that many important research findings based on statistical evidence are not reproducible, raising the question of whether there is a gap between what these statistical tests ensure and the way they are used.
While there are many reasons that research findings may be non-reproducible, this work focuses on just one: interactivity. Existing statistical methods assume that the procedure used to analyze the data is fixed before the data is collected, as when a regression analysis is performed on a predetermined set of variables. In practice, however, the methods used to analyze a dataset are often chosen based on previous interactions with the same dataset, as when the same dataset is used first to select variables and then to perform a regression analysis on them. Analyzing a fixed dataset in this interactive manner is known to lead to spurious conclusions, even when each procedure is statistically sound in isolation.
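The effect of reusing a dataset for both selection and analysis can be illustrated with a small simulation. The sketch below is an illustration under stated assumptions, not a method taken from this work: every feature and the response are independent Gaussian noise, so no real relationship exists, and the absolute sample correlation with a single selected feature stands in for the regression step. The function names, the sample size `n = 50`, the feature count `d = 200`, and the 20 repetitions are all hypothetical choices.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def noise_dataset(rng, n, d):
    """Draw d feature columns and one response, all independent N(0, 1)."""
    features = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(d)]
    response = [rng.gauss(0, 1) for _ in range(n)]
    return features, response


def run_trial(rng, n=50, d=200):
    """Select a feature on one dataset, then score it on the same and on fresh data."""
    # Step 1 (adaptive): choose the feature that looks best on this dataset.
    features, response = noise_dataset(rng, n, d)
    scores = [abs(pearson(col, response)) for col in features]
    best = max(range(d), key=scores.__getitem__)

    # Step 2 on the SAME data: the score that made the feature win.
    reused = scores[best]

    # Step 2 on FRESH data: same feature index, new independent sample.
    fresh_features, fresh_response = noise_dataset(rng, n, d)
    fresh = abs(pearson(fresh_features[best], fresh_response))
    return reused, fresh


rng = random.Random(0)  # fixed seed so the run is repeatable
trials = [run_trial(rng) for _ in range(20)]
avg_reused = sum(r for r, _ in trials) / len(trials)
avg_fresh = sum(f for _, f in trials) / len(trials)
print(f"mean |r| of selected feature, same data:  {avg_reused:.2f}")
print(f"mean |r| of selected feature, fresh data: {avg_fresh:.2f}")
```

Because the data are pure noise, any apparent association is spurious by construction. The selected feature should nonetheless look far more strongly correlated on the data used to choose it than on an independent sample, even though computing a correlation is a sound procedure when the feature is fixed in advance.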