A lesson from R-fortunes: not all science is good science.

“It is becoming apparent that you do not know how to use the results from either system. The progress of science would be safer if you get some advice from a person that knows what they are doing.”

-- David Winsemius (in response to a user who obtained different linear regression results in R and SPSS and wanted to know which one to use), R-help (July 2011)

I can always count on my fortunes R package for a good laugh (especially at the expense of SPSS users). However, this post raises an interesting point about the misuse of statistics.

First, let me digress. Before taking undergraduate coursework in psychology, I didn’t know much about the way people acted. After some undergraduate classes, I knew everything about the inner workings of the mind. I knew that priming people with words stereotypically associated with old age reduced their walking speed (Bargh, Chen, & Burrows, 1996), that the Implicit Association Test (IAT; Greenwald et al., 2002) measured meaningful unconscious attitudes, that narcissism was associated with using more first-person pronouns (Raskin & Shaw, 1988), etc. It wasn’t until several years of graduate school, advanced statistical training, reading some meta-research, and a visit from the replication police that I realized (a) that findings are never as clear-cut as they seem and (b) that all of these findings have been called into question (priming: Doyen, Klein, Pichon, & Cleeremans, 2012; pronouns: Carey et al., 2015; IAT: Blanton et al., 2009). Further reading reveals p-hacking (Simonsohn, Nelson, & Simmons, 2014), incredibility indices (Schimmack, 2012), and the claim that most published findings may be false (Ioannidis, 2005).

I hope this digression illustrates the point that a little knowledge and a false sense of understanding can be dangerous. A novice statistician who keeps running participants until the hypothesis test comes out statistically significant might not realize that doing so can inflate the Type I error rate to 20%, despite a nominal p < .05 test (Sherman, 2014), yet those findings get published.
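To make the optional-stopping problem concrete, here is a minimal simulation sketch in Python. It is not Sherman's phack function. It assumes a deliberately simplified setup: a one-sample z-test with a known standard deviation of 1, a true null effect, and a researcher who peeks after every 5 added participants between n = 20 and n = 100. The function names and the stopping rule are my own illustrative choices, and the exact false positive rate depends heavily on them.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def p_value_z(sample, sigma=1.0):
    """Two-sided p-value for H0: mean = 0, assuming a known sigma."""
    z = fmean(sample) / (sigma / sqrt(len(sample)))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def run_study(rng, start_n=20, step=5, max_n=100, alpha=0.05, peek=True):
    """Simulate one study where the null is true.

    Returns True if the study ends with a 'significant' result,
    which under a true null is a false positive.
    """
    sample = [rng.gauss(0, 1) for _ in range(start_n)]
    if not peek:
        # Honest design: collect the full sample, then test exactly once.
        sample += [rng.gauss(0, 1) for _ in range(max_n - start_n)]
        return p_value_z(sample) < alpha
    # Optional stopping: test, and keep adding participants until
    # the result is significant or the maximum sample size is reached.
    while True:
        if p_value_z(sample) < alpha:
            return True
        if len(sample) >= max_n:
            return False
        sample += [rng.gauss(0, 1) for _ in range(step)]


def false_positive_rate(n_studies=2000, peek=True, seed=1):
    """Share of simulated null studies that reach p < alpha."""
    rng = random.Random(seed)
    hits = sum(run_study(rng, peek=peek) for _ in range(n_studies))
    return hits / n_studies


if __name__ == "__main__":
    print("fixed n:          ", false_positive_rate(peek=False))
    print("optional stopping:", false_positive_rate(peek=True))
```

With a fixed sample size the false positive rate should land near the nominal 5%, while the peeking design should come out well above it, even though every individual test uses p < .05.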

This brings me back to the original (humorous) quote from my R-fortunes package. Misuse and misunderstanding of analyses are some of the reasons that so many findings across many scientific disciplines do not replicate (Freedman, Cockburn, & Simcoe, 2015). I think the ‘take away’ from this ‘fortune’ (and blog post) is that statistics are often misused and abused, sometimes knowingly and other times unwittingly. The scientific process is slow and self-correcting, but not perfect. Published papers are not necessarily error free. Interpret analyses cautiously. Interpret the research of others cautiously. Most importantly, use R, not SPSS.

References

Bargh, J. A., Chen, M., & Burrows, L. (1996). Automaticity of social behavior: Direct effects of trait construct and stereotype activation on action. Journal of personality and social psychology, 71(2), 230.

Blanton, H., Jaccard, J., Klick, J., Mellers, B., Mitchell, G., & Tetlock, P. E. (2009). Strong claims and weak evidence: reassessing the predictive validity of the IAT. Journal of Applied Psychology, 94(3), 567.

Carey, A. L., Brucks, M. S., Küfner, A. C. P., Holtzman, N. S., große Deters, F., Back, M. D., Donnellan, M. B., Pennebaker, J. W., & Mehl, M. R. (2015, March 30). Narcissism and the Use of Personal Pronouns Revisited. Journal of Personality and Social Psychology. Advance online publication. http://dx.doi.org/10.1037/pspp0000029

Doyen, S., Klein, O., Pichon, C.-L., & Cleeremans, A. (2012). Behavioral Priming: It’s All in the Mind, but Whose Mind? PLoS ONE, 7(1), e29081. doi:10.1371/journal.pone.0029081

Freedman, L. P., Cockburn, I. M., & Simcoe, T. S. (2015). The Economics of Reproducibility in Preclinical Research. PLoS Biology, 13(6), e1002165. doi:10.1371/journal.pbio.1002165

Greenwald, A. G., Banaji, M. R., Rudman, L. A., Farnham, S. D., Nosek, B. A., & Mellott, D. S. (2002). A unified theory of implicit attitudes, stereotypes, self-esteem, and self-concept. Psychological review, 109(1), 3.

Ioannidis, J. P. (2005). Why most published research findings are false. Chance, 18(4), 40-47.

Raskin, R., & Shaw, R. (1988). Narcissism and the use of personal pronouns. Journal of Personality, 56, 393–404. http://dx.doi.org/10.1111/j.1467-6494.1988.tb00892.x

Simonsohn, U., Nelson, L. D., & Simmons, J. P. (2014). P-curve: A key to the file-drawer. Journal of Experimental Psychology: General, 143(2), 534.

Schimmack, U. (2012). The ironic effect of significant results on the credibility of multiple-study articles. Psychological Methods, 17(4), 551.

Sherman, R. A. (2014). phack: An R Function for Examining the Effects of p-hacking. Retrieved from http://rynesherman.com/blog/phack-an-r-function-for-examining-the-effects-of-p-hacking/
