Monday, August 21, 2017

Andrew Gelman — Publish your raw data and your speculations, then let other people do the analysis: track and field edition

There seems to be an expectation in science that the people who gather a dataset should also be the ones who analyze it. But often that doesn’t make sense: what it takes to gather relevant data has little to do with what it takes to perform a reasonable analysis. Indeed, the imperatives of analysis can even impede data-gathering, if people have confused ideas of what they can and can’t do with their data.
I’d like us to move to a world in which gathering and analysis of data are separated, in which researchers can get full credit for putting together a useful dataset, without the expectation that they perform a serious analysis. I think that could get around some research bottlenecks.
It’s my impression that this is already done in many areas of science—for example, there are public datasets on genes, and climate, and astronomy, and all sorts of areas in which many teams of researchers are studying common datasets. And in social science we have the NES, GSS, NLSY, etc. Even silly things like the Electoral Integrity Project—I don’t think these data are so great, but I appreciate the open spirit under which these data are shared....
Statistical Modeling, Causal Inference, and Social Science
Andrew Gelman | Professor of Statistics and Political Science and Director of the Applied Statistics Center, Columbia University
