Showing posts with label statistics.

Monday, September 30, 2019

That study about the health risks of red meat: An excellent news report — Andrew Gelman


Most interesting for its analysis of the process of statistical reasoning and news reporting, which happens to be the bedrock of economics and of the narratives based on conventional economics. Short and not wonkish.

Statistical Modeling, Causal Inference, and Social Science
That study about the health risks of red meat: An excellent news report
Andrew Gelman | Professor of Statistics and Political Science and Director of the Applied Statistics Center, Columbia University

Friday, September 6, 2019

It’s not just p=0.048 vs. p=0.052 — Andrew Gelman


Numbers may be eternal and unchanging, but they are not gods. Andrew Gelman observes that we also have to step back and use our common sense about what the numbers actually say, instead of drawing arbitrary lines based on self-imposed criteria like "significance" and then taking them as "messages from the gods." It doesn't work like that. Formalism only goes so far.
So. Yes, it seems goofy to draw a bright line between p = 0.048 and p = 0.052. But it’s also goofy to draw a bright line between p = 0.2 and p = 0.005. There’s a lot less information in these p-values than people seem to think. [Just about everyone would say that p = 0.2 is insignificant and p = 0.005 is significant.]
So, when we say that the difference between “significant” and “not significant” is not itself statistically significant, “we are not merely making the commonplace observation that any particular threshold is arbitrary—for example, only a small change is required to move an estimate from a 5.1% significance level to 4.9%, thus moving it into statistical significance. Rather, we are pointing out that even large changes in significance levels can correspond to small, nonsignificant changes in the underlying quantities.”
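The Gelman and Stern point can be illustrated with a minimal numerical sketch (the estimates and standard errors below are illustrative, not taken from the paper):

```python
import math

def p_value(estimate, se):
    """Two-sided p-value for a normal z-test of an estimate against zero."""
    z = abs(estimate / se)
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

# Two studies of the same effect, each with standard error 10:
p_a = p_value(25, 10)               # ~0.012: "significant"
p_b = p_value(10, 10)               # ~0.317: "not significant"

# Yet the difference between the two estimates is itself unremarkable:
se_diff = math.sqrt(10**2 + 10**2)  # ~14.1
p_diff = p_value(25 - 10, se_diff)  # ~0.29: not significant
```

One study clears the 0.05 line and the other does not, yet the gap between them could easily be chance, which is exactly the point of the quoted passage.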
Moral of the story: Math and statistics are tools for human use, not messengers from heaven.

Statistical Modeling, Causal Inference, and Social Science
It’s not just p = 0.048 vs. p = 0.052
Andrew Gelman | Professor of Statistics and Political Science and Director of the Applied Statistics Center, Columbia University

Sunday, July 14, 2019

Gigerenzer: “The Bias Bias in Behavioral Economics,” including discussion of political implications — Andrew Gelman


Gerd Gigerenzer takes aim at Daniel Kahneman, Richard Thaler, and Cass Sunstein for being uncritical and going too far. While not endorsing rational choice theory, he stresses that the truth lies between the extremes of rationality and irrationality, and claims that behavioral economics tends to overemphasize irrationality arising from cognitive-affective bias. It's neither all reason nor all bias, but a combination of rationality and irrationality.

Statistical Modeling, Causal Inference, and Social Science
Gigerenzer: “The Bias Bias in Behavioral Economics,” including discussion of political implications
Andrew Gelman | Professor of Statistics and Political Science and Director of the Applied Statistics Center, Columbia University

See also by Andrew Gelman

The butterfly effect: It’s not what you think it is.

The piranha problem in social psychology / behavioral economics: The “take a pill” model of science eats itself

Monday, July 8, 2019

My Journey from Theory to Reality — Asad Zaman

Over the twenty years that I have been pursuing an Islamic approach — focusing on the production of USEFUL knowledge, I have managed to heal all three of these divides. This happens naturally, when you focus on solution of real world problems. You automatically need to combine information coming from many different specialization areas. You need to use reasoning and also intuition. You also need to use both theory and its applications to the real world experiences. This leads to substantial changes in the subject matter itself. I have applied this approach with great success to Econometrics, Statistics, Microeconomics, Macroeconomics, Experimental Economics, and even Mathematics itself. I am in the process of creating textbooks and teaching materials in all of these areas. Because my work is most advanced in the area of Statistics, I am working on putting it all together in a new course on Real Statistics: An Islamic Approach. There is a large amount of pre-existing material – lectures, texts, exercises, references – that I have created over the past decade of working on this course. However, as I progress, I keep learning new things, and this time I want to put together a polished new version of this course for public use. My primary target audience is teachers of statistics — I would like to persuade them to use this new approach to teach statistics. Those who would like to follow my progress as I gradually construct a new website on a lecture-by-lecture basis are encouraged to fill in the following Registration form. I will use emails to notify them when I complete a new lecture, and also invite feedback on what is there, so that we can build it up with clarity and consensus....
All thinking, since humans think in language, is based on context, meaning being determined by context. The shaper of context is the worldview in which the group is functioning. In the West, the contemporary worldview was shaped by Western history, chiefly Greek thought, Judaeo-Christian religion, Roman law, and modern science. Its intellectual products were shaped by the Western intellectual tradition that culminated most recently in the rise of science, which now supervenes over what preceded it. The basic assumption of the Western scientific worldview is methodological naturalism, which many if not most of its foremost exponents equate with metaphysical materialism.

This is taking place in the overarching worldview of Western liberalism that was developed in the 18th century as an antidote to theological dogmatism. Scientific naturalism and the ideal of unified scientific explanation, or consilience, replaced the great chain of being as the dominant paradigm of explanation.

Regarding social, political and economic thought, many if not most of the foremost authorities equate economic liberalism with Western capitalism as the dominant mode of production, and also view political liberalism, in the form of representative democracy, as being determined by capitalism as economic liberalism. Initially, economic liberalism implied laissez-faire and sought to replace government by the market. Subsequently, when it became clear that government was needed for institutional structure, classical economic liberalism shifted to neoliberalism, which is the view that economic and financial interests should control government and direct institutional arrangements and operations toward furthering economic interests.

While the West is still the most influential bloc worldwide, that is beginning to change. The rest of the world had accepted the assumptions on which this worldview is based, owing to the success of the West. Now many are beginning to question whether these assumptions are as robust as they seemed, as problems arise and the paradoxes of liberalism manifest.

Consequently, some of those that had previously accepted the Western stance and were also educated in it are beginning to rethink their positions in light of the traditional worldviews that prevail in their societies. Many of these traditional worldviews are embedded in religious contexts that have become cultural. Even in secular China, President Xi is resurrecting Confucius as a cultural icon, and in the supposedly secular US, dominant religious groups are asserting influence more openly, with science itself subject to challenge when it is perceived to conflict with tradition.

Asad Zaman's post is a good example of this rising trend, as well as of what a highly educated person asking such questions might do about it. This process is an iteration of the historical dialectic as liberalism and traditionalism interact to forge a complementary Zeitgeist that moves history forward a step.

What should a "good" liberal think about this? Freedom of thought and expression are fundamental to liberalism, and this implies tolerance. So the answer is given by none other than Mao Tse-Tung: "Let a hundred flowers bloom."

Asad Zaman makes one other point worth sharing for those that may not choose to read his post in full.
Sometime during this process of switching from teaching theory to teaching how to solve real world problems, I came across the “Statistics” textbook of David Freedman. This textbook actually implemented exactly this idea that I had come to believe in — do statistics in the context of solving real world problems. One amazing characteristic of this textbook is that it has no mathematical formula – ZERO. Freedman explained that students use formulae as crutches that prevent them from thinking. So he explains all concepts in words only, exactly the same insight that I had learnt on my own. Formulas teach you techniques for calculation. We don’t need these techniques — leave them to the computer. We need to UNDERSTAND what these calculations mean. That is a VERY DIFFERENT process. I got involved in an email correspondence with David Freedman, who had a very similar experience to mine. He had started out as a very heavily mathematically oriented researcher. His early papers are all very heavy mathematically. Later, when he got involved in doing some testimony in real world court cases, he realized that all of the theory he had learnt was useless in the real world. This is because the assumptions we make in theory are almost always false in the real world. Then he had to learn how to do real world statistics, exactly as I have had to do. Since most fancy assumptions we make in statistics and econometrics are wrong, we need to learn how to do simple and basic inferences, which actually makes life much easier for students of the subject — we need to teach them basic and intuitive things, not complex models and math....
An Islamic Worldview
My Journey from Theory to Reality
Asad Zaman | Vice Chancellor, Pakistan Institute of Development Economics and former Director General, International Institute of Islamic Economics, International Islamic University Islamabad

Saturday, June 1, 2019

Timothy Taylor — Pareidolia: When Correlations are Truly Meaningless

"Pareidolia" refers to the common human practice of looking at random outcomes but trying to impose patterns on them. For example, we all know in the logical part of our brain that there are roughly a kajillion different variables in the world, and so if we look through the possibilities, we will have a 100% chance of finding some variables that are highly correlated with each other. These correlations will be a matter of pure chance, and they carry no meaning. But when my own brain, and perhaps yours, sees one of these correlations, I can feel my thoughts start searching for a story to explain what looks to my eyes like a connected pattern.…
Classes in statistics emphasize that "correlation doesn't mean causation." The lesson here is even stronger. Correlation doesn't necessarily mean anything at all.
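Taylor's point is easy to reproduce in simulation: scan enough unrelated random series and a strong correlation turns up by chance alone. A quick sketch (the series count, length, and seed are arbitrary choices):

```python
import itertools
import random

random.seed(1)

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# 50 completely unrelated random series, 20 observations each
series = [[random.gauss(0, 1) for _ in range(20)] for _ in range(50)]

# Scan all 1225 pairs for the strongest apparent "pattern"
best = max(abs(pearson(x, y))
           for x, y in itertools.combinations(series, 2))
# With this many pairs, a correlation exceeding the conventional
# 5% cutoff (~0.44 for n = 20) almost always appears by chance.
```

Every series here is pure noise, yet the scan virtually guarantees an impressive-looking correlation, which is exactly the pareidolia trap.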
Conversable Economist
Pareidolia: When Correlations are Truly Meaningless
Timothy Taylor | Managing editor of the Journal of Economic Perspectives, based at Macalester College in St. Paul, Minnesota

Saturday, February 9, 2019

Andrew Gelman — Our hypotheses are not just falsifiable; they’re actually false.


On the practical side of philosophy of science. Adding nuance to Karl Popper on falsification.

Further argument for the view that theories are useful but not "true." This may seem to contradict the realist view that theories are general descriptions of causal relationships. But I don't think that is what is implied. Rather, useful theories can be viewed as fitting the data because they reveal underlying structures that are not observed directly but only indirectly.

There is often a tendency to transfer simple analogies to complicated and complex situations and events. Some causal relationships are observable, as in a hammer driving a nail, with physical theory explaining it in terms of simple variables related in a function.

But most interesting issues are much more complicated and nuanced, and may be complex, e.g., subject to emergence owing to synergy. There may be a constellation of factors involved, and these may be difficult to order in a hierarchy. Some factors may be catalysts that are necessary for an operation but do not themselves enter into it. These may be presumptions that function as hidden assumptions.

In addition, statistics is by definition "inexact" in that it deals with probabilities, unlike deterministic functions in which the variables are all known and measurable, and are expressible in terms of a simple function.

While physics is mostly tractable other than at the edges, the life sciences are less so, and the social sciences and psychology even less. Economics combines social science and psychology, especially macroeconomics and political economy. Economic sociology and economic anthropology take this into account, and global economic history also demonstrates it.

This is coming to the fore now as some critics of MMT, the Green New Deal, and "socialism" demand to see data-based models that "prove" proposed solutions have worked in the past. Of course, the record is important, but the demand for "proof" requires a degree of stringency that is not applied in social science and psychology because it is unattainable. Nor is this standard applied to conventional economics, its econometric approach being based on formalism rather than empirics.

Another important point that Andrew Gelman makes is the futility of pitting theories against each other. That is a recipe for disagreement in that the party that determines the framing wins. Whose assumptions are going to set the criteria? Why?
And, no, I don’t think it’s in general a good idea to pit theories against each other in competing hypothesis tests. Instead I’d prefer to embed the two theories into a larger model that includes both of them.
This is a good suggestion, but it is general. Often the disagreement is over fundamental criteria that determine a frame of reference. This should be obvious in the different approaches to economic theory and economic practice, e.g., econometric and institutional, static and dynamic, simple and complex, natural and historical.

Obviously, a short post like this can only suggest matters that need deeper reflection, open inquiry, and sincere debate aimed at solutions to pressing design problems. This is no longer just "theoretical." Humanity has to get this right to survive, let alone prosper. We have seemingly dug ourselves into a hole based on policy that has turned out to be impractical in the extreme, such as socializing negative externalities that have led to environmental degradation and threaten ecological collapse if not addressed successfully in a timely fashion. So, let's get with it.

Statistical Modeling, Causal Inference, and Social Science
Our hypotheses are not just falsifiable; they’re actually false.
Andrew Gelman | Professor of Statistics and Political Science and Director of the Applied Statistics Center, Columbia University

Tuesday, November 20, 2018

Daniel Hruschka — You Can't Characterize Human Nature If Studies Overlook 85 Percent Of People On Earth


Non-random sampling.
… a nonrandom sample tells us about a population, but we don’t know how precisely: we can’t determine a margin of error or a confidence level.
A lot of mistakes come from generalizing special cases. This tendency to overgeneralize, along with the tendency to absolutize, often infects the formulation of assumptions in "scientific" modeling.
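The quoted point can be made concrete: for a simple random sample the margin of error follows directly from the sample size, while for a nonrandom sample no such formula exists. A sketch (the 1,000-person poll is a made-up example):

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# A random-sample poll of 1,000 people gives the familiar "+/- 3 points"
moe = margin_of_error(0.5, 1000)  # ~0.031

# For a nonrandom (convenience) sample there is no analogous formula:
# the sampling distribution is unknown, so precision cannot be quantified.
```

The formula exists only because random sampling pins down the sampling distribution; WEIRD convenience samples of undergraduates offer no such guarantee.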

econintersect
You Can't Characterize Human Nature If Studies Overlook 85 Percent Of People On Earth
Daniel Hruschka | Professor and Associate Director of the School of Human Evolution and Social Change, Arizona State University

Thursday, September 13, 2018

Andrew Gelman — N=1 survey tells me Cynthia Nixon will lose by a lot (no joke)


One way that heuristic thinking works.

Statistical Modeling, Causal Inference, and Social Science
N=1 survey tells me Cynthia Nixon will lose by a lot (no joke)
Andrew Gelman | Professor of Statistics and Political Science and Director of the Applied Statistics Center, Columbia University

Friday, March 16, 2018

Andrew Gelman — Gaydar and the fallacy of objective measurement

Stripping a phenomenon of its social context, normalizing a base rate to 50%, and seeking an on-off decision: all of these can give the feel of scientific objectivity—but the very steps taken to ensure objectivity can remove social context and relevance.
Statistical Modeling, Causal Inference, and Social Science
Gaydar and the fallacy of objective measurement
Andrew Gelman | Professor of Statistics and Political Science and Director of the Applied Statistics Center, Columbia University

Wednesday, March 14, 2018

Andrew Gelman — Classical hypothesis testing is really really hard


Take the test. 😁 (Just kidding unless you are a statistician.)

Statistical Modeling, Causal Inference, and Social Science
Classical hypothesis testing is really really hard
Andrew Gelman | Professor of Statistics and Political Science and Director of the Applied Statistics Center, Columbia University

Wednesday, December 13, 2017

Andrew Gelman — Yes, you can do statistical inference from nonrandom samples. Which is a good thing, considering that nonrandom samples are pretty much all we’ve got.

To put it another way: Sure, it’s fine to say that you “cannot reach external validity” from your sample alone. But in the meantime you still need to make decisions. We don’t throw away the entire polling industry just cos their response rates are below 10%; we work on doing better. Our samples are never perfect but we can make them closer to the population.
Remember the Chestertonian principle that extreme skepticism is a form of credulity.
Making assumptions is necessary. However, it is also necessary to recognize and acknowledge limitations. Formal modeling is never more accurate than its assumptions permit, no matter how rigorous the math.

Reasoning is a tool of intelligence. It is not a magic wand. Taking reasoning for a magic wand because it is highly formalized is magical thinking.

It is important to distinguish necessity from contingency. Necessity is based on logical necessity (tautology) and logical impossibility (contradiction). These are purely syntactical, that is, based on applying rules to signs. Logical necessity is probability one; contradiction is probability zero. All description is contingent on observation.

Statistics is a reasoning tool for dealing with contingency. The formal aspect of the tool does not vary, but its application is dependent on assumption and measurement. Thinking that the results will be the same owing to the invariant formal aspect is a mistake. Results can never be more precise than measurements or more accurate than assumptions permit, no matter how rigorous the formal methods applied.

Statistical Modeling, Causal Inference, and Social Science
Yes, you can do statistical inference from nonrandom samples. Which is a good thing, considering that nonrandom samples are pretty much all we’ve got.
Andrew Gelman | Professor of Statistics and Political Science and Director of the Applied Statistics Center, Columbia University

Thursday, November 2, 2017

Noah Smith — Why 'Statistical Significance' Is Often Insignificant

The knives are out for the p-value. This statistical quantity is the Holy Grail for empirical researchers across the world -- if your study finds the right p-value, you can get published in a credible journal, and possibly get a good university tenure-track job and research funding. Now a growing chorus of voices wants to de-emphasize or even ban this magic number. But the crusade against p-values is likely to be a distraction from the real problems afflicting scientific inquiry....
The real danger is that when each study represents only a very weak signal of scientific truth, science gets less and less productive. Ever more researchers and ever more studies are needed to confirm each result. This process might be one reason new ideas seem to be getting more expensive to find.
If we want to fix science, p-values are the least of our problems. We need to change the incentive for researchers to prove themselves by publishing questionable studies that just end up wasting a lot of time and effort.
There is a difference between proving that one has the ability to use the tools of one's trade and using those tools to produce authentic, useful, and elegant output.

Saturday, October 28, 2017

Andrew Gelman — My favorite definition of statistical significance

From my 2009 paper with Weakliem:
Throughout, we use the term statistically significant in the conventional way, to mean that an estimate is at least two standard errors away from some “null hypothesis” or prespecified value that would indicate no effect present. An estimate is statistically insignificant if the observed value could reasonably be explained by simple chance variation, much in the way that a sequence of 20 coin tosses might happen to come up 8 heads and 12 tails; we would say that this result is not statistically significantly different from chance. More precisely, the observed proportion of heads is 40 percent but with a standard error of 11 percent—thus, the data are less than two standard errors away from the null hypothesis of 50 percent, and the outcome could clearly have occurred by chance. Standard error is a measure of the variation in an estimate and gets smaller as a sample size gets larger, converging on zero as the sample increases in size.
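Gelman's coin-toss numbers can be checked directly with the textbook standard error for a proportion:

```python
import math

# 20 tosses, 8 heads: is this "significantly" different from a fair coin?
n, heads = 20, 8
p_hat = heads / n                        # 0.40 observed proportion
se = math.sqrt(p_hat * (1 - p_hat) / n)  # ~0.11 standard error
z = (p_hat - 0.5) / se                   # ~ -0.91

# |z| < 2: less than two standard errors from the null of 50%, so
# not statistically significantly different from chance, as quoted.
```

This reproduces the 40 percent plus-or-minus 11 percent in the passage, and shows why doubling the sample size shrinks the standard error by a factor of roughly 1.4.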
Statistical Modeling, Causal Inference, and Social Science
My favorite definition of statistical significance
Andrew Gelman | Professor of Statistics and Political Science and Director of the Applied Statistics Center, Columbia University

Tuesday, October 3, 2017

Andrew Gelman — When considering proposals for redefining or abandoning statistical significance, remember that their effects on science will only be indirect!


Summary: The end-in-view is doing good science and avoiding junk science, which is proliferating. Adjusting standards and the like are only means to an end. There are no silver bullets or magic wands. Doing good science depends on good design, accurate measurement, and replication.

Statistical Modeling, Causal Inference, and Social Science
When considering proposals for redefining or abandoning statistical significance, remember that their effects on science will only be indirect!
Andrew Gelman | Professor of Statistics and Political Science and Director of the Applied Statistics Center, Columbia University

Wednesday, September 27, 2017

Lars P. Syll — Time to abandon statistical significance

As shown over and over again when significance tests are applied, people have a tendency to read ‘not disconfirmed’ as ‘probably confirmed.’ Standard scientific methodology tells us that when there is only say a 10 % probability that pure sampling error could account for the observed difference between the data and the null hypothesis, it would be more ‘reasonable’ to conclude that we have a case of disconfirmation. Especially if we perform many independent tests of our hypothesis and they all give about the same 10 % result as our reported one, I guess most researchers would count the hypothesis as even more disconfirmed.
We should never forget that the underlying parameters we use when performing significance tests are model constructions. Our p-values mean nothing if the model is wrong. And most importantly — statistical significance tests DO NOT validate models!
Lars P. Syll’s Blog
Time to abandon statistical significance
Lars P. Syll | Professor, Malmo University

Tuesday, September 26, 2017

Abandon Statistical Significance — Blakeley B. McShane, David Gal, Andrew Gelman, Christian Robert, and Jennifer L. Tackett

Abstract

In science publishing and many areas of research, the status quo is a lexicographic decision rule in which any result is first required to have a p-value that surpasses the 0.05 threshold and only then is consideration—often scant—given to such factors as prior and related evidence, plausibility of mechanism, study design and data quality, real world costs and benefits, novelty of finding, and other factors that vary by research domain. There have been recent proposals to change the p-value threshold, but instead we recommend abandoning the null hypothesis significance testing paradigm entirely, leaving p-values as just one of many pieces of information with no privileged role in scientific publication and decision making. We argue that this radical approach is both practical and sensible.
Uncritically adopting universal rules and criteria is a sign of lazy thinking, and likely of ideological thinking, a.k.a. dogmatism, as well.

Since this move would overturn the existing scientific publishing model, it is unlikely to happen without considerable opposition. This model is key in establishing reputational credibility and advancement in the profession. Players like set rules. This is especially true in formal subjects, where training focuses on producing "the right answer" based on customary application of formal methods. The downside is groupthink and the imposition of a consensus reality.

Abandon Statistical Significance
Blakeley B. McShane, David Gal, Andrew Gelman, Christian Robert, and Jennifer L. Tackett

Monday, September 4, 2017

Andrew Gelman — Rosenbaum (1999): Choice as an Alternative to Control in Observational Studies

Paul Rosenbaum’s 1999 paper “Choice as an Alternative to Control in Observational Studies” is really thoughtful and well-written. The comments and rejoinder include an interesting exchange between Manski and Rosenbaum on external validity and the role of theories....
Importantly, most studies in social science, including economics, are necessarily observational rather than experimental. The question is how to design observational studies to make them as close as possible to experimental studies, where tight control of variables is available.

Design problems involve choices that are implicit assumptions. Designers need to choose (assume) carefully, consciously, and intentionally rather than presume, which runs the risk of hidden assumptions that might have been avoided through greater advertence.

A good example is the Reinhart and Rogoff historical study on the effects of public debt, which was vitiated by inadvertence to the different consequences of public debt under different monetary systems. MMT economists immediately pointed out that the presumption that all public debt is the same in its effects is false, owing to operational differences under different monetary regimes historically. This is actually more significant than the computational errors that were discovered subsequently and highly publicized in the media.

The R&R study was highly influential in policy formulation even though MMT economists had pointed out its flaws at the time of its release, and this led to very damaging effects when policy based on the study was implemented. This should not have happened in a professional environment.

Statistical Modeling, Causal Inference, and Social Science
Rosenbaum (1999): Choice as an Alternative to Control in Observational Studies
Andrew Gelman | Professor of Statistics and Political Science and Director of the Applied Statistics Center, Columbia University

Monday, August 21, 2017

Andrew Gelman — Publish your raw data and your speculations, then let other people do the analysis: track and field edition

There seems to be an expectation in science that the people who gather a dataset should also be the ones who analyze it. But often that doesn’t make sense: what it takes to gather relevant data has little to do with what it takes to perform a reasonable analysis. Indeed, the imperatives of analysis can even impede data-gathering, if people have confused ideas of what they can and can’t do with their data.
I’d like us to move to a world in which gathering and analysis of data are separated, in which researchers can get full credit for putting together a useful dataset, without the expectation that they perform a serious analysis. I think that could get around some research bottlenecks.
It’s my impression that this is already done in many areas of science—for example, there are public datasets on genes, and climate, and astronomy, and all sorts of areas in which many teams of researchers are studying common datasets. And in social science we have the NES, GSS, NLSY, etc. Even silly things like the Electoral Integrity Project—I don’t think these data are so great, but I appreciate the open spirit under which these data are shared....
Statistical Modeling, Causal Inference, and Social Science
Publish your raw data and your speculations, then let other people do the analysis: track and field edition
Andrew Gelman | Professor of Statistics and Political Science and Director of the Applied Statistics Center, Columbia University

Thursday, August 17, 2017

Noah Smith — "Theory vs. Data" in statistics too


Important.

I think Noah has this right. Fit the tool to the job, rather than the job to the tool.

Aristotle defined speculative knowledge in terms of causal explanation. This definition stuck although Aristotle's analysis of causality did not.
In the Posterior Analytics, Aristotle places the following crucial condition on proper knowledge: we think we have knowledge of a thing only when we have grasped its cause (APost. 71 b 9–11. Cf. APost. 94 a 20). That proper knowledge is knowledge of the cause is repeated in the Physics: we think we do not have knowledge of a thing until we have grasped its why, that is to say, its cause (Phys. 194 b 17–20). Since Aristotle obviously conceives of a causal investigation as the search for an answer to the question “why?”, and a why-question is a request for an explanation, it can be useful to think of a cause as a certain type of explanation. (My hesitation is ultimately due to the fact that not all why-questions are requests for an explanation that identifies a cause, let alone a cause in the particular sense envisioned by Aristotle.) — Stanford Encyclopedia of Philosophy
There is a distinction between reasons and causes. Some types of explanation seek only reasons, while others seek causes. Causation subsequently came to be viewed in terms of articulating mechanisms or lines of transmission (models) that are substantiated in evidence.

Explanation by reasons is different since the strict criterion of articulating mechanisms or lines of transmission that can be checked against evidence is not required.

Explanation by reasons rather than strictly by establishing causation is based on the principle of sufficient reason, which is usually credited to Spinoza and Leibniz.

In philosophical logic, two negative criteria are foundational. Valid reasoning is vitiated by 1) arguing in a circle and 2) infinite regress.

Without recourse to checking against evidence there is no stopping point in assigning causes other than stipulation, e.g. of a first cause.

However, there may be a reason for a stopping point that doesn't involve either causality based on evidence from observation or mere stipulation: for example, principles that are "self-evident" based on intuition, such as Aristotle's conception of intellectual intuition, or Kant's synthetic a priori propositions as articulated in the Critique of Pure Reason.

On the other hand, Hume argued that causality is merely an over-interpretation of constant correlation, there being no knowledge of the world other than that based on sense data. There is no observable causal link.

Cutting to the chase, scientific explanation based on causality is grounded in models that articulate causal mechanisms or lines of transmission showing how things change invariantly, which is the basis for deterministic functions. Where this is not possible, there are two other avenues. The first is explanation by giving reasons, which is the domain of speculative philosophy. The second is employing statistics to explore patterns of correlation. The question then is to what degree causal models can be gained from statistical methods, or whether that is possible at all.

This is the issue that Noah Smith's post is getting at.

Noahpinion
"Theory vs. Data" in statistics too
Noah Smith | Bloomberg View columnist