1. Set up a model.
2. Derive an implication from your model.
3. Select/create a data set.
3a. Modify/transform the data set according to assumptions from your model. (optional)
4. Apply causal inference tests.
5. If the result is consistent with the implication from Step 2, claim support for your model.
5a. If the result is not consistent, keep it secret and then go back and tweak the model or the data set. Rinse and repeat until the result is consistent.
The vast majority of the economics profession regards this as a scientific procedure. Richard Feynman would beg to differ.
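Steps 3 through 5a describe a loop, and the loop is easy to mechanize. Here is a minimal sketch of it in Python (my illustration, nothing from the post itself; the function names, sample sizes, and the “tweak the data set” move are all invented for the example, and it assumes NumPy and SciPy are available):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def one_regression(n_obs=100):
    """OLS of y on [1, x], where x truly has no effect on y.
    Returns the p-value on x."""
    x = rng.normal(size=n_obs)
    y = rng.normal(size=n_obs)          # pure noise: the true effect is zero
    X = np.column_stack([np.ones(n_obs), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = n_obs - X.shape[1]
    se = np.sqrt(resid @ resid / dof * np.linalg.inv(X.T @ X)[1, 1])
    return 2 * stats.t.sf(abs(beta[1] / se), dof)

def rinse_and_repeat(max_tweaks=20, alpha=0.05):
    """Steps 3-5a in miniature: each pass 'tweaks' the data set (here,
    simply drawing a fresh variant of it) and stops as soon as p < alpha."""
    for tweak in range(max_tweaks):
        p = one_regression()
        if p < alpha:                    # step 5: claim support and stop
            return tweak, p
    return None                          # step 5a: keep tweaking...

runs = 500
wins = sum(rinse_and_repeat() is not None for _ in range(runs))
print(f"'significant' effects manufactured from noise: {wins / runs:.0%}")
```

With twenty quiet tries at the 5% level, something like two runs in three end with a reportable asterisk even though there is nothing there at all, and nothing in the published table records the discarded attempts.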
I found this choice Feynman quote in Paul Romer’s post about Feynman Integrity:
It’s a kind of scientific integrity, a principle of scientific thought that corresponds to a kind of utter honesty–a kind of leaning over backwards. For example, if you’re doing an experiment, you should report everything that you think might make it invalid–not only what you think is right about it: other causes that could possibly explain your results; and things you thought of that you’ve eliminated by some other experiment, and how they worked–to make sure the other fellow can tell they have been eliminated.
Details that could throw doubt on your interpretation must be given, if you know them. You must do the best you can–if you know anything at all wrong, or possibly wrong–to explain it. If you make a theory, for example, and advertise it, or put it out, then you must also put down all the facts that disagree with it, as well as those that agree with it. There is also a more subtle problem. When you have put a lot of ideas together to make an elaborate theory, you want to make sure, when explaining what it fits, that those things it fits are not just the things that gave you the idea for the theory; but that the finished theory makes something else come out right, in addition.

You don’t have to look hard to see that Feynman’s view of science is rather far removed from usual econometric practice. Note in particular the researcher’s obligation to report “other causes that could possibly explain your results”. If there are plausible theories other than yours that are “consistent with” those little significance asterisks you’re so proud of, you need to specify them. The more of them there are, and the more plausible they are, the less claim your particular model has on our acceptance. Of course, there’s also a responsibility to report all the empirical strategies you tried that didn’t give you the results you were looking for. These are not “blind alleys”; they are possible disconfirmations, and you owe it to yourself and your readers to report them and explain why you think their negative verdicts should be set aside, if in fact they should.
Finally, Feynman’s subtle problem is familiar to anyone who reads widely in the econometric literature. The researcher encounters a problem, creates a theory to explain it, tests the theory (or tries to produce results “consistent with” it), and when it works claims a sort of victory. But at a deep level this is a type of overfitting, and it impedes the ultimate purpose of scientific investigation: to develop an understanding of the world we can rely on in new situations.
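Feynman’s cure, that the finished theory should make “something else come out right, in addition”, has a simple operational analogue: take the specification the search settled on and confront it with data it never saw. A second hypothetical sketch (again mine, not the post’s; the setup, names, and sizes are invented for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def coef_and_p(x, y):
    """OLS of y on [1, x]; returns the coefficient on x and its p-value."""
    X = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    se = np.sqrt(resid @ resid / dof * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1], 2 * stats.t.sf(abs(beta[1] / se), dof)

n, n_candidates = 100, 20
y = rng.normal(size=n)                           # outcome: pure noise
candidates = rng.normal(size=(n, n_candidates))  # none of these drive y

# The search: feature whichever regressor looks most significant in-sample.
best = min(range(n_candidates), key=lambda j: coef_and_p(candidates[:, j], y)[1])
coef_in, p_in = coef_and_p(candidates[:, best], y)

# Feynman's test: the chosen theory must make something ELSE come out
# right, so re-estimate the same relationship on a fresh sample.
y_new = rng.normal(size=n)
x_new = rng.normal(size=n)                       # fresh draw of the chosen variable
coef_out, p_out = coef_and_p(x_new, y_new)

print(f"in-sample (after search): coef={coef_in:+.3f}  p={p_in:.3f}")
print(f"fresh sample (no search): coef={coef_out:+.3f}  p={p_out:.3f}")
```

In a typical run the searched-over coefficient clears the conventional threshold while the fresh-sample estimate is back to an ordinary null draw: the “finding” was a property of the search, not of the world.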
Not all econometric work is guilty of the sins Feynman describes. There’s lots of good stuff out there! But there’s also a lot of deceptive stuff, and no filter that tries to uphold scientific standards.