Every empirical researcher knows that randomized experiments have major advantages over observational studies in making causal inferences. Randomization of subjects to different treatment conditions ensures that the treatment groups, on average, are identical with respect to all possible characteristics of the subjects, regardless of whether those characteristics can be measured or not. If the subjects are people, for example, the treatment groups produced by randomization will be approximately equal with respect to such easily measured variables as race, sex, and age, and also approximately equal for more problematic variables like intelligence, aggressiveness, and creativity.
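The balancing property of randomization can be checked with a small simulation. The sketch below is illustrative only: the trait, the sample size, and the function name `simulate_balance` are assumptions made for the example, not part of any particular study. It assigns simulated subjects to treatment by coin flip and compares the group means of a trait the researcher never observes.

```python
import random
import statistics


def simulate_balance(n=10_000, seed=0):
    """Return the treated-minus-control difference in the mean of an
    unmeasured trait under coin-flip assignment (hypothetical setup)."""
    rng = random.Random(seed)
    # Hypothetical unmeasured trait (say, creativity), standard normal.
    trait = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Assignment is independent of the trait by construction.
    treated = [rng.random() < 0.5 for _ in range(n)]
    mean_treated = statistics.fmean(t for t, z in zip(trait, treated) if z)
    mean_control = statistics.fmean(t for t, z in zip(trait, treated) if not z)
    return mean_treated - mean_control


diff = simulate_balance()
print(f"difference in unmeasured trait means: {diff:.3f}")
```

With groups of this size the difference should be close to zero, even though the trait was never used in the assignment. In smaller samples the groups are balanced only on average, so any single randomization can still show a noticeable gap.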
In nonexperimental studies, researchers often try to approximate a randomized experiment by statistically controlling for other variables using methods such as linear regression, logistic regression, or propensity scores. While statistical control can certainly be a useful tactic, it has two major limitations. First, no matter how many variables you control for, someone can always criticize your study by suggesting that you left out some crucial variable. (Such critiques are more compelling when that crucial variable is named.) As is well known, the omission of a key covariate can lead to severe bias in estimating the effects of the variables that are included. Second, to statistically control for a variable, you have to measure it and explicitly include it in some kind of model. The problem is that some variables are notoriously difficult to measure. If the measurement is imperfect (and it usually is), this can also lead to biased estimates. So in practice, causal inference via statistical adjustment usually runs a poor second to the randomized experiment.
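Both limitations can be illustrated with a minimal simulation. Everything in the sketch below is an assumption chosen for illustration: the data-generating model, the coefficient values, and the helper names `cov`, `slope_simple`, and `slope_controlling`. The true effect of `x` on `y` is set to 1.0, and a confounder `u` influences both. The code estimates the effect of `x` three ways: omitting `u`, controlling for `u` exactly, and controlling for a noisy measurement of `u`.

```python
import random


def cov(a, b):
    """Population covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)


def slope_simple(x, y):
    """Slope from an OLS regression of y on x alone (with intercept)."""
    return cov(x, y) / cov(x, x)


def slope_controlling(x, w, y):
    """Coefficient on x from an OLS regression of y on x and w (with intercept)."""
    sxx, sww, sxw = cov(x, x), cov(w, w), cov(x, w)
    sxy, swy = cov(x, y), cov(w, y)
    return (sww * sxy - sxw * swy) / (sxx * sww - sxw ** 2)


rng = random.Random(0)
n = 20_000
# Assumed data-generating process; the true effect of x on y is 1.0.
u = [rng.gauss(0, 1) for _ in range(n)]             # confounder
x = [0.8 * ui + rng.gauss(0, 1) for ui in u]        # "treatment" depends on u
y = [xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]
u_noisy = [ui + rng.gauss(0, 1) for ui in u]        # u measured with error

b_omit = slope_simple(x, y)                  # u left out of the model
b_full = slope_controlling(x, u, y)          # u measured perfectly
b_noisy = slope_controlling(x, u_noisy, y)   # u measured imperfectly

print(f"omitting u:            {b_omit:.2f}")
print(f"controlling for u:     {b_full:.2f}")
print(f"controlling for noisy: {b_noisy:.2f}")
```

Under these assumptions, only the regression that controls for the exactly measured confounder should recover an estimate near 1.0. Omitting `u` overstates the effect, and controlling for the noisy version removes only part of that bias.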