Thursday, January 16, 2014

Is there no racial bias precisely because it seems like there is?

Consider a criminal activity that is equally prevalent in two groups, but police arrest a larger share of group A than group B. Is this evidence for or against discrimination?

In the US debate on drug policy, this is seen as evidence of racial bias. Ezra Klein pointed out recently that similar shares of African-Americans and whites use cannabis, but that African-Americans are arrested far more often for marijuana possession. He sees this as clear evidence of racial bias: More arrests despite equal crime rates shows this clearly.

In the academic economics literature, a 2001 paper by John Knowles, Nicola Persico and Petra Todd in a leading economics journal (ungated here) presents an “empirical test” of “racial bias in motor vehicle searches” that flips this interpretation 100% upside down. I’ll explain the reasoning in detail below, but the model basically presents the same kind of statistical fact as evidence of a lack of discrimination: If the groups are equally law abiding despite one being searched more often, then this means that the police have targeted the “crime prone” group only up to the point where this targeting made them break the law at the same rate as others. The police do not care about color, only arrests, and they have only used color to the extent that it helped them predict probability of criminal activity (statistical discrimination). Equal underlying crime rates despite more arrests from one group shows this clearly.

Why the difference in interpretation? Well – the crucial underlying assumption in the economics paper is that groups perceive their risk of being stopped and searched, and that they respond rationally to the risk that their law breaking will be observed and punished. Given this, the argument goes through – without it, the whole thing breaks down. The paper is aware of this, writing in a “discussion” section that

Our model assumes that motorists respond to the probability of being searched. This assumption is key to obtaining a test for prejudice that can be applied without data on all the characteristics police use in the search decision. If motorists did not react to the probability of being searched, testing for prejudice would require data on c [defined as  “all characteristics other than race that are potentially used by the officer in the decision to search cars”].

Interestingly, though, while the paper does note the central importance of this assumption, it does not find it necessary to present any empirical evidence for it. This seems odd to me: It may well be a standard theoretical assumption, but this paper presents an empirical test that relies on this assumption being true. Empirical claims consistent with abstract theory, in other words, seem to have gotten a free pass from the referees in a top-5 journal in economics (Journal of Political Economy). The researchers even turn the tables and put the kind of argument that Ezra Klein represents on trial:

The argument that infers racism from this evidence relies on two very strong assumptions: (1) that motorists of all races are equally likely to carry drugs and (2) that motorists do not react to the probability of being searched. Relaxing these assumptions, as we do in this paper, leads to a very different kind of test.

The free pass that some economists give empirical claims provided they derive from “standard theory” is, admittedly, one of my pet peeves. Still – it seems kind of ballsy to say that they’re merely “relaxing” this assumption when they actually seem to be making two other claims that are equally strong:

  1. Assumption: Motorists of all groups perceive levels and changes in the objective probability that they will be stopped, and respond rationally to these so that, as a result
  2. Motorists of all races, ages, looks etc. are equally likely to carry drugs in equilibrium, which is where we as economists should think they are.

Now – a major caveat: the paper has already been cited more than 300 times, and for all I know the large literature may have kicked the tires and empirically tested all kinds of assumptions and implications from this paper. My own prior, however, would be that assumption 1 may be too strong – and I would like to see how robust the argument is to changes in this assumption. I’ll gladly admit to not having deep familiarity with this literature, but if I recall correctly from Reuter and MacCoun’s excellent book “Drug War Heresies” that I read some ten years ago, people grossly exaggerate their risk of being detected for lawbreaking (e.g., for speeding violations and so on). It also seems to be difficult to find evidence that the intensity of drug law enforcement efforts has any strong effect on the prevalence and intensity of use – which has become a common argument against strict drug law enforcement and in favor of decriminalization. The “behavioral” literature and work by economists such as Kip Viscusi likewise suggest that people are poor at perceiving small risks accurately.

Even on an everyday level, it is unclear to me how individuals would get information on the probability that they will be stopped – most of us will be stopped so rarely that it is hard to estimate the risk based on our own experience, and we rarely pool the relevant quantitative information with others (“I drove a total of 70 hours last year, and was stopped by the police zero times. How about you? I’m trying to get enough observations to identify my risk of being stopped – and we seem similar enough that our data can be pooled.”)
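To put a rough number on how little a stop-free driving record tells you, here is a minimal sketch. It assumes, purely for illustration, that every hour of driving is an independent trial with the same stop probability; the function name and the 95% confidence level are my own choices, not anything taken from the paper.

```python
def upper_bound_stop_rate(hours_driven: int, confidence: float = 0.95) -> float:
    """Largest hourly stop probability still consistent with zero stops.

    Illustrative assumption: each hour of driving is an independent trial
    with the same stop probability p. Zero stops in n hours then has
    probability (1 - p)**n, and we solve (1 - p)**n = 1 - confidence for p.
    """
    if hours_driven <= 0:
        raise ValueError("hours_driven must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / hours_driven)


if __name__ == "__main__":
    # 70 hours is the figure from the imagined conversation above;
    # the larger values show how slowly the bound tightens.
    for hours in (70, 700, 7000):
        bound = upper_bound_stop_rate(hours)
        print(f"{hours:>5} stop-free hours: hourly risk could be up to {bound:.4f}")
```

Under these assumptions, a year of driving without a single stop cannot distinguish a zero risk from an hourly risk of a few percent, which is a large range if you are supposed to fine-tune your behavior to it.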

As regards the second claim, that all subgroups in the population will break the law at the same rate, this seems too strong – but is what makes the paper’s “empirical test” for discrimination so simple: When the police have the same cost of stopping and searching single cars from different groups, then:

If the returns to searching [roughly: probability of a motorist being a lawbreaker] are equal across all subgroups distinguishable by police, they must also be equal across aggregations of these subgroups, which is what we can distinguish in the data. [emphasis in original]

The prediction, in other words, is that any group defined in terms of characteristics observable by the police has the same probability of breaking the law in equilibrium. Middle aged white dads with a station wagon full of kids, elderly ladies on their way to Bingo, etc., would all carry drugs in their car with the same probability as anyone else. This seems highly implausible. (Even less plausible, the model has every individual carrying drugs in their car some of the time, though this implication is “fixed” in the discussion section of the paper by introducing unobservable factors within each group.)

I’m (hopefully obviously) not saying that I’ve in any way disproved this theory based on these remarks, but chalk me up as unconvinced of the validity of the empirical test suggested in this particular instance.

This basically concludes my comments on the paper itself, which I present mostly as an interesting example of how economists’ belief in their theory can make them interpret things completely differently from other people. Most readers may want to quit here (or wish they’d never begun reading, for all I know), but for those who want to understand the theory used by the economists I’ll try to get the main ideas across in a non-technical way:

The article argues using a model built on a simple basic idea: If you were 100% certain to be stopped and have your vehicle searched, you wouldn’t carry contraband. If you were 100% certain that you would not be stopped, you would be certain to carry contraband (because Homo Oeconomicus: “Hey! Profit opportunity with no risk! Why not?”). Consequently, there is a search probability between 0 and 1 where you are indifferent between carrying and not carrying contraband, and this can be thought of as a “flip point” where you switch from one choice to the other.

Different people will belong to different groups, which can obviously be in different situations: The payoff from carrying contraband and the cost of being punished may differ, and as a result the flip points of different groups will differ as well. To keep the language neutral, we’ll call the groups high risk and low risk, where the high risk group needs to face a high search rate before they are discouraged from carrying contraband, whereas the low risk individuals require only a small search risk before they bow out of the illegal activity.

To see how this model plays out, imagine that you are a police officer whose only goal is to maximize the number of arrests you make. Assume also that the cost and time spent on a search is independent of who you stop and search. Somewhat unrealistically, we can imagine that we start from a situation with no search risk for either of the two groups.

You begin by searching a little in both groups and find high rates of lawbreaking in both groups. In fact, in the model you find that everyone you stop is a lawbreaker and can be arrested. Consequently, you spend ever more of your policing time on stopping drivers. However – this alters the incentives of the drivers. As you stop ever larger shares of motorists, the low-risk motorists stop carrying contraband. The time you spend stopping them becomes worthless as you find nothing that justifies arrest. The high risk group still breaks the law, as your search rate is still below their flip point. Consequently, you don’t increase your search intensity towards low-risk motorists, but keep stopping more and more high risk motorists instead. This keeps on until you reach their flip point – at which point you stop increasing the search rate (if you went beyond it, they would all stop carrying and you’d have no arrests).

The outcome is that the only stable equilibrium is one where each group is searched at a rate equal to their flip point. Stop them less than this, and they become more criminal and you would get a high probability of arrests per individual stopped. This would cause you to stop them more often again. Stop them more often than their flip point, and they would become less criminal and your efforts would yield a low hit-rate. The police thus act in a way that causes groups with different “criminal inclinations” to act identically in equilibrium. (This conclusion requires that the cost to the police of searching a car is the same across groups, but this is assumed throughout most of the paper and is needed for their test to work on the data they use.)

A reasonable question about this equilibrium is how the contraband-carrying rate within a group is determined. In the baseline model each individual will now be indifferent between carrying and not carrying contraband in their car. How do they decide how often to do it?

If the motorists carry drugs too often, the police will find that the value of stopping more of them is positive (value of an arrest times probability of being guilty would be higher than the stop-and-search cost). If they carry drugs too rarely, the police will want to stop them less often and the motorists would basically be leaving money on the table (because Homo Oeconomicus – positive expected return from crime). Consequently, in equilibrium, the motorists in each group carry drugs with a probability that makes the police indifferent between stopping and not stopping them: The expected value of stopping them (i.e., contraband probability times value of arrest) is then equal to the expected cost (marginal stopping-and-searching cost for the relevant group).
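The two indifference conditions described above can be written down in a few lines. This is only a sketch under assumptions of my own: the payoff numbers, the names `gain`, `penalty`, `search_cost` and `arrest_value`, and the simple linear payoff are all hypothetical stand-ins, not the paper's notation or calibration.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    gain: float     # payoff from carrying when not searched (hypothetical units)
    penalty: float  # loss from carrying when searched (hypothetical units)


def carry_payoff(group: Group, search_rate: float) -> float:
    """Expected payoff from carrying contraband at a given search rate."""
    return (1.0 - search_rate) * group.gain - search_rate * group.penalty


def flip_point(group: Group) -> float:
    """Search rate at which the group is indifferent about carrying.

    Solves (1 - s) * gain - s * penalty = 0 for s.
    """
    return group.gain / (group.gain + group.penalty)


def equilibrium_carry_rate(search_cost: float, arrest_value: float) -> float:
    """Carry rate at which the police are indifferent about one more search.

    Solves carry_rate * arrest_value = search_cost. It does not depend on
    the group as long as the search cost is the same for every group.
    """
    return search_cost / arrest_value


if __name__ == "__main__":
    groups = [
        Group("high risk", gain=9.0, penalty=1.0),
        Group("low risk", gain=1.0, penalty=9.0),
    ]
    hit_rate = equilibrium_carry_rate(search_cost=1.0, arrest_value=4.0)
    for g in groups:
        print(
            f"{g.name}: searched at rate {flip_point(g):.2f}, "
            f"hit rate per search {hit_rate:.2f}"
        )
```

With these made-up numbers the two groups are searched at very different rates, yet the share of searches that turn up contraband is identical. That is exactly the pattern the paper reads as absence of prejudice, and it only follows because the motorists are assumed to know and respond to their own search rate.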

As noted earlier in the post – the baseline model assumes that all individuals “flip a coin” or “roll a die” every morning to decide whether or not to traffic drugs that day, but this can be altered by assuming unobservable characteristics that differ within each group, making some of them more and some of them less crime prone.

Friday, March 8, 2013

Cannabis, IQ and socio-economic status in the Dunedin data - an update

I’ll start with a short recap: Researchers published an article in August 2012 arguing that adolescent-onset cannabis smoking harms adolescent brains and causes IQ to decline. I responded with an article available here arguing that their methods were insufficient to establish a causal link, and that non-cognitive traits (ambition, self-control, personality, interests etc) would influence risks of adolescent-onset cannabis use while also potentially altering IQs by influencing your education, occupation, choice of peers etc. For various reasons, I argued that this could show up in their data as IQ-trends that differed by socioeconomic status (SES), and suggested a number of analyses that would help clarify whether their effect was biased due to confounding and selection-effects. In a reply this week (gated, I think), the researchers show that there is no systematic IQ-trend difference across three SES groups they’ve constructed. However, as I note in my reply (available here), they still fail to tell us how different the groups of cannabis users (never users, adolescent-onset users with long history of dependence etc) were on other dimensions, and they still fail to control for non-cognitive factors and early childhood experiences in any of the ways I proposed. In fact, none of the data or analyses that my article asked for have been provided, and the researchers conclude with a puzzling claim that randomized clinical trials only show “potential” effects while observational studies are needed to show “whether cannabis actually is impairing cognition in the real world and how much.”

In light of the response, it seems I have done a very poor job of communicating my point. The researchers reduce my entire argument to a temporary effect of schooling on low-SES children, so let me try one last (?) time:

Are you strongly confident that cannabis is the only thing that systematically affects IQ after the age of 13? If so, the original research design may seem OK: Look at IQ trends for those who used a lot of cannabis and compare to IQ trends of those who used little. If nothing else systematically affects IQ, there is no need to know how similar or different these groups are in other ways. It is irrelevant, just as we don’t need to know the color of a falling ball to calculate its speed.

However, if you think other things may or are likely to affect IQ after the age of 13, such as education, genes, early childhood experiences, then we need to know more about the people who used a lot and the people who used only a little cannabis or none at all. In my original article I gave several references supporting the claim that IQ-trends are affected by environment. I also noted that past research has found the heritability of IQ to increase with age. A common interpretation of this is that our genes influence our non-cognitive traits. As long as we are young, we usually have to live with our parents and attend the neighborhood school. As we age, our non-cognitive traits have an increasingly strong effect on where we end up - what environment we are in, what friends we have, what activities we participate in etc. Our genes thus influence our future environment, and the cognitive challenges from our environment influence our IQ.

In light of this, it seems (to me) reasonable to ask for information on other differences between their groups. What we know, we have to glean from other research based on the same data. Some of this was referenced in my original article, but to give two simple examples: One of the researchers once described the cannabis-dependent 21-year olds in their data as having

had a long history of anti-social behavior, going right back to when they were three years old. They were being naughty, beating up other kids in the sandpit, being disruptive, then they went to stealing milk money, then they went to beating up bigger kids in the schoolground, then they converted a car … It goes on and on and on [...] When stuff doesn’t work out right they just resort to violence.

More recently, the Dunedin data were used in research that found women with more sexual partners far more likely to become dependent on alcohol or cannabis. Women reporting 2.5 or more partners per year when 18-20 years old, were almost 10 times more likely to be dependent on alcohol or cannabis at 21. My point is that this indicates that the early-onset cannabis users who go on to dependence do differ systematically from those who start later or never use, and that these differences may be related to underlying “non-cognitive traits” that would also affect their lives, environments and thus IQ independently of their cannabis use.

The Dunedin group apparently see such traits as irrelevant to their argument. At times, they even underplay the numbers they presented in their own article on the subject: In their response to my article, they write that “Many young cannabis users opted out of education, but that did not account for their IQ drop.” However, their original numbers indicated that education substantially affected the size of the cannabis-use effect: The difference between non-users and adolescent-onset cannabis users with long-term dependence was markedly different for people with different educational levels. This led the authors to write that “among the subset with a high-school education or less, persistent cannabis users experienced greater decline.” As I noted in my article, the magnitude of the “effect” (IQ change of highest-exposure group minus IQ change of no-exposure group) was twice as large for those with high-school or less compared to the same effect for those with more education.
How important you think these selection issues are will of course differ with your prior beliefs about the importance of various IQ-determinants. As long as the Dunedin data remains difficult to access for other researchers, there is little I can do to examine these things myself. I suggested a number of analyses and robustness checks, but the researchers were not interested in pursuing these and reduced my argument to “school temporarily raises low-SES IQ”. 

This misinterpretation of my article’s argument is to some (a large?) extent my own fault: While my article does discuss non-cognitive traits, rising heritability of IQ, and proposes a number of analyses to cope with the complications these raise - I too often use the shorthand “SES” rather than “non-cognitive traits correlated with SES.” It would have been clearer and better if I had first discussed the importance of non-cognitive traits in general, and then introduced the hypothesis that this would show up as differing IQ-trajectories across SES groups. That would have made my alternative causal model (non-cognitive traits have increasing influence over environments as people age, and the environment you end up in influences your IQ) clearer. My bad. I tried to remedy this by running the new 500-word reply in PNAS by a number of colleagues and friends before publishing, rewriting extensively to try and make my causal model clearer while also a) explaining why I thought (wrongly, it now seems) that this would show up as differing IQ-trends for different SES groups, and b) clarifying my more general methodological points and the extent to which they still remain relevant.
While some of the cause for misinterpretation is likely due to my own communicative skills, there is also a difference in methodological attitudes at work: In empirical labor economics, researchers are very concerned with selection effects, and you need to have credible, “plausibly exogenous” variation in causal variables for your effects to be accepted as causal. In contrast, the Dunedin researchers write that randomized clinical trials only show “potential” effects while observational studies are needed to show “whether cannabis actually is impairing cognition in the real world and how much.”

To me, this sounds very odd. We have several instances of randomized clinical trials contradicting effects identified in a large number of observational studies. Three of the most famous ones are described here (possibly gated): Hormone replacement therapy was thought to reduce female coronary heart disease risk but may actually increase it, beta-carotene seemed to reduce cancer risk in observational studies but actually increased it, and vitamin C had no effect on heart disease risk while observational studies indicated it was protective. Closer to the subject matter at hand, a 2007 meta-review of observational studies in the Lancet indicated a strong causal effect of cannabis use on schizophrenia risk. Some researchers pointed out that since increasing shares of young people had been using cannabis, this implied that the number of UK schizophrenia cases should rise strongly, but this didn’t happen and the importance of the link is now again in doubt.

What all these cases have in common is that there seemed to be convincing evidence from observational studies that there was an effect, but it turned out that the effect was largely due to subtle forms of confounding. The examples certainly do not show that, e.g., beta-carotene has a “potential” negative effect that is “actually” positive in everyday life - though that is what the argument from the Dunedin researchers seems to state. Instead, these cases show that causal inference from observational data is difficult. This is the perspective from which my argument comes. I don’t claim that the correlation observed in the Dunedin data is actually fully accounted for by non-cognitive traits. I argue that they have yet to tell us how groups defined by cannabis use patterns differ on other dimensions, and that they have yet to show us how robust their effect estimates are to “controls for causal back channels unrelated to neurotoxicity, simultaneous inclusion of multiple potential confounders, and changes to their statistical model.”

Tuesday, January 15, 2013

Did pre-release harm the research process? And what is up with this cannabis and IQ research?

Did pre-release of my PNAS paper (here, but gated) on methodological problems with Meier et al’s 2012 paper on cannabis and IQ reduce the chances that it will have its intended effect? In my case, serious methodological issues related to causal inference from non-random observational data became framed as a conflict over conclusions, forcing the original research team to respond rapidly and insufficiently to my concerns, and prompting them to defend their conclusions and original paper in a way that makes a later, more comprehensive reanalysis of their data less likely.

I understand that pre-releasing papers to journalists raises interest in the research work and allows reporters to hit the ground running. But does it also hurt science? In my case, I think the answer may be “yes”.

Others have discussed how embargoed papers affect science journalism (Ed Yong has a good write-up here), but my question is whether the research process itself might suffer - at least for some types of papers: In my case, I wrote a research paper that discussed methodological issues in a previous study and suggested that their study failed to control for possible confounders in an appropriate way. I also suggested a number of methods and analyses that might help to address these shortcomings. As the press received the embargoed paper, some of them called the original researchers and told them some guy claimed their results were wrong, and what did they have to say about that? Rather than reflect on my reasoning and suggestions, this forced the original researchers to react in step with the news cycle: Within something like 24 hrs of the time they first saw my study, they released a statement to the press (available here) where they brushed off my points with reference to some new analyses. In my opinion, the new analyses they referred to were both insufficient to address my points and difficult to assess since they were presented with no details. Some of them, such as the claim that average IQ change was zero within each of three SES levels they constructed, were quite interesting and merit closer review. Other points, such as the claim that there was a relationship even within mid-SES individuals (they didn’t report whether the effects were the same or smaller) have more limited relevance (see below). However, it seemed (at the time) urgent and important to respond before the journalists “went to press,” and I ended up writing a hasty reply so that I had a response I could make available to the journalists. This - it seems to me - is not conducive to good scientific dialogue.
Not only should it be possible to breathe and think about things before pressing “send,” but the discussion can easily veer off into the issues of concern to journalists rather than the more important methodological issues that should be of concern.

Before continuing, let me be clear that my point is not to criticize the science journalists: It is natural (and correct) for them to ask the original authors for a response, and several of the reporters I was in touch with pleasantly surprised me with their level of detail, intellectual curiosity and incisive questions. To some extent it may well be that the embargo just exacerbates the issue, and that the main “problem” (or challenge) is the massive media interest and the quick-response demands that this creates. I don’t have any clear conclusions as to how this could be improved, but I want to note how this may have affected the research debate on IQ and cannabis use.
The various claims flying around are reported in a number of news articles, now that the press embargo has lifted (just google: meier rogeberg cannabis). The response from the original research team has been made available on-line by two of the original researchers, and the lead author of the original study has used it as the basis for an online piece stating that I am flat-out just wrong. There is thus a risk that the researchers have painted themselves into a corner psychologically: By defending their original claim and methodology rather than being open to a proper re-examination of the evidence, it has become more difficult for them to do a fair analysis later without losing face if their original effect estimates were exaggerated or turn out to be non-robust.
I find this a bit disappointing, as well as sad. If the original conclusions were correct, they would hold up in the new analyses I proposed - leaving their conclusions all the more strong as a result. If their effect was overestimated (due to confounding) or even negligible or zero after better controls, surely that should be seen as a positive outcome as well: More important than what results we get is, after all, making sure that our results are as correct and credible as we can make them.
To explain why this matters, let me try to get the important methodological issues across in a clear way to those who are interested: Basically, the original paper (which is available here) used a simple variant of a difference-in-differences analysis. The researchers sorted people into groups according to whether or not they had used cannabis and according to the number of times they had been scored as dependent. They then compared IQ-changes between age 13 and 38 across these groups, and found that IQ declined more in the groups with heavier cannabis-exposure. The effect seemed to be driven by adolescent-onset smokers, and it seemed to persist after they quit smoking.
The data used for this study was stunning: Participants in the Dunedin Study, a group of roughly 1000 individuals born within 12 months of one another in the city of Dunedin in New Zealand, had been followed from birth to age 38. They had been measured regularly and scored on a number of dimensions through interviews, IQ tests, teacher and parent interviews, blood-samples etc, and are probably amongst the most intensively researched people on the planet: The study website states that roughly 1100 publications have been based on the sample so far, which is more than one publication per participant on average ;)
Despite this impressive data, there were some things I found wanting in the analysis. My own experience with difference in differences methods comes from empirical labor economics, and this experience had led me to expect a number of robustness checks and supporting analyses that this article lacked. This is not surprising: Different disciplines can face similar methodological issues, yet still develop more or less independently of each other. In such situations, however, there will often be good reasons for “cross-pollination” of practices and methods. For instance, experimental economics owes a large debt to psychology, and the use of randomized field trials in development and labor economics owes a large debt to the use of randomized clinical trials in medicine.
The cannabis-and-IQ analysis basically compares average changes in IQ across groups with different cannabis use patterns. Since we haven’t randomized “cannabis use patterns” over the participants, we have an obvious and important selection issue: The traits or circumstances that caused some people to begin smoking pot early, and that caused some of these to become heavily dependent for a long time, can themselves be associated with (or be) variables that also affect the outcome we are interested in. The central assumption, in other words, is that the groups would have had the same IQ-development if their cannabis use had been similar. Since this is the central assumption required for this method to validly identify an effect of cannabis, it is crucial that the researchers provide evidence sufficient to evaluate the appropriateness of this assumption. To be specific, and to show what kind of things I wanted the researchers to provide, you would want to:
  • Establish that the units compared were similar prior to the treatment being studied - e.g., provide a table showing how the different cannabis-exposure groups differed prior to treatment on a number of variables.
  • Establish a common trend - Since the identifying assumption is that the groups would have had the same development if they had had the same “treatment”, then clearly the development prior to the treatments should be similar. In the Dunedin study, they measured IQ at a number of ages, and average IQ changes in various periods could be shown for each group of cannabis users.
  • Control for different sets of possible confounders. To show that the estimates that are of interest are robust, you would want to show estimates for a number of multivariate regressions that control for increasing numbers (and types) of potential confounders. The stability of the estimated effects and their magnitude can then be assessed, and the danger of confounding better evaluated: What happens if you add risk factors that are associated with poor life outcomes (childhood peer rejection, conduct disorders etc), or if you include measures of education, jailtime, unemployment, etc.? If the effect estimate of cannabis on IQ changes a lot, then this suggests that selection issues are important, and that confounders (both known and unknown) must be taken seriously. Adding important confounders will also help estimation of the effect we are interested in: Since they explain variance within each group (as well as some of the variance between the groups), they help reduce standard errors on the estimates of interest.
  • Establish sensitivity of results to methodological choices. Just as we want to know how sensitive our results are to the control variables we add, we also want to know how sensitive they are to the specific methodological choices we have made. In this instance, it would be interesting to allow for pre-existing individual level trends: Assume that people have different linear trends to begin with. To what extent are these differing pre-existing trends shifted in similar ways by later use patterns of cannabis? By adding in earlier IQ-measurements for each individual (which are available from the Dunedin study), such “random growth estimators” would be able to account for any (known or unknown) cause that systematically affected individual trajectories in both pre- and post-treatment periods. Another example is the linear trend variable they use for cannabis exposure, which presumably gives a score of 1 to never users, 2 to users who were never dependent, 3 to those scored as dependent once and so on. This is the variable that they check for significance - and it would be
  • Provide other diagnostic analyses, for instance by considering the variance of the outcome variable within each treatment group (how much did IQ change differ within each treatment group?). In this way, we could tell whether we seemed to be dealing with a very clear, uniform effect that affects most individuals equally, or whether it was a very heterogeneous effect whose average value was largely driven by high-impact subgroups.
  • Discuss alternative mechanisms. What potential mechanisms can be behind this, and what alternative tests can we develop to distinguish between these? For instance, let us say you identify what seems to be a causal effect of cannabis use and dependency, but its magnitude is strongly reduced (but not eliminated) when you add in various potential confounders, such as educational level. As the authors of the original paper note (when education turns out to affect the effect size), education could be a mediating factor in the causal process whereby cannabis affects IQ. However, this would mean that the permanent, neurotoxic effect they are most concerned with would be smaller, because part of the measured effect would be due to the effect of cannabis on education multiplied by the effect of this education on IQ. The evidence thus suggests that the direct “neurotoxic” effect is only part of what is going on. It also suggests that we might want to look for evidence to assess how strongly cannabis use causally affects education, to better understand the determinants of this process. For instance, even if there was only a temporary effect of cannabis on cognition, ongoing smokers would do more poorly in school or college, which might then influence later job prospects and long term IQ. The effect doesn’t even have to be through IQ: If pot smoking makes you less ambitious (either because of stoner subculture or psychological effects), the effect may still have long term consequences by altering educational choices and performance. Put differently: If the mechanism is via school, then even transitory effects of cannabis become important when they coincide with the period of education.
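The decomposition in the last bullet (direct effect plus the effect of cannabis on education multiplied by the effect of education on IQ) can be illustrated with simulated data. This is a minimal sketch, not an analysis of the Dunedin data: the effect sizes `direct`, `a` and `b`, the sample size and the variable names are all hypothetical, chosen only to show how controlling for a mediator moves the estimated coefficient from the total effect towards the direct effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical effect sizes, chosen only for illustration.
direct = -2.0   # direct effect of cannabis use on IQ change
a = -1.0        # effect of cannabis use on (standardized) education
b = 1.5         # effect of education on IQ change

cannabis = rng.integers(0, 2, size=n).astype(float)  # 0 = non-user, 1 = user
education = a * cannabis + rng.normal(0.0, 1.0, size=n)
iq_change = direct * cannabis + b * education + rng.normal(0.0, 5.0, size=n)


def ols(y, *regressors):
    """Least-squares coefficients for y on an intercept plus the regressors."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Coefficient on cannabis without and with the education control.
total_effect = ols(iq_change, cannabis)[1]
direct_effect = ols(iq_change, cannabis, education)[1]

print(f"without education control: {total_effect:.2f}")
print(f"with education control:    {direct_effect:.2f}")
print(f"direct + a * b:            {direct + a * b:.2f}")
```

In this toy setup the uncontrolled coefficient should land close to `direct + a * b`, while the controlled coefficient should land close to `direct`. Whether education really is a mediator rather than a confounder is exactly what such a regression cannot tell you on its own.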
When I originally started looking into this last August, I sent an e-mail to the corresponding author asking for a couple of tables with information on “pre-treatment” differences between the exposure groups. I did not receive this. This is quite understandable, given that they were experiencing a media-blitz and most likely had their hands full. I therefore turned to past publications on the Dunedin cohort to see if I could find the relevant information there.

It turned out that I could - to some extent. Early onset cannabis use appeared to be correlated with a number of risk factors, and these risk factors were also correlated with poor life outcomes (low or poor education, crime, low income, etc.). The risk factors were also correlated with socioeconomic status.

The next question was whether these factors could affect IQ. One recent model of IQ (the Flynn-Dickens model) strongly suggested they would. The model sees IQ as a style or habit of thinking - a mental muscle, if you like - which is influenced by the cognitive demands of your recent environment. School, home environment, jobs and even the smartness of your friends are seen as in a feedback loop with IQ: High initial IQ gives you an interest in (and access to) the environments that in turn support and strengthen IQ. Since the risk factors mentioned above would serve to push you away from such cognitively demanding environments, it seemed plausible that they would affect long term IQ negatively by pushing you into poorer environments than your initial IQ would have suggested.

A couple of further parts to this potential mechanism can be noted (both discussed here): It seems that high-SES kids have a higher heritability of IQ than low-SES kids, which researchers often interpret as due to environmental thresholds: If your environment is sufficiently good, variation in your environment will have small effects on your IQ. If, however, your environment is poorer, similar variation will have larger effects. Put differently: The IQ of low-SES kids is more affected by changes to their environment than that of high-SES kids.

Also, there is a (somewhat counterintuitive, at first glance) result which shows that average IQ heritability increases with age. One interpretation of this is that our genetic disposition causes us to self-select or be sorted into specific environments as we age. The environment we end up with is therefore more determined by our genetic heritage than our childhood environment, where our family and school were, in some sense, “forced environments.”

In my research article, I refer to various empirical studies supporting these mechanisms and their effects, for instance past studies that find SES, jailtime, and education to be associated with the rate of change in cognitive abilities at different ages. Putting these pieces together, the risk factors that make you more likely to take up pot smoking in adolescence, and that raise your risk of becoming dependent, also shift you into poorer environments than your initial IQ would predict in isolation. Additionally, these shifts are more likely for kids in lower-SES groups (since the risk factors are correlated with SES), and these kids also have an IQ more sensitive to environmental changes. Finally, for the same reason, the forced environment of schooling is likely to raise childhood IQ more for the low SES kids (because it is a larger improvement on their prior environments, and because their IQs are more sensitive to environmental influences). SES, then, is in some sense a summary variable that is related to a number of the relevant factors, in that low SES

  1. correlates with risk factors that influence, on the one hand, adolescent cannabis use and dependency and, on the other hand, poorer life outcomes,
  2. signals a heightened sensitivity to environmental factors (the SES-heritability difference in childhood), and
  3. probably reflects the magnitude of the extra cognitive demands imposed by school relative to home environment.

For these reasons, SES seemed like a good variable to use in a mathematical model to capture these relationships. However, it should be obvious from my description of this mechanism that we should expect the mechanism to work even within a socioeconomic group: Even within this group, those with high levels of risk factors will experience poorer life outcomes, which may reduce their IQs. They will also most likely have higher probabilities of beginning cannabis smoking. At the same time, we would expect a smaller effect within a specific socioeconomic group than we would across the whole population.

However, I simplified this by using SES in three levels and created a mathematical model with these effects, using effect sizes drawn from past research literature where I could find them. Using the methods used in the original study, I tested my simulated data and found the statistical methods identified the same type and magnitude of effects here as they had in the actual study data. This, of course, does not prove or establish that there is no effect of cannabis on IQ. What it does is to show that the methods they used were insufficient to rule out other hypotheses, that the original effect estimates may be overestimated, and that we need to look more deeply into the matter, using the kind of robustness checks and specification tests I discussed above.
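The logic of such a simulation can be sketched in a few lines. To be clear, this is a toy version written for illustration, not the model from the research article: the three SES shares, the usage probabilities, the IQ shifts and the noise level are all made-up numbers, and the comparison of group means stands in for the statistical methods of the original study. Cannabis has no causal effect on IQ in this simulated world.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300_000

# Three SES levels: 0 = low, 1 = middle, 2 = high (equal shares, an assumption).
ses = rng.integers(0, 3, size=n)

# Hypothetical numbers: lower SES raises the chance of adolescent cannabis use
# and lowers the expected IQ change between childhood and adulthood.
p_use = np.array([0.45, 0.25, 0.10])[ses]
mean_change = np.array([-4.0, 0.0, 4.0])[ses]

cannabis = rng.random(n) < p_use
# Cannabis does not enter this equation: its true effect is zero by construction.
iq_change = mean_change + rng.normal(0.0, 8.0, size=n)

# Naive comparison of users and non-users, ignoring SES.
naive_gap = iq_change[cannabis].mean() - iq_change[~cannabis].mean()

# Compare users and non-users within each SES level, then average the gaps.
within_gaps = [
    iq_change[cannabis & (ses == s)].mean() - iq_change[~cannabis & (ses == s)].mean()
    for s in range(3)
]
adjusted_gap = float(np.mean(within_gaps))

print(f"naive user vs non-user gap in IQ change: {naive_gap:.2f}")
print(f"average gap within SES levels:           {adjusted_gap:.2f}")
```

The naive gap comes out clearly negative even though cannabis does nothing here, while the within-SES gap hovers around zero. In real data the relevant risk factors also vary within each SES level, so stratifying on SES alone would not be expected to remove the whole problem.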

In my mind, this should be just the normal process of science - an ongoing dialogue between different researchers. We know that replications of results often fail, and that acting on flawed results can have negative consequences (see here for an interesting popular science account of one such case). A statistical model by medical researcher Ioannidis (at the centre of this entertaining profile) suggests that new results based on exploratory epidemiological studies of observational data will be wrong 80% of the time. The Dunedin study on cannabis and IQ would, it seems, fit into this category. After all, by the time you’ve published more than 1100 papers on a group of individuals, it seems relatively safe to say that you have moved into “exploratory” mode.
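A rough sketch of how a figure of that size can arise: treat a published result as a positive test, and ask what share of positives reflect a true relationship. The function below derives that share from four inputs. The specific values (prior odds of 1:10 that a tested relationship is real, 80% power, a 5% significance level, and a bias share of 0.3) are assumptions picked to represent an exploratory setting, and the function name and parameter names are likewise illustrative.

```python
def positive_predictive_value(prior_odds, power, alpha, bias):
    """Share of reported positive findings that reflect a true relationship.

    prior_odds: odds that a tested relationship is real (true : null)
    power:      probability that a real relationship is detected
    alpha:      probability of a false alarm when there is no relationship
    bias:       share of would-be negative results that get reported as positive
    """
    # True relationships are reported as positive when the test detects them,
    # or when bias turns a miss into a reported finding.
    true_positive = prior_odds * (power + bias * (1.0 - power))
    # Null relationships are reported as positive through false alarms,
    # or when bias turns a correct negative into a reported finding.
    false_positive = alpha + bias * (1.0 - alpha)
    return true_positive / (true_positive + false_positive)


ppv = positive_predictive_value(prior_odds=0.1, power=0.8, alpha=0.05, bias=0.3)
print(f"share of reported findings that are true:  {ppv:.2f}")
print(f"share of reported findings that are wrong: {1 - ppv:.2f}")
```

With these inputs roughly one reported finding in five is true. The result is quite sensitive to the prior odds and the bias share, which are exactly the quantities that are hardest to pin down for any particular study.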

In light of this, critically assessing results and methods and proposing alternative explanations and further tests should be an everyday and expected part of research work. Such work is particularly important in cases like the Dunedin study, where the data involved is both costly and time consuming to construct, and thus very rare. As noted recently by Gary Marcus in a G+ comment (second comment here), flawed results based on such data are likely to persist for a “really long time” if we are to wait for other researchers to replicate the analyses on other data.

And that, finally, brings us back to the end. I remain hopeful that the original researchers will return to their data and address my methodological points properly: How robust and credible is the effect, and how sensitive is the effect magnitude to different sets of controls and methodological choices? However, I am wary that the pre-release to the press and the quick back-and-forth exchanges and position-taking this seems to have caused have reduced the likelihood of this taking place.

Wednesday, November 23, 2011

Economic models in theory and practice

Michael Woodford has a nice essay at INET where he responds to John Kay’s plea for a changed economics. In it, Woodford presents a number of arguments in favor of economic models that I think are valid and useful, but I don’t think he successfully defends the way economists use models in practice. Instead, he defends a different way of using models that would be more defensible.
Let’s consider his arguments in favor of mathematical models.
Argument 1: Precision
Models allow the internal consistency of a proposed argument to be checked with greater precision;
True, if the argument allows for translation into mathematical form. There’s an old Keynes quote on this:
Much economic theorizing to-day suffers, I think, because it attempts to apply highly precise and mathematical methods to material which is itself much too vague to support such treatment.
Sometimes, surely, the “vagueness of the material” is a shortcoming that makes an argument sound more sensible than it is. In that case, forcing it into mathematical form forces us to clarify what we actually mean and makes it harder to “weasel.” In other cases, however, formal methods force us to “sharpen” assumptions in a way that changes the argument itself. There’s a difference between saying that people have some thoughts on how gasoline prices will be in the future, and saying that people have subjective beliefs about future gas price trajectories that can be defined as a probability distribution over all possible price paths.
Argument 2: Differentiation/clarification
Back to Woodford:
they allow more finely-grained differentiation among alternative hypotheses
This is true – in so far as all the alternative hypotheses can be translated into this common language. For instance, the philosopher Jon Elster has written on Gary Becker’s rational addiction work that
Although I disagree sharply with much of it, it has raised the level of discussion enormously. Before Becker, most explanations of addiction did not involve choice at all, much less rational choice. By arguing that addiction is a form of rational behavior, Becker offers other scholars the choice between agreeing with him or trying to identify exactly where he goes wrong. Whatever option we take (I'm going to take the second), our understanding of addiction will be sharpened and focused.
This sounds fine, until you try to read the literature and discussions and realize that economists rarely find it interesting (or possible?) to discuss the specification of a model taken as a serious hypothesis about a causal mechanism in the world. As Woodford says elsewhere in his essay, when you want to conduct economic analysis with a mathematical model,
An assessment of the realism of the assumptions made in the model is essential --- not, of course, an assessment of whether the model literally describes all aspects of the world, which is never the case, but an assessment of the realism of what the model assumes about those aspects of the world that the model pretends to represent. It is also important to assess the robustness of the model’s conclusions to variations in the precise assumptions that are made, at least over some range of possible assumptions that can all be regarded as potentially of empirical relevance. These kinds of critical scrutiny are crucial to the sensible use of models for practical purposes.
However, in many parts of economics, this kind of discussion is seen as irrelevant or silly. If you persist in trying to discuss “the realism of what the model assumes about those aspects of the world that the model pretends to represent” you’ll just be a person who doesn’t “get” economics.
Consider the rational addiction model of Becker and Murphy and the others that followed in its wake. As I’ve written with a colleague here,
The core of the causal insight claims from rational addiction research is that people behave in a certain way (i.e. exhibit addictive behavior) because they face and solve a specific type of choice problem. Yet rational addiction researchers show no interest in empirically examining the actual choice problem – the preferences, beliefs, and choice processes – of the people whose behavior they claim to be explaining. Becker has even suggested that the rational choice process occurs at some subconscious level that the acting subject is unaware of, making human introspection irrelevant and leaving us no known way to gather relevant data
In addition: Trying to examine whether the causal mechanisms described are at all plausible or consistent with evidence is seen as irrelevant or weird. It’s an exercise only philosophers like the above quoted Jon Elster and oddballs like myself seem to be interested in (I first wrote on this in an article called “Taking absurd theories seriously”). “Real” economists seem content to wave their hands and say “as-if” or “these are just standard assumptions.”
Argument 3: Enables complexity
[Models] allow longer and more subtle chains of reasoning to be deployed without both author and reader becoming hopelessly tangled in them.
This claim by Woodford is similar to Krugman’s claim in “Two cheers for formalism” (a piece originally published in the Economic Journal):
Most of the topics on which economists hold views that are both different from "common sense" and unambiguously closer to the truth than popular beliefs involve some form of adding-up constraint, indirect chain of causation, feedback effect, etc.. Why can economists keep such things straight when even highly intelligent non-economists cannot? Because they have used mathematical models to help focus and form their intuition.
This sounds sensible: One could argue that individuals face relatively simple problems (“how much milk do I feel like drinking now?”), but that we need formal tools to understand what happens when they interact in markets or firms or whatever. However, this wouldn’t really be true. The argument is particularly off if you’re into rational expectations – in which case you want to impose the requirement that the agents in the model understand the model they are in and optimize in light of the real constraints they face. In that case, they need the same tools you do.
The argument is also off in other contexts. As I’ve argued elsewhere, it is actually an argument against all sophisticated, mathematical theories of individual choice. If mathematical modeling is a tool necessary for economists to reason their way through, say, rational addiction theory, how on earth do they expect “even highly intelligent non-economists” to discover that becoming a junky is their best shot at happiness? You might want to avoid the question by saying that they are able to do this “subconsciously”, but even that is a testable claim (hint: it’s empirically false). Also – if we really did solve such problems easily in our subconscious – wouldn’t these models seem intuitive and in line with our gut feelings? Put differently, it seems odd to develop a tool to overcome human cognitive frailty and then claim that this tool used at “full power”
Argument 4: Enables critical evaluation
Woodford’s essay again:
Often, reasoning from formal models makes it easier to see how strong are the assumptions required for an argument to be valid, and how different one’s conclusions may be depending on modest changes in specific assumptions. And whether or not any given practitioner of economic modeling is inclined to honestly assess the fragility of his conclusions, the use of a model to justify those conclusions makes it easy for others to see what assumptions have been relied upon, and hence to challenge them.
Here I’ll just refer back to the discussion on argument 2. Once again, I agree with Woodford in principle – but would argue that this is descriptively inaccurate in terms of how academic discussions in economics are actually conducted. When I examined the justification of specific assumptions in rational addiction theories (in “Taking absurd theories seriously”), I found that this was a lackadaisical affair: The most weird and unbelievable stuff was left uninterpreted in the model, other weird assumptions were justified by telling whatever anecdote would support them, or by giving loose evidence that would support a different but related assumption. The models, to sum up, were
poorly interpreted, empirically unfalsifiable, and based on wildly inaccurate assumptions selectively justified by ad-hoc stories.
Time to wrap up.
I realize that I’ve sounded critical of Woodford, but hope the “in principle”/“in practice” distinction is clear. My problem with his essay is that it’s framed as a defense of current economic practice. Evaluated in that regard, the argument fails: He actually defends a form of “best practice” in modeling that is neither widespread nor widely recognized as such in economics today (as far as I can tell).

Monday, November 21, 2011

Hamermesh: Macro is rubbish, but the academic market is working and selecting for usefulness. It selected for Gary Becker, didn’t it?

Labor economist Daniel Hamermesh is interviewed at the Browser and asked for five books showing that economics is fun. At one point, the following exchange occurs:
With the economics profession, in the aftermath of the financial crisis, being somewhat in disrepute…
Stop! Stop, stop, stop. The economics profession is not in disrepute. Macroeconomics is in disrepute. The micro stuff that people like myself and most of us do has contributed tremendously and continues to contribute. Our thoughts have had enormous influence. It just happens that macroeconomics, firstly, has been done terribly and, secondly, in terms of academic macroeconomics, these guys are absolutely useless, most of them. Ask your brother-in-law. I’m sure he thinks, as do 90% of us, that most of what the macro guys do in academia is just worthless rubbish. Worthless, useless, uninteresting rubbish, catering to a very few people in their own little cliques.
I’m not sure most people in the outside world would make a distinction between macro and microeconomists.
I know. It’s up to us to educate them. I got this line from a friend in architecture the other day. He said exactly the same thing. I went through the same litany, trying to disabuse him of this notion. It’s like pushing a stone up a giant hill. It’s not going to get me very far, I agree. But nonetheless it is the case that most of us, and most of what we do, remains tremendously useful, tremendously relevant, and also fun!
He also names names. While Sargent, for instance, is a good guy, 
Not all the macro guys who won the Nobel are good. The guy who won it in 2004 was one of the main culprits in the nonsense, Ed Prescott.
At the same time, Hamermesh is an optimist, in that he believes the academic market selects, over time, for usefulness:
I do believe in markets. We had some useless macro guys here who just left, thank God, and we’re now looking for replacements. I do think the failure of these people is conditioning how we search for a replacement. I’m quite sure the journals in academe are going to reflect this too. People are interested in being useful in this profession. It doesn’t mean the people who were the bad guys from the last 20 years in macro are going to be doing anything different. They’re incapable of doing anything different! But markets do work and the dead and useless get shoved aside by the young and useful. I’m a tremendous optimist. I do believe markets work and that people run to fill niches. There’s an obvious niche here, and you’re already starting to see it being filled.
I think this is interesting: Macroeconomics the last few decades has basically been run by guys that Hamermesh charges with doing mostly “worthless rubbish. Worthless, useless, uninteresting rubbish, catering to a very few people in their own little cliques.” These are the guys who’ve dominated top journals and top economics departments and who have won Nobel Prizes for their macro work. Yet he still sees the academic market as a well-functioning mechanism selecting for usefulness.
There’s a second thing I find interesting about this: One of the “top economists” that Hamermesh mentions to show that micro (as opposed to macro) is useful is the same economist that I trot out to show how absurd nonsense is accepted in economics.
Together with Hans Melberg, I recently argued that there is a “market failure” in (at least a large part of) the academic market for economists: If you have a model that is theoretically consistent and in line with “standard theory” (rational choice, equilibrium, etc.), and if the model matches some stylized facts and can reproduce regularities in market data – then you’re more or less given free rein to make causal claims and say that the “theory” can support strong and important claims regarding the welfare effects of actual real world policies.
In this work, Melberg and I looked at the kind of claims made in the literature on rational addiction theory. We argue that this is a literature featuring claims so obviously unsupported (we call them “absurd”), that their acceptance into good journals is a clear indication of a “broken market.”
The funny thing is: The whole literature on rational addiction theory – which we see as a clear example of how the “academic market” in economics allows policy-useless nonsense claims to rise to the top - is based on the work of Gary Becker. This same economist is one of two economists that Hamermesh mentions as examples of good economics that, presumably, show how well-functioning the market is.
There have been some great economists since then, in the last 30 to 40 years. [..] There’s Gary Becker, who in my view is the top economist of the last 50 years. His notions of family bargaining and how families behave are terribly important, and affect how, in the end, we all think.
To me, the rise of Gary Becker and his theories does not illustrate the usefulness (in the sense of credible, well-supported insights into the real world and the effects of actual policy choices on real people) of his work, but more that it “opened new markets” for economists: He showed them ways to build theories of the kind they were familiar with within a host of new areas (education, family, crime, addiction), in ways that seamlessly fit the criteria of “rational choice” and standard micro-economic practice. He provided innovative, creative, exciting strategies for economic imperialism. His work allows you to interpret all sorts of things using the universal acid of economic theory. Some of it may be truly useful and correct, some of it is very clearly not, yet all of it has been very successful within the discipline. To me, that makes it unlikely that “usefulness” was the selection criterion involved.

UPDATE: Came across a nice blogpost by Daniel Lemire who also doubts that science is successfully self-regulatory, though he argues from a different angle (he asks: how well does peer-review filter out bad research? To what extent do citation levels reflect quality?). I could also add this post which discusses a recent result that rebuttals don't affect how often a paper is cited, nor how well it is regarded.

Friday, November 18, 2011

The invisible hand is everywhere… you just need to notice every little detail!

In the book “Darwin’s Dangerous Idea” the philosopher Daniel C. Dennett called Darwin’s theory of evolution a form of “universal acid”:
it eats through just about every traditional concept, and leaves in its wake a revolutionized world-view, with most of the old landmarks still recognizable, but transformed in fundamental ways.
The same thing is true of economic choice theories: The logic of equilibria based on rational actors making marginal adjustments started as a description of the market, ate its way into “above market”-institutions such as regulatory agencies (regulatory capture) and government (public choice), as well as “non-market”-institutions such as families and – in a nice little satirical Ouroboros-move – the discipline of economics:
The way I would describe Academic Choice theory is that it is “the sociology of economists, without romance.” Is this right? What an insightful comment. As you say, Academic Choice theory is a descriptive project, with no normative orientation. We apply a critical approach in order to counterbalance pervasive earlier notions of economists as scientific heroes struggling against popular ignorance in order to serve the common good.
What would you identify as the central insights of Academic Choice theory? The theory begins by identifying three principal ways in which economists try to maximize their utility. First, they receive salaries from universities, which can be increased if their course enrollment increases. Course enrollment is primarily driven by students with future careers in business and the financial sector, so an economist has an incentive to propound theories that CEOs and financial institutions find attractive. Even if adoption of these theories leads to substantial public costs, these costs will not be shouldered by the economist personally. Second, by developing such theories an economist can open the door to future wealth as a lobbyist or consultant. Third, the support of economists is critical to creating and maintaining special privileges for the financial services industry and for top corporate officers. By threatening to withdraw this support, economists can engage in rent-seeking. I call this last practice academic entrepreneurship.
The post is worth reading in full. Remember – no matter what objection someone raises, you can always turn the firehose of economic acid on them and reduce them to yet another selfishly motivated rational agent. And when the economic worldview has eaten its way through everything and laid bare the underlying logic and structure of the world in all its stark, brutal detail? Then, perhaps we’ll all meet up in the “Invisible Hand Society” of Robert Anton Wilson’s novel “Schrödinger’s Cat Trilogy”:
Dr. Rauss Elysium had summed up the entire science of economics in four propositions, to wit:
1. Find out who profits from it.
This was merely a restatement of the old Latin proverb-a favorite of Lenin's-cui bono?
2. Groups never meet together except to conspire against other groups.
This was a generalization of Adam Smith's more limited proposition "Men of the same profession never meet together except to defraud the general public." Dr. Rauss Elysium had realized that it applies not just to merchants, but to groups of all sorts, including the governmental sector.
3. Every system evolves and expands until it encroaches upon other systems.
This was just a simplification of most of the discoveries of ecology and General Systems Theory.
4. It all returns to equilibrium, eventually.
This was based on a broad Evolutionary Perspective and was the basic faith of the Invisible Hand mystique. Dr. Rauss Elysium had merely recognized that the Invisible Hand, first noted by Adam Smith, operates everywhere. The Invisible Hand, Dr. Rauss Elysium claimed, does not merely function in a free market, as Smith had thought, but continues to control everything no matter how many conspiracies, in or out of government, attempt to frustrate it. Indeed, by including Propositions 2 and 3 inside the perspective of this Proposition 4, it was obvious-at least to him-that conspiracy, government interference, monopoly, and all other attempts to frustrate the Invisible Hand were themselves part of the intricate, complex working of the Invisible Hand itself.
He was an economic Taoist.
The Invisible Hand-ers were bitterly hated by the orthodox old Libertarians. The old Libertarians claimed that the Invisible Hand-ers had carried Adam Smith to the point of self-contradiction.
The Invisible Hand people, of course, denied that.
"We're not telling you not to oppose the government," Dr. Rauss Elysium always told them. "That's your genetic and evolutionary function; just as it's the government's function to oppose you."
"But," the Libertarians would protest, "if you don't join us, the government will evolve and expand indefinitely."
"Not so," Dr. Rauss Elysium would say, with supreme Faith. "It will only evolve and expand until it creates sufficient opposition. Your coalition is that sufficient opposition at this time and place. If it were not sufficient, there would be more of you."
Some Invisible Hand-ers, of course, eventually quit and returned to orthodox Libertarianism.
They said that, no matter how hard they looked, they couldn't see the Invisible Hand.
"You're not looking hard enough," Dr. Rauss Elysium told them. "You've got to notice every little detail."
Sometimes, he would point out, ironically, that many had abandoned Libertarianism to become socialists or other kinds of Statists because they couldn't see the Invisible Hand even in the Free Market of the nineteenth century.
All they could see, he said, were the conspiracies of the big capitalists to prevent free competition and to maintain their monopolies. They, the fools, had believed government intervention would stop this.
Government intervention was, to Dr. Rauss Elysium, just like the conspiracies of the corporations, merely another aspect of the Invisible Hand.
"It all coheres wonderfully," he never tired of repeating. "Just notice all the details."

Thursday, November 17, 2011

Rational models are NOT more constrained than irrational ones

One more comment on the Raquel Fernández conversation at the Straddler that I mentioned in a previous post. I thought she had several good points that she formulated well, but there was one comment that I’ve often seen economists make and that I think is wrong or at least misleading:
There is a beauty to the models in and of themselves. You assume, for example, that people are rational. I don’t think any really good economist thinks that people are perfectly rational, but, on the other hand, if you want to model people as not rational, all of a sudden it’s not clear what choice you should make. There are a million and one ways to be non-rational; there’s only one way to be rational within the confines of a model. Rationality means one thing: you’re maximizing your welfare subject to constraints. Now, if you say people don’t always maximize, and they’re beset by this and that, then all of a sudden you can have a million models. And that’s a little bit unsatisfactory too.
Yes, “there’s only one way to be rational within the confines of a model,” but so what? Within the confines of a specific model of irrationality there would be only one way to be irrational too. And yes, “there are a million and one ways to be non-rational,” but there are also a million and one ways to specify a utility function – and this gives us a million and one ways to act that are all rational.
There are actually three points (at least) here:
  1. Strictly speaking, “utility maximization” is empirically empty. We start with a preference relation that summarizes observed choice between pairs of consumption bundles, and which is “rational” in the sense of being complete, reflexive and transitive. We can then represent this with an ordinal utility function constructed to capture the choices described by this preference relation. Any preference relation – that is, any systematic set of choices fulfilling these conditions – can be represented by such a utility function. If you always did what hurt you the most, your choices could still be captured by such a utility function – and saying that you “maximize utility” means nothing more than saying that you “choose the one option within the choice set that would be selected no matter what other alternative in the choice set you set it against in a pairwise choice”. This makes no claims concerning why this option is selected – it may be because it benefits you, is best for the world (but not for you selfishly), is the most brightly colored, was most recently advertised or whatever.
  2. Economists then commonly make the “great leap of welfare economics” by assuming that all choices actually made aim to maximize the welfare of the choosing agent. “Utility” now measures “welfare” in some way. To be “rational” means to be “smart and selfish” – and arguments about whether or not A or B or C “is rational” quickly become a tiresome exercise in discussing psychological egoism. “Yes, he gave away his money to the beggar – but this gave him a warm glow which was the most welfare-maximizing item he could purchase for that sum of money.”
  3. People are obviously not 100% selfish in terms of money and goods for themselves, so such utility functions need to be defined over non-observable goods as well as observable goods. This means that the “one” model of fully rational choice is actually a million models, due to the many degrees of freedom within the model. You do what maximizes your “utility,” but that can be anything. Take Gary Becker’s work: In his work, your utility function can be defined over “capital stocks” that refer to addictive capital, imagination capital, human capital etc. Looking at the different variants of rational addiction theory that have been developed within Becker’s framework, economists are happy to assume different numbers of such stocks and different cross-derivatives between stocks and other goods. Out, as a result, comes “rational consumption” that is rising, falling, cyclical, chaotic, or involves cold-turkey quitting.
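To make the first point concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the three options, the "welfare" numbers, and the function names. It builds an ordinal utility function from pairwise choices by counting how many options each option is weakly chosen over, which works for any complete and transitive preference relation on a finite set. It then shows that someone who always picks what hurts them most is a "utility maximizer" in exactly the same formal sense as someone who picks what benefits them most.

```python
# Hypothetical welfare payoffs to the chooser (illustrative numbers only).
WELFARE = {"apple": 3, "orange": 2, "stub toe": -5}
OPTIONS = list(WELFARE)


def utility_from_choices(options, prefers):
    """Ordinal utility: u(x) = number of options that x is weakly chosen over.

    For a complete, transitive relation on a finite set, this count
    represents the relation: x is weakly preferred to y iff u(x) >= u(y).
    """
    return {x: sum(1 for y in options if prefers(x, y)) for x in options}


def chosen(options, prefers):
    """The option that is selected against every alternative in a pairwise choice."""
    return next(x for x in options if all(prefers(x, y) for y in options))


def selfish(x, y):
    # Picks whatever gives the chooser more welfare.
    return WELFARE[x] >= WELFARE[y]


def self_harming(x, y):
    # Picks whatever gives the chooser less welfare.
    return WELFARE[x] <= WELFARE[y]


for name, prefers in [("selfish", selfish), ("self-harming", self_harming)]:
    u = utility_from_choices(OPTIONS, prefers)
    best = max(OPTIONS, key=u.get)
    # "Maximizing utility" just recovers the pairwise winner, whatever the motive.
    assert best == chosen(OPTIONS, prefers)
    print(name, u, "->", best)
```

Both agents pass the same test. The utility numbers say nothing about why an option wins, only that it does.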
I really don’t understand why (some) economists think “utility maximization” is such a “hard constraint” on theorizing in light of this. If you think it is – let me know one consumption pattern or human behavior that if it were observed repeatedly would be inconsistent with “rationality” or “utility maximization”. If it’s a hard constraint this should be simple – there should be long lists of possible, observable behaviors that could not occur if people were actually rational and maximized utility in some substantive sense and that would not occur if the hypothesis of “rational selfish maximization” was correct.
In actuality, I think you’ll find that there is no behavior weird enough to make rational choice economists doubt there being some rational utility-maximizing explanation out there provided we look long and hard enough. As Stigler and Becker wrote in their De Gustibus Non Est Disputandum article:
On our view, one searches, often long and frustratingly, for the subtle forms that prices and income take in explaining differences among men and periods. […] we are proposing the hypothesis that widespread and/or persistent human behavior can be explained by a generalized calculus of utility-maximizing behavior, without introducing the qualification “tastes remaining the same".
Put differently: If you see human action that doesn't look rational - doubt not! Rationality works in mysterious ways... Believe, think, pray and tinker with your model - and if you are wise enough all will be revealed and the Invisible Hand will publish your paper in a top-ranked journal...

Wednesday, November 16, 2011

The “canonical model” and the importance of default models

A Google+ post from Al Roth alerted me to an interesting conversation at The Straddler with Raquel Fernández. She has some great ways of making some nice points, such as her statement that a:

problem [in economics] is that methodology frequently trumps the question. Once you have a way to model things, much of the research becomes very self-referential; that is, it becomes more about how the model behaves and less about the question. I think the question really matters, but a lot of economists believe the methodology matters more than the question. And this leads to very elaborate models of very many things without much of an outside reality check.

Another interesting impression I get from her talk, which is not explicit and may be a misreading on my part, flows from this point and concerns the importance of default models: The “default” or “canonical” model of economics describes a perfect-competition well-functioning market. We know that this is an incorrect description of the world, but it frequently shapes our “gut reaction,” and because we understand it fully we feel more comfortable arguing about this model than about the world. As a result, economists who give policy advice are treated more leniently by fellow economists if their advice is consistent with the standard model.

[…] the people who go and give advice usually end up with a very bad rap in economics. I am amazed at how much hatred—and I will say hatred—Paul Krugman evokes from some fellow economists. But one of the reasons for this is that he says things for which there is not “scientific” support and which go against what these people believe is "good" economics. Now, people on the other side also say things for which they do not have "scientific" support incidentally, and they don’t get the same amount of hatred.[…]

Take the argument we’ve been having recently. Should we be trying to increase aggregate demand or should we be reducing the deficit? […] Well, a model is not going to give you the answer because it depends on whether you write the model in such a way that getting aggregate demand up is a good idea, or whether you write it in such a way that people are really worried about future deficits that are coming around the road and they won’t invest because they know that taxes are going to be high in the future.

These things are rigged into the model from the beginning when it’s such an unsettled question, and we don’t really have an exact science-based way to answer it, which is why we argue about history. […]

Economists don’t have to be free-marketers. But that ends up being the canonical model, and then everything else ends up being a departure from the canonical model, which you’ve then got to explain why you’re departing from. It’s not because the canonical model is right, it’s because you ask most economists and they’ll say, “At least we understand how that economy works very, very well. So you want to tell me that we’re going to move away from this one and move to something else, that’s fine, but you have to explain why you’re putting in all of these imperfections.” So it’s not that you can’t write those things down, it’s just that there is less of a standard way of doing it.

Friday, July 1, 2011

Have a nice summer!

Had some posts I needed to get out on the Norwegian-language blog of a colleague regarding housing prices in Norway. As a result I never got around to finishing up my thoughts on Gigerenzer’s criticism of behavioral economics (I really thought the paper was a good, enjoyable and interesting read, but my comments on this blog have so far been on things I didn’t like so much... use the search bar underneath the Twitter box on the right to find the posts), nor some things I kind of want to think through regarding popularized economics, nor some ideas I would like to explore regarding the strong demand for assurance and its implications in politics and debate and academia.

Hope some of you readers will still come by once in a while next season.

Have a nice summer!

Saturday, June 18, 2011

Should parenting and drugs affect economic theory?

Would economic theory be different if economists were parents when initially taught it? Freakonomics-blogger Justin Wolfers says yes, because that’s what the experience of becoming a father has taught him. Overcoming Bias-blogger Robin Hanson says no, and says becoming a father is like having a mystical experience on drugs: It doesn’t inform us about the real structure of the world.

I’m wondering if the difference between these two may reduce to one thing: How “religiously” they’ve believed in the “ultimate truth” of the economic model of rational decision making. If we take Wolfers at his own word, he always saw it as
the basic idea informing economics—that people are purposeful, analytic decision makers. And this idea just seemed entirely natural to me. I had always believed in the analytic self; I was rational, calculating, and tried to make smart decisions. Of course real people don’t use math, but I figured that we’re still weighing costs and benefits just as our models say. Or at least that was my understanding of the world.
In other words, he sounds like the kind of guy who believed all behavior could be explained by economic theory as optimal, even if some of it would require complex choice models that assume people take subtle feedback effects, strategic “he knows that I know that he knows that I know X” issues and complicated delayed consequences of present actions into account in an optimal “rational” manner. After having a kid, this no longer seems to describe his own experience of himself:
My feelings toward my daughter Matilda aren’t easily expressed in analytic terms. I struggle to express it, just as I struggle to understand it.

There’s something new and strange about all this. Today, I feel the powerful force of biology. It’s visceral; it’s real; it’s hormonal, and it’s not in our economic models. I’m helpless in the face of feelings that overwhelm me. Yes, I know that a twenty-something reader will cleverly point out that I just need to count kids as a good which yields utility, or perhaps we need to add a state variable to the utility function as in rational addiction models. But that’s not the point. I’m surprised by how little of this I’ve consciously chosen. While the economic framework accurately describes how I choose an apple over an orange, it has had surprisingly little to say about what has been the most important choice in my life.
Hanson, in contrast, seems to see economic models as attempts to capture some of the regularities in human behavior:
First, econ makes sense of a complex social world by leaving important things out, on purpose – that is the point of models, to be simple enough to understand. More important, econ models almost never say anything about consciousness or emotional mood – they don’t at all assume people choose via a cold calculating mindset, or even that they choose consciously. As long as choices (approximately) fit certain consistency axioms, then some utility function captures them. So how could discovering emotional and unconscious choices possibly challenge such models.
Given Hanson’s view of economic theory, there is no need to redefine everything after having a kid. People will still tend to buy less as the price rises, avoid risk, and so on. It surprises me somewhat, though, that Hanson doesn’t see that there are a number of economists with a more fundamentalist belief in the neoclassical model. I’ve met several, and I bet I’ve met fewer economists in general than Hanson. I’ll admit this is pure speculation, but I’ve wondered if some economists feel threatened by behavior that deviates from the “rational choice” model they hold. They don’t say “Well, this is a simplified model, sure there’ll be deviations, but we’re capturing some regularities and that’s what we’re aiming for. Explaining something is better than not explaining anything and we’ll never be able to explain everything.” Instead, they try to twist their brains into coming up with ad-hoc assumptions that would reveal these deviations to be full, sophisticated optimization. At times, this means that increasingly stupid and shortsighted behavior is explained as increasingly subtle and complex optimization. Maybe it’s a fear of letting non-rational explanations get a foot in the door, maybe it’s because the "welfare effects" often tacked on at the end of choice models would no longer be "valid" (not that they are valid today, but if you truly believe all choices always maximize the ultimate good of importance to the acting agents, then I guess they might seem valid to you).

Hanson concludes that
Having an emotional parenting experience is as irrelevant to the value of neoclassical econ as having a mystical drug experience is to the validity of basic physics. Your subconscious might claim otherwise, but really, you don’t have to believe it.
I’m not sure. If a person sees economic theory as Hanson describes it, then I agree with him. But if a person thinks his way of seeing the world is the only one that is valid and possible (in the sense of consistent with past experiences), then having a child or taking a high dose of psilocybin in a controlled setting may both be ways of learning otherwise.

Tuesday, June 14, 2011

Tim Harford’s "Adapt" - a book review

Tim Harford’s new book “Adapt” is a wonderful read but difficult to pigeonhole. There’s interesting stuff about the Iraq war, the financial crisis, development aid, randomized experiments, skunk works, the design of safety systems, whistleblowers, overconfidence and groupthink phenomena, not to mention a truly wonderful explanation of how a carbon tax would work and why environmentalists should embrace it. Even when he covers topics that have been ably covered by others elsewhere, he does so in a light and enjoyable way and manages to dig up new, cool anecdotes. It’s partly a popularization of science, partly a business book, at times it almost moves into self-help territory, and at times it seems to present new and interesting perspectives on big topics (such as financial regulation). Still - though it may sound sprawling, I didn’t really find it so when reading it. At one level it reads like a series of interesting pieces of journalism on different topics, but on another, there’s an underlying thread of ideas that gradually emerges.

The way I read it, the main point of the book is that the problems we face are too complex for us to understand and figure out the solutions to from behind a desk. Evidence ranging from the failed predictions of experts to the extinction records of firms and the failure of high-level military strategies supports this. There are a number of reasons why this is so, ranging from the difficulty of capturing and aggregating information at a sufficiently finely grained level to psychological tendencies to trust in our (frequently false) beliefs and suppress possible evidence that they’re wrong. Still - we do solve problems - but this happens through an evolutionary process: We make lots of bets - each one of which is small enough that failure is acceptable - and the winning bets identify “good enough for now” solutions that we replicate and grow. The best examples of this (as a method for human problem solving) are market economies and science. Lots of entrepreneurs who hope to strike it big, some of whom combine the factors of production in a way that better creates value than others - thus making a profit (to put the point in an Austrian way). Lots of scientists stating hypotheses, some of whom are able to better predict the outcomes of experimental and quasi-experimental data than others - thus having their hypotheses strengthened (on a related note - I recently made the argument together with a colleague that this process is broken in economics - see more on that here).

Harford also discusses a host of implications that follow from this - the need to “decouple” systems so that failure in a single component (such as a bank in the financial system) doesn’t bring down the entire system, the need to finance both “highly certain” research ideas and “long shot” ideas, avoiding groupthink by including people likely to disagree (thus creating room for disagreement in the group) and demanding disagreement, and using prizes to elicit experiments. He also discusses how such evolutionary processes can be exploited better in policy, which is where he gets to his beautiful explanation of how a carbon tax works by tilting the playing field (there are two chapters here that should be reworked into a pamphlet and handed out in schools and parliaments).

That’s my brief take on the underlying “storyline” - but it doesn’t do justice to the book, which reads like a string of intellectual firecrackers. The wide range of topics, however, also means that they are necessarily touched on lightly - it’s an appetizer for a lot of ideas more than a fully satisfying meal. For instance, if success in the market (and elsewhere) consists of being the “lucky” winner who made a bet that - ahead of time - had no stronger claim to being right than others, how does this factor into our views on entitlements and redistributive taxation? If prizes (such as the prize for a space-going flight) actually elicit large-scale, expensive experiments that we only need to pay for when they succeed - does this mean that they exploit some irrational overconfidence in the competitors? If people were sensible and unbiased in their estimate of success, would they spend more than their expected reward? And if not - wouldn’t that mean the prize money would have to be sufficient to finance all the experiments - in which case it doesn’t save us any money? To what extent does the desire for control play into the desire for top down planning and control? (Imagine you were the prime minister - would you feel comfortable if loads of schools were allowed to try out whatever they felt like, risking the chance that some of them would beat kids or indoctrinate them in some way that blew up in the media?) In an online interview by Cory Doctorow, Harford states that

I also looked at the banking crisis and big industrial accidents such as Deepwater Horizon, and found that there were almost always people who could have blown the whistle — and sometimes did — but the message didn’t get through. So those communication lines need to be opened up and kept open.

Yes - but no…. After all, if there’s a host of signals coming up, most of them wrong, it might well be rational to have some filtering mechanism in place that also weeds out many of the correct signals in order to avoid being swamped and misguided by wrong ones.
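A rough back-of-the-envelope version of that argument, sketched in Python. All the numbers are made up for illustration (1,000 warnings, 2% of them correct, a filter that passes 60% of the correct ones and 5% of the wrong ones), and the function name is my own; nothing here comes from the book.

```python
def filter_outcome(n_signals, share_correct, pass_if_correct, pass_if_wrong):
    """Expected counts after a filter that passes correct and wrong signals
    at different (hypothetical) rates."""
    correct = n_signals * share_correct
    wrong = n_signals - correct
    passed_correct = correct * pass_if_correct
    passed_wrong = wrong * pass_if_wrong
    passed = passed_correct + passed_wrong
    return {
        "passed": passed,                         # signals reaching the decision maker
        "precision": passed_correct / passed,     # share of those that are correct
        "correct_lost": correct - passed_correct, # true warnings that get weeded out
    }


# Hypothetical scenario: no filter at all versus an imperfect filter.
no_filter = filter_outcome(1000, 0.02, 1.0, 1.0)
with_filter = filter_outcome(1000, 0.02, 0.6, 0.05)

print(no_filter)
print(with_filter)
```

Under these assumed numbers the decision maker without a filter faces 1,000 warnings of which 2% are right. With the filter, roughly 61 get through and about one in five is right, at the cost of losing around 8 of the 20 correct warnings. Whether that trade is worth it depends on how costly a missed warning is, which is exactly the part the sketch leaves out.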

While we’re on the topic of whistleblowers - I also wish he’d said a word or two about some of the biggest transparency cases of recent years. On the one hand, there is the whistleblower-friendly candidate Obama, who changed his tune once he got into office. This could have served as a way of discussing how hard it is to actually have people looking over your shoulder and criticizing you, even when you think (or at least see the arguments for) allowing them to do so. Also, I would have been interested in Tim Harford’s views on Wikileaks, which in some ways is the biggest attempt to increase transparency in modern times - as well as his views on the conflicts it generated (a book championing the cause of whistleblowers should also at least mention the awful treatment of claimed whistleblower Bradley Manning). Given the many stories from the Iraq war and the US military about the dangers of a strictly enforced official party line/strategy/story, the potential value in Wikileaks shining a light on what is actually going on seems pretty clear. Or at least worthy of discussion.

Given the number of topics covered in the book there are obviously quibbles you may have with certain facts that are wrong or the way some of them are treated, but that’s to be expected. More importantly, there were parts of the argument that I felt were missing - especially concerning how difficult it is to learn from experience. As documented in for instance Robyn Dawes’ excellent “House of Cards” (in the context of psychology and the misguided beliefs of treatment professionals), there are clear cases where statistical decision rules consistently outperform human judgments, without this being enough to convince the experts who could gain from them. Or consider this post on the backfire effect from the You Are Not So Smart blog, which discusses experiments suggesting that people can react to evidence that they were wrong by becoming even more convinced of their original view. The way politicians respond to arguments about the surprisingly weak effect of drug decriminalization on usage levels is another example. In terms of Harford’s argument - adaptation and evolution not only require us to test things and find out what works - they also require us to accept what works and implement it more broadly. Taking into account the number of things covered, he probably covered this too - but if he wants his ideas to be taken up in policy circles I think (that is, my gut-feeling is) that this would be perhaps the hardest part.

Finally, the book could also have been tempered by applying its thesis to the thesis itself: Has “planned evolution” been attempted, and did it actually work? As the book argues, the devil is often in the details and seemingly good ideas based on solid case stories may turn out to work quite differently in practice from what we expected.