Archive for January 20th, 2008

Pro Bono Statistics: 5 problems with the science of the IFHS study

The Pro Bono Statistics blog has some excellent pieces on the new NEJM Iraq Family Health Survey estimate of violent mortality. PBS raises several issues. First, (s)he finds a correlation of 0.94 between governorate (province) population size and sample size, which apparently contradicts the published description of the sampling procedure.

PBS also finds fault with the way the IFHS dealt with missing data, by extrapolating from Iraq Body Count data for two governorates.

While the more detailed postings described above are important to read, I’ll reproduce here a summary of five issues with IFHS raised by PBS. Evidently, PBS is in the process of writing separate postings on each issue:

5 problems with the science of the IFHS study

Reviewing the IFHS study, I found 5 problems with the science of the study. I believe that taken together (but particularly the first three points, regarding the crucial role extrapolation plays in arriving at the estimates in the study, and regarding the ratio of under-reporting) those problems should be seen as grave. At the very least, they should be seen as putting the findings of the IFHS on equal or inferior footing to those of Burnham et al., rather than as being on superior footing due to the nominal large size of the sample in the IFHS.

I now give a brief abstract of the five problems. As I write a fuller description of each, I will add a link to it from the list here. Unless explicitly stated otherwise, death rates and counts below refer to violent deaths as defined by the IFHS authors.

1. Missing clusters and extrapolation using IBC numbers. The IFHS surveyors did not visit all of the clusters in their sample. Those areas that were judged to be dangerous went unsurveyed. A minority of those gaps (in Nineveh and Wasit) seem to be ignored, introducing potential bias. To fill in the rest of the gaps, the IFHS authors extrapolated from other areas. The extrapolation method was to calculate the mortality rate in all of Baghdad as a fixed factor times the mortality rate in some reference area, where the fixed factor was calculated using Iraq Body Count data. The same method (with a different factor) was applied to all of Anbar as well.

It is important to note that these extrapolations determined the total number of deaths estimated for Baghdad and Anbar. Any data that were collected within those areas were in effect ignored in calculating the death estimates. Thus the death counts in Baghdad and Anbar, which together account for over 60% of the deaths in the total estimate, are purely a matter of extrapolation, and depend directly on the IBC extrapolation factors. To illustrate: the extrapolation factor used for Baghdad was 3.08. If instead the factor had been 6, that would have added about 80,000 deaths to the estimate.

The reliability of the IFHS estimate thus depends directly and substantially on certain properties holding for the IBC data (namely coverage rates which are constant across space and across political characteristics). We have no reason to assume that those properties hold, and have some reason to assume they don’t. The IFHS authors have apparently made no attempt to account for those issues - not even so much as to factor the uncertainties into the size of the confidence interval.

In addition, the extrapolation method is the reason for the close resemblance, emphasized by the IFHS authors, between the IBC and IFHS breakdown of deaths by area. This resemblance is an artifact rather than a feature of the raw data and should not be seen as showing coherence between IFHS and IBC.
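The sensitivity illustration in point 1 (a factor of 3.08 versus 6) can be sketched in a few lines. This is only a sketch: it assumes the Baghdad estimate is linear in the extrapolation factor, as the description of the procedure suggests, and it backs the implied baseline out of the three figures quoted above rather than taking it from the IFHS paper. The function and variable names are hypothetical.

```python
# Sketch only: assumes the Baghdad estimate scales linearly with the
# IBC-derived extrapolation factor (estimate = factor * reference quantity).

FACTOR_USED = 3.08      # factor the IFHS used for Baghdad (from the text)
FACTOR_ALT = 6.0        # alternative factor used in the illustration
ADDED_DEATHS = 80_000   # approximate increase quoted in the text


def implied_reference_quantity(factor_used, factor_alt, added_deaths):
    """Deaths contributed per unit of extrapolation factor (assumed linear)."""
    return added_deaths / (factor_alt - factor_used)


def baghdad_estimate(factor, reference_quantity):
    """Baghdad deaths under the linear extrapolation assumption."""
    return factor * reference_quantity


ref = implied_reference_quantity(FACTOR_USED, FACTOR_ALT, ADDED_DEATHS)
baseline = baghdad_estimate(FACTOR_USED, ref)
alternative = baghdad_estimate(FACTOR_ALT, ref)

print(f"implied Baghdad estimate at factor {FACTOR_USED}: {baseline:,.0f}")
print(f"Baghdad estimate at factor {FACTOR_ALT}: {alternative:,.0f}")
print(f"difference: {alternative - baseline:,.0f}")
```

Under that linearity assumption, every unit change in the factor moves the Baghdad figure by the same amount, which is why the choice of factor matters so much for the total.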

2. The extrapolation procedure is problematic even if the IBC extrapolation factors are assumed accurate. The extrapolation basis is the death rate in 3 reference governorates (the paper does not say exactly which, describing them only as the “three provinces that contributed more than 4% each to the total number of deaths reported for the period from March 2003 through June 2006”). Most governorates were sampled with 3 x 18 = 54 clusters each. Nineveh was sampled with 72 clusters. Thus the estimate of deaths for Baghdad and Anbar (which, again, account for over 60% of the total) relies on at most 2 x 54 + 72 = 180 clusters. This number, much smaller than the nominal size of 971, is the dominant factor in determining the uncertainty of the estimate of the total (again, even if the extrapolation factor is assumed to be correct and known precisely).

This is the reason why the length of the confidence interval for the IFHS study (about 120,000 deaths) is not much smaller than that of Burnham et al. (about 370,000) despite the fact that Burnham et al. used only 47 clusters.
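The cluster arithmetic in point 2 can be checked with a rough sketch. It assumes, as a crude simplification that is not taken from either paper, that the length of a confidence interval scales as one over the square root of the number of clusters, all else being equal. It compares absolute interval lengths, as the text does, and ignores differences in design and in the size of the two estimates.

```python
import math

CLUSTERS_PER_GOVERNORATE = 3 * 18   # 54, from the text
NINEVEH_CLUSTERS = 72               # from the text
NOMINAL_CLUSTERS = 971              # nominal IFHS sample size, from the text
BURNHAM_CLUSTERS = 47               # Burnham et al., from the text

# Upper bound on the clusters behind the Baghdad and Anbar extrapolation:
# three reference governorates, at most one of which is Nineveh.
effective_clusters = 2 * CLUSTERS_PER_GOVERNORATE + NINEVEH_CLUSTERS


def ci_length_ratio(clusters_a, clusters_b):
    """CI length of survey a relative to b, assuming length ~ 1/sqrt(clusters)."""
    return math.sqrt(clusters_b / clusters_a)


# Ratio of the CI lengths quoted in the text (about 120,000 vs about 370,000).
quoted_ratio = 120_000 / 370_000

print(f"effective clusters: {effective_clusters}")
print(f"ratio expected from 971 clusters: "
      f"{ci_length_ratio(NOMINAL_CLUSTERS, BURNHAM_CLUSTERS):.2f}")
print(f"ratio expected from 180 clusters: "
      f"{ci_length_ratio(effective_clusters, BURNHAM_CLUSTERS):.2f}")
print(f"ratio of the quoted CI lengths: {quoted_ratio:.2f}")
```

Under this simplification the quoted ratio falls between what 971 clusters and what 180 clusters would predict, which is in line with the point that the nominal sample size overstates the precision of the estimate.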

3. The IFHS does not account properly for uncertainty in under-reporting. In the same way that the IFHS estimate depends on the extrapolation factor, it depends on the assumed under-reporting factor. The justification for the factor used seems slim (I have not made an attempt to follow the reference given). Even accepting their assumptions - i.e., treating the proportion being reported as a normal variable with mean 0.65 and standard deviation of about 0.075, the authors fail to properly account for the uncertainty in the under-reporting in their calculation of the confidence interval of the estimate of the death rate. A proper accounting would increase the size of the confidence interval by about 25%.
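One way to see how uncertainty in the reporting proportion widens the interval in point 3 is a first-order (delta-method) approximation. The sketch below assumes the corrected estimate is the raw estimate divided by the reporting proportion, and that the two sources of error are independent, so their relative variances add. This is not necessarily the calculation behind the 25% figure above, and the sampling coefficient of variation is a hypothetical input, not a number from the paper.

```python
import math

P_MEAN = 0.65    # assumed mean reporting proportion (from the text)
P_SD = 0.075     # assumed standard deviation (from the text)


def ci_inflation(sampling_cv, p_mean=P_MEAN, p_sd=P_SD):
    """Factor by which the CI widens once reporting uncertainty is included.

    Delta-method approximation for estimate = raw / p with independent
    errors: relative variances add.
    """
    cv_p = p_sd / p_mean
    return math.sqrt(sampling_cv ** 2 + cv_p ** 2) / sampling_cv


def sampling_cv_for_inflation(inflation, p_mean=P_MEAN, p_sd=P_SD):
    """Sampling CV at which reporting uncertainty widens the CI by `inflation`."""
    cv_p = p_sd / p_mean
    return cv_p / math.sqrt(inflation ** 2 - 1)


# Hypothetical question: what sampling CV would make the widening 25%?
cv_needed = sampling_cv_for_inflation(1.25)
print(f"relative SD of reporting proportion: {P_SD / P_MEAN:.3f}")
print(f"sampling CV giving a 25% wider CI: {cv_needed:.3f}")
print(f"check: inflation at that CV = {ci_inflation(cv_needed):.2f}")
```

The smaller the sampling error is relative to the uncertainty in the reporting proportion, the larger the widening, so the correction matters most exactly when the survey itself looks precise.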

4. In the IFHS paper, the heading “violent deaths” does not include certain types of injuries. I could not find this mentioned in the paper itself, but table 3 in the supplementary material and a statement by a WHO official indicate that car accidents and “unintentional injuries” are not included in the estimate. This may seem reasonable a priori regarding car accidents, and to a lesser extent regarding unintentional injuries. However, contrary to the statement, those two categories account for more than a third of the deaths by injury in the survey. Also, there has been a dramatic increase in both of those categories as compared to pre-war rates. Under those circumstances, it appears unjustified to exclude these categories from the estimate. Including them in the estimate would increase it by more than 50%.

5. The last point is more of an indication of trouble (either in the methodology of the survey or in the way it is described in the paper) than a specific problem with the estimation. According to the description of the sampling method, 10 households were surveyed in each cluster, and there were (with few exceptions) 3 x 18 = 54 clusters per governorate. In such a set-up there should be no correlation between the number of people surveyed in each governorate and the size of the population in the governorate. However, looking at table 2 in the supplementary material of the paper, there appears to be a strong correlation between those two figures. It seems that the only way such a correlation could show up is the unlikely situation in which the size of the population in a governorate is strongly correlated with the average household size in that governorate.
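The reasoning in point 5 can be illustrated with a small sketch. If every governorate contributes the same number of households (54 clusters x 10 households = 540), the number of people surveyed is just 540 times the average household size, so its Pearson correlation with population is exactly the correlation between household size and population. The governorate figures below are made-up placeholders, not data from the IFHS supplementary tables.

```python
import math

HOUSEHOLDS_PER_GOVERNORATE = 54 * 10   # clusters x households, from the text

# HYPOTHETICAL values for five governorates, for illustration only.
population_millions = [6.5, 2.8, 1.2, 0.9, 0.6]
avg_household_size = [6.1, 6.8, 5.9, 7.2, 6.4]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# With a fixed household count, people surveyed is a constant multiple
# of the average household size.
people_surveyed = [HOUSEHOLDS_PER_GOVERNORATE * h for h in avg_household_size]

r_surveyed = pearson(population_millions, people_surveyed)
r_household = pearson(population_millions, avg_household_size)

print(f"corr(population, people surveyed):   {r_surveyed:.4f}")
print(f"corr(population, avg household size): {r_household:.4f}")
```

Because multiplying by a positive constant leaves a Pearson correlation unchanged, the two printed values coincide: under the described design, a strong correlation with population can only come from household size.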

January 20th, 2008

Tirman: Implications of Iraq mortality studies

John Tirman, the director of MIT’s Center for International Studies, which funded the second Lancet Iraq mortality study, has an op-ed in the Boston Globe this weekend on the implications of the multiple Iraq mortality studies. [One assumes that this is, in part at least, a response to the recent despicable and dishonest right-wing hatchet jobs on the Lancet 2 study in the National Review, and by Globe columnist Jeff Jacoby]:

The murky toll of the Iraq war

By John Tirman
January 19, 2008

ONCE AGAIN, a controversy has erupted over how many people are being killed in Iraq. It’s an important debate, not only for beleaguered Iraqis, but for Americans seeking stability and a timely exit.

Mortality figures alone can tell a compelling story. Add to that other numbers that fill in our understanding even more - such as the scale of the flow of refugees or the women widowed by the war - and we have useful information.

So what are these statistics, and what do they tell us about this nearly five-year-old conflict?

Two kinds of accounts have emerged on the question of mortality. One is a literal count, body by body, from reports in the English language press. Because the media, mostly based in Baghdad, cannot grasp most of the violence, this is an undercount (now about 84,000) even by the reckoning of its authors, the UK-based Iraq Body Count.

The second method is to go out and ask the question in surveys of randomly selected households. This has been done five times under very dangerous conditions. Surveys of this kind during war are relatively new, and, as a result, it’s not surprising that the numbers they’ve produced have varied. But there is significant congruence.

The surveys agree that mortality is much higher than is typically held in political discussions about Iraq. The highest figure, from Opinion Business Research, a private survey firm in London, is 1.2 million through August 2007. It is also the most recent.

About 15 months ago, a survey commissioned by my center at MIT and published in The Lancet found that 601,000 had died by violence through June 2006. This figure has created a firestorm of criticism, but the methods are sound and none of the many peer reviews found anything greatly amiss. (One recalculation brought the death-by-violence total down to 450,000.)

Then last week, Iraq’s Ministry of Health released its large survey, also ending in June 2006, finding that 151,000 had died by violence. But their data tables show an enormous “excess death” total of nearly 400,000 caused by the war, and a peculiarly flat rate of violence throughout the war. Because the interviewers worked for the government, it’s likely that many respondents attributed deaths to nonviolent causes, in order to protect themselves from unwanted attention.

What to make of all this? The first conclusion is that hundreds of thousands of people have died as a result of the war - this seems incontrovertible. It is buttressed by the large number of displaced - some 3 million to 3.5 million caused by the war - and a reported total of 500,000 war widows.

The second conclusion, which helps us understand the violence, is that such a human catastrophe accounts for the insurgency in ways that no other explanation does. Whatever one makes of these insurgents, they appear to be fighting to defend their towns and tribes (apart from Al Qaeda’s foreign operation). Violence begets violence, especially when foreigners are involved.

The third conclusion is that Iraq’s devastation runs deep and wide. A generation of young men is being wiped out. Many of the most educated have left. The poverty of widespread widowhood may become chronic. The healthcare system is in shambles. Neighborhoods and towns ethnically cleansed means long-lasting displacement for tens of thousands. The humanitarian aid challenge is vast, and will last for many years.

How this affects US strategy is complex, of course, but two things stand out. First is that strategies to reduce violence against civilians and to increase economic and physical security are paramount. US leaders seem to grasp this, but their actions (arming Sunni militias, for example) may prove foolhardy.

Second, Iraq’s neighbors must be part of the solution, given the scale of misery. President Bush has never embraced this idea, but it seems more and more obvious as the war drags on. Yet on Bush’s recent trip to the region, Iraq was nearly absent from his agenda.

The lessons from the killing fields and refugees and widows won’t go away. The sooner we fully realize the scale of this catastrophe, the better we may be able to work on reconstructive remedies.

John Tirman is executive director and a principal research scientist at MIT’s Center for International Studies.

January 20th, 2008

