The Two Lancet Surveys of Iraq Do Not Validate Each Other [1]
Les Roberts and Gilbert Burnham, authors of the second Lancet mortality survey of Iraq (L2), have asserted on a number of occasions that the mortality results of L2 are very similar to those of the first Lancet survey of Iraq (L1). This claimed equivalence between the two surveys is used to argue against the notion that there might have been substantial sampling bias in L2. Essentially, they claim that L1 and L2 used different sampling methods and yet arrived at similar numbers, concluding that L2 is not afflicted with any great sampling bias.[2] In L2 they state:
“Application of the mortality rates reported here to the period of the 2004 survey gives an estimate of 112,000 (69,000-155,000) excess deaths in Iraq in that period. Thus, the data presented here validates our 2004 study, which conservatively estimated an excess mortality of nearly 100,000 as of September 2004.”
In a subsequent letter to Nature, L2 authors Les Roberts and Gilbert Burnham state [3]:
“The first survey was done selecting random starting points with a Global Positioning System unit. The second used the random street-selection process, which is being criticized as biased. It rarely occurs in the field that two sampling methods are used allowing for comparison, and here the results are nearly identical.”
Summary: We argue that this claimed equivalence between L1 and L2 does not withstand scrutiny. The authors’ comparison conflates violent deaths with non-violent deaths, yet all suggestions of bias in the L2 methodology of which we are aware pertain to violent deaths only. Any meaningful comparison addressing potential bias in L2 must therefore focus on, rather than obscure, violent deaths, which account for more than 90% of all excess deaths in the L2 study. Below, we provide a rough estimate comparing violent deaths in L1 and L2. It suggests an upward bias of L2 relative to L1 by a factor of approximately 2. In other words, far from being “nearly identical”, the L1 and L2 data made available to some researchers indicate that the L2 study estimates roughly twice as many violent excess deaths for the time period of the L1 study as L1 did. The L1 and L2 authors should release more complete data on both studies to enable a thorough analysis of the uncertainty that inevitably surrounds any such estimate. The released data would need to distinguish violent from non-violent deaths and provide the dates of deaths. At the moment, we cannot rule out a substantial bias factor.
We first point out that the confidence intervals around the estimates in both L1 (especially) and L2 are so wide that they offer little basis for arguing that the two Lancet studies are identical. With confidence intervals this wide, based on the total number of excess deaths, it is of course not possible to reject the null hypothesis that L1=L2 at a reasonable significance level. Yet failing to reject a hypothesis does not mean that the hypothesis is true. In this case, an error of the second kind, namely accepting the null hypothesis when it is false, would have enormous consequences: a mortality estimate that is far from the truth would discredit the study despite all the effort that went into collecting the data. The key point is that taking uncertainty into account also makes it impossible to rule out very large differences between the mortality results of L1 and L2. Here we argue that L1 and L2 do display large differences with respect to violent excess deaths, and that the two studies should therefore not be treated as equivalent in any statistical analysis of violent excess deaths.
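The point about wide intervals can be illustrated with a rough calculation on total excess deaths. The sketch below is not the authors' analysis. It assumes that each published 95% interval is approximately normal and symmetric, that the two surveys are independent, and that the L1 figures are 98,000 (8,000-194,000) as published in the L1 paper (these L1 figures are not quoted in this text). The function and variable names are illustrative only.

```python
import math

Z95 = 1.96  # two-sided 95% quantile of the standard normal

def se_from_ci(lower, upper):
    """Approximate standard error implied by a symmetric 95% interval (assumption)."""
    return (upper - lower) / (2 * Z95)

# L1 figures as published in the L1 paper (not quoted in this text).
l1_est, l1_ci = 98_000, (8_000, 194_000)
# L2 figures for the L1 period, as quoted from L2 above.
l2_est, l2_ci = 112_000, (69_000, 155_000)

# Standard error of the difference, assuming the two surveys are independent.
se_diff = math.hypot(se_from_ci(*l1_ci), se_from_ci(*l2_ci))

diff = l2_est - l1_est
z = diff / se_diff
diff_low = diff - Z95 * se_diff
diff_high = diff + Z95 * se_diff

print(f"z statistic for L2 - L1: {z:.2f}")
print(f"approx. 95% interval for L2 - L1: {diff_low:,.0f} to {diff_high:,.0f}")
```

Under these assumptions the test does not reject equality, but the interval for the difference spans roughly a hundred thousand deaths in either direction, so it is equally unable to rule out very large differences.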
The main problem with the comparison highlighted by the L2 authors is that it is of all excess deaths, not just violent deaths. All suggestions of possible bias in L2 that we know of, sampling or non-sampling, pertain to violent deaths. The available facts simply do not support a claim that L1 and L2 suggest very similar numbers of violent deaths. By persistently conflating non-violent deaths with violent deaths, the L2 authors have obscured this essential point.[4]
Unfortunately, the authors of L1 have not released the data that would be necessary to allow researchers to make an estimate of violent deaths, let alone put a confidence interval around such an estimate. However, they did make public a central estimate of 57,600 violent deaths for all of Iraq excluding Anbar governorate.[5] We hope that the authors of L1 will place the L1 dataset in the public domain so that the figure of 57,600 can be checked and a confidence interval placed around it. At a minimum the authors should provide their own confidence interval.[6]
The L2 paper does not provide the information required to calculate violent deaths in L2 outside of Anbar governorate during the L1 sampling period. However, the L2 authors have released a dataset to some researchers [7] that is sufficiently detailed to enable a crude calculation.[8] According to the L2 data:
A. There was 1 violent death during the pre-invasion period in the L2 dataset outside Anbar governorate. The number of violent deaths in L2 outside Anbar during the post-invasion period of L1 is between 51 and 57.[9] This means that in-sample excess deaths are between 49.8 and 55.8.[10]
B. We take the sample size to be 11,428 outside of Anbar governorate. This is the figure in the L2 data for the sample size outside Anbar in the middle of the L2 sampling period.[11]
C. The population estimates by governorate in Table 1 of L2 indicate that the population of Iraq excluding Anbar governorate was 25,810,808 in the middle of 2004.
We can therefore estimate a range of violent excess deaths in Iraq excluding Anbar for L2 during the L1 sampling period of 112,500 to 126,000.[12] These figures are 2.0 to 2.2 times the publicized central estimate of 57,600 violent excess deaths in L1 outside Anbar. Access to the L2 data would enable us to improve upon this estimate and place confidence intervals around it, and we encourage other researchers with the data to do so.
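The arithmetic behind this range can be reproduced from items A to C. The sketch below assumes a simple proportional scaling of in-sample deaths to the population, with no weighting or other adjustment. That assumption reproduces the quoted figures after rounding, but it is not a statement of the exact procedure used. The names are illustrative only.

```python
# Figures quoted in items A-C of the text; the variable names are illustrative.
IN_SAMPLE_EXCESS = (49.8, 55.8)  # in-sample violent excess deaths (item A)
SAMPLE_SIZE = 11_428             # L2 sample size outside Anbar (item B)
POPULATION = 25_810_808          # Iraq excluding Anbar, mid-2004 (item C)
L1_VIOLENT = 57_600              # L1 central estimate outside Anbar

def extrapolate(in_sample_deaths):
    """Scale in-sample deaths to the population (simple proportion, an assumption)."""
    return in_sample_deaths / SAMPLE_SIZE * POPULATION

low, high = (extrapolate(d) for d in IN_SAMPLE_EXCESS)
ratio_low, ratio_high = low / L1_VIOLENT, high / L1_VIOLENT

print(f"L2 violent excess deaths outside Anbar: {low:,.0f} to {high:,.0f}")
print(f"Ratio to the L1 estimate: {ratio_low:.2f} to {ratio_high:.2f}")
```

Rounded to the nearest hundred, the two extrapolated values match the range given in the text, and the ratios round to 2.0 and 2.2.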
Note also that these figures, which are solely for violent deaths, and which exclude Anbar governorate, are equal to or exceed the figure of 112,000 given in L2 for total excess deaths, i.e. violent deaths plus non-violent deaths, for the whole of Iraq, including Anbar, for L2 during the L1 sample period.[13]
Calculating as before but now including Anbar data, we estimate violent excess deaths in L2 during the L1 sampling period to be between 139,400 and 157,000.[14]
Given the confidence intervals already reported in L1 and L2, it is clear that the confidence intervals around these violent-death estimates will be quite wide and will almost certainly overlap, despite the widely divergent point estimates, so we cannot rule out the possibility that violent excess deaths in the two studies are the same. At the same time, there will almost certainly be a substantial probability that the L2/L1 ratio exceeds 2. We also cannot rule out the possibility that L1 itself contained upward bias.
Separately, we consider non-violent deaths in the two studies and estimate non-violent excess deaths in L2 during the L1 sample period of between negative 46,000 and negative 19,700.[15] L1 estimates positive 40,400 non-violent excess deaths outside of Anbar.[16] This strong divergence between L1 and L2 on non-violent deaths also contradicts the notion that the results of the two studies are “nearly identical”. On the contrary, L2 measures far more violent deaths than L1 does, while L1 measures a positive and substantial number of excess non-violent deaths and L2 implausibly measures a substantial number of non-violent deaths avoided due to the war. The L2 authors subtract the non-violent deaths avoided due to the war from the violent deaths caused by the war (both according to the L2 data), with the outcome that the two studies seem superficially similar.
The claim of Les Roberts and Gilbert Burnham that L1 and L2 have given “nearly identical” results is based upon conflating non-violent deaths with violent ones and, as far as they report, ignoring confidence intervals. It is simply irresponsible to rule out any sampling bias despite the fact that the number of violent deaths in the two studies points in the direction of bias. We have presented very plausible arguments suggesting that there might indeed be quite a substantial difference between the mortality estimates of L1 and L2. Such a difference could then be explained by main-street bias.
[2] At best, a comparison of the two studies might suggest that L2 is not biased relative to L1. It cannot eliminate the possibility that both studies are biased. Thus, the comparison proposed by the authors logically cannot deliver the promised result.