The Two Lancet Surveys of Iraq Do Not Validate Each Other1

Les Roberts and Gilbert Burnham, authors of the second Lancet mortality survey of Iraq (L2), have asserted on a number of occasions that the mortality results of L2 are very similar to those of the first Lancet survey of Iraq (L1).  This claimed equivalence between the two surveys is used to argue against the notion that there might have been substantial sampling bias in L2.  Essentially, they claim that L1 and L2 used different sampling methods and yet arrived at similar numbers, concluding that L2 is not afflicted with any great sampling bias.2  In L2 they state:

“Application of the mortality rates reported here to the period of the 2004 survey gives an estimate of 112,000 (69,000-155,000) excess deaths in Iraq in that period.  Thus, the data presented here validates our 2004 study, which conservatively estimated an excess mortality of nearly 100,000 as of September 2004.”

In a subsequent letter to Nature, L2 authors Les Roberts and Gilbert Burnham state3:

“The first survey was done selecting random starting points with a Global Positioning System unit. The second used the random street-selection process, which is being criticized as biased. It rarely occurs in the field that two sampling methods are used allowing for comparison, and here the results are nearly identical.”

Summary: We argue that this claimed equivalence between L1 and L2 does not withstand scrutiny.  The authors’ comparison conflates violent deaths with non-violent deaths.  All suggestions of bias in the L2 methodology of which we are aware pertain to violent deaths only. Therefore, any meaningful comparison addressing potential bias in L2 must focus on, rather than obscure, the issue of violent deaths, which account for more than 90% of all excess deaths in the L2 study.  Below, we provide a rough comparison of violent deaths in L1 and L2.  It suggests upward bias of L2 relative to L1 by a factor of approximately 2.  In other words, far from being “nearly identical”, the L1 and L2 data made available to some researchers indicate that L2 estimates twice as many violent excess deaths for the time period of the L1 study as L1 did.  The L1 and L2 authors should release more complete data on both studies to enable a thorough analysis of the uncertainty that inevitably surrounds any such estimate.  Released L1 and L2 data would need to distinguish violent from non-violent deaths as well as provide the dates of deaths.  At the moment, we cannot rule out a substantial bias factor.

We first point out that the confidence intervals around the estimates in both L1 (especially) and L2 are so wide that any argument that the two Lancet studies are identical is unpromising from the start.  With confidence intervals this wide, based on the total number of excess deaths, it is of course not possible to reject the null hypothesis that L1=L2 at a reasonable significance level. Yet when a hypothesis test fails to reject a hypothesis, this does not mean that the hypothesis is true. In this case, the error of the second kind, namely failing to reject the null hypothesis when it is false, would have enormous consequences. A mortality estimate that is far from the truth would discredit the study despite all the effort that went into collecting the data. The key point is that taking uncertainty into account also makes it impossible to rule out very large differences between the mortality results of L1 and L2. Here we argue that L1 and L2 do display large differences with respect to violent excess deaths, and that the studies should therefore not be treated as equivalent in any statistical analysis of violent excess deaths.
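The point about wide intervals can be illustrated with a rough sketch. It uses only the published totals for all excess deaths: 98,000 (8,000 to 194,000) for L1 and 112,000 (69,000 to 155,000) for L2 applied to the L1 period. The normal approximation, the treatment of the two surveys as independent, and all function and variable names are our assumptions for illustration; this is not a proper derivation of the uncertainty.

```python
import math

Z95 = 1.96  # two-sided 95% critical value of the standard normal


def se_from_ci(low, high):
    """Back out a standard error from a 95% interval, assuming normality."""
    return (high - low) / (2 * Z95)


# Published central estimates and 95% intervals for total excess deaths.
l1_estimate, l1_ci = 98_000, (8_000, 194_000)
l2_estimate, l2_ci = 112_000, (69_000, 155_000)

se_l1 = se_from_ci(*l1_ci)
se_l2 = se_from_ci(*l2_ci)

# Difference of the two estimates, treating the surveys as independent.
diff = l2_estimate - l1_estimate
se_diff = math.hypot(se_l1, se_l2)
z_stat = diff / se_diff
ci_diff = (diff - Z95 * se_diff, diff + Z95 * se_diff)

print(f"z statistic for L2 - L1: {z_stat:.2f}")
print(f"approx. 95% interval for L2 - L1: {ci_diff[0]:,.0f} to {ci_diff[1]:,.0f}")
```

Under these assumptions the test does not reject L1=L2, yet the interval for the difference extends beyond 100,000 deaths in either direction, so failing to reject says very little about whether the two surveys agree.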

The main problem with the comparison highlighted by the L2 authors is that it is of all excess deaths, not just violent deaths.  All suggestions of possible bias in L2 that we know of, sampling or non-sampling, pertain to violent deaths.  The available facts simply do not support a claim that L1 and L2 suggest very similar numbers of violent deaths.  By persistently conflating non-violent deaths with violent deaths the L2 authors have obscured this essential point.4

Unfortunately, the authors of L1 have not released the data that would be necessary to allow researchers to make an estimate of violent deaths, let alone put a confidence interval around such an estimate.  However, they did make public a central estimate of 57,600 violent deaths for all of Iraq excluding Anbar governorate.5  We hope that the authors of L1 will place the L1 dataset in the public domain so that the figure of 57,600 can be checked and a confidence interval placed around it.  At a minimum the authors should provide their own confidence interval.6

The L2 paper does not provide the information required to calculate violent deaths in L2 outside of Anbar governorate during the L1 sampling period.  However, the L2 authors have released a dataset to some researchers7 that is sufficiently detailed to enable a crude calculation.8 According to the L2 data:

A.  There was 1 violent death during the pre-invasion period in the L2 dataset outside Anbar governorate.  The number of violent deaths in L2 outside Anbar during the post-invasion period of L1 is between 51 and 57.9  This means that in-sample excess deaths are between 49.8 and 55.8.10

B.  We take the sample size to be 11,428 outside of Anbar governorate.  This is the figure in the L2 data for the sample size outside Anbar in the middle of the L2 sampling period.11 

C.  The population estimates by governorate in Table 1 of L2 indicate that the population of Iraq excluding Anbar governorate was 25,810,808 in the middle of 2004.   

We can therefore estimate a range of violent excess deaths in Iraq excluding Anbar for L2 during the L1 sampling period of 112,500 – 126,000.12  These figures are 2.0 to 2.2 times the publicized central estimate of 57,600 violent excess deaths in L1 outside Anbar.  Access to the L2 data would enable us to improve upon this estimate and place confidence intervals around it, and we encourage other researchers with the data to do so.
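The arithmetic behind this range can be sketched as follows. The inputs come from points A to C and footnotes 10 and 12; the function and variable names are ours, and rounding the in-sample excess to one decimal place mirrors the figures 49.8 and 55.8 used in footnote 12.

```python
# Relative lengths of the post- and pre-invasion periods (footnote 10).
POST_PERIOD = 17.8
PRE_PERIOD = 14.8


def in_sample_excess(post_deaths, pre_deaths):
    """Post-invasion deaths minus pre-invasion deaths rescaled to the longer period."""
    return round(post_deaths - pre_deaths * POST_PERIOD / PRE_PERIOD, 1)


def scale_to_population(excess, sample_size, population):
    """Scale an in-sample death count up to the whole population."""
    return excess / sample_size * population


SAMPLE_EXCL_ANBAR = 11_428          # point B
POPULATION_EXCL_ANBAR = 25_810_808  # point C
L1_VIOLENT_EXCL_ANBAR = 57_600      # publicized L1 central estimate

# Point A: 1 pre-invasion violent death, 51 to 57 post-invasion violent deaths.
low = scale_to_population(in_sample_excess(51, 1), SAMPLE_EXCL_ANBAR, POPULATION_EXCL_ANBAR)
high = scale_to_population(in_sample_excess(57, 1), SAMPLE_EXCL_ANBAR, POPULATION_EXCL_ANBAR)

print(f"L2 violent excess deaths excluding Anbar: {low:,.0f} to {high:,.0f}")
print(f"Ratio to L1: {low / L1_VIOLENT_EXCL_ANBAR:.1f} to {high / L1_VIOLENT_EXCL_ANBAR:.1f}")
```

This is a point calculation only; it says nothing about the sampling uncertainty around either bound.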

Note also that these figures, which are solely for violent deaths, and which exclude Anbar governorate, are equal to or exceed the figure of 112,000 given in L2 for total excess deaths, i.e. violent deaths plus non-violent deaths, for the whole of Iraq, including Anbar, for L2 during the L1 sample period.13  

Calculating as before but now including Anbar data, we estimate between 139,400 and 157,000 violent excess deaths in L2 during the L1 sampling period.14 
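A minimal sketch of the same calculation with Anbar included, using the counts in footnote 14. The names are ours. We also assume that the lower bound of 66 post-invasion violent deaths combines the 51 outside Anbar with the 15 Anbar deaths not dated September 2004, and that the upper bound of 74 combines 57 with all 17; this reading reproduces the in-sample figures of 63.6 and 71.6 given in the footnote.

```python
POST_PERIOD = 17.8  # relative length of the post-invasion period (footnote 10)
PRE_PERIOD = 14.8   # relative length of the pre-invasion period (footnote 10)

SAMPLE_ALL_IRAQ = 12_380          # mid-period in-sample size (footnote 14)
POPULATION_ALL_IRAQ = 27_139_584  # population of Iraq given in L2

PRE_VIOLENT = 1 + 1                # one outside Anbar, one in Anbar
POST_VIOLENT_LOW = 51 + (17 - 2)   # assumed: drop the 2 September Anbar deaths
POST_VIOLENT_HIGH = 57 + 17        # assumed: keep every uncertain death


def in_sample_excess(post_deaths, pre_deaths):
    """Post-invasion deaths minus pre-invasion deaths rescaled to the longer period."""
    return round(post_deaths - pre_deaths * POST_PERIOD / PRE_PERIOD, 1)


def scale_to_population(excess, sample_size, population):
    """Scale an in-sample death count up to the whole population."""
    return excess / sample_size * population


excess_low = in_sample_excess(POST_VIOLENT_LOW, PRE_VIOLENT)
excess_high = in_sample_excess(POST_VIOLENT_HIGH, PRE_VIOLENT)
low = scale_to_population(excess_low, SAMPLE_ALL_IRAQ, POPULATION_ALL_IRAQ)
high = scale_to_population(excess_high, SAMPLE_ALL_IRAQ, POPULATION_ALL_IRAQ)

print(f"In-sample violent excess deaths: {excess_low} to {excess_high}")
print(f"L2 violent excess deaths, all Iraq: {low:,.0f} to {high:,.0f}")
```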

Based on the confidence intervals already given in L1 and L2 it is clear that these confidence intervals will be quite wide and will almost certainly overlap, despite the widely divergent point estimates, so that we cannot rule out the possibility that the violent excess deaths in the two studies are the same.  But at the same time there will almost certainly be substantial probability of L2/L1 ratios exceeding 2.  We also cannot rule out the possibility that L1 itself might have contained upward bias.

Separately we consider non-violent deaths in the two studies and estimate non-violent excess deaths in L2 during the L1 sample period of between negative 46,000 and negative 19,700.15  L1 estimates positive 40,400 non-violent excess deaths outside of Anbar.16 This strong divergence between L1 and L2 in non-violent deaths also contradicts the notion that the results of the two studies are “nearly identical”.  On the contrary, L2 measures far more violent deaths than L1 does.  On the other hand, L1 measures a positive and substantial number of excess non-violent deaths while L2 implausibly measures a substantial number of non-violent deaths avoided due to the war.  The L2 authors net the non-violent deaths supposedly avoided due to the war against the violent deaths caused by the war (both according to the L2 data), with the outcome that the two studies seem superficially similar.
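The non-violent figures can be sketched in the same way from the counts in footnotes 15 and 16. The names are ours. The text does not state which sample size and population were used for the extrapolation; we assume the whole-Iraq values from footnote 14 because, combined with the rounded in-sample figures of -21 and -9, they reproduce the reported range.

```python
POST_PERIOD = 17.8  # relative length of the post-invasion period (footnote 10)
PRE_PERIOD = 14.8   # relative length of the pre-invasion period (footnote 10)

# Assumed scaling inputs: whole-Iraq values from footnote 14.
SAMPLE_ALL_IRAQ = 12_380
POPULATION_ALL_IRAQ = 27_139_584

PRE_NONVIOLENT = 80                            # footnote 15
POST_NONVIOLENT_LOW, POST_NONVIOLENT_HIGH = 75, 87


def in_sample_excess(post_deaths, pre_deaths):
    """Excess deaths in sample, rounded to a whole number as in footnote 15."""
    return round(post_deaths - pre_deaths * POST_PERIOD / PRE_PERIOD)


def scale_to_population(excess, sample_size, population):
    """Scale an in-sample death count up to the whole population."""
    return excess / sample_size * population


excess_low = in_sample_excess(POST_NONVIOLENT_LOW, PRE_NONVIOLENT)
excess_high = in_sample_excess(POST_NONVIOLENT_HIGH, PRE_NONVIOLENT)
low = scale_to_population(excess_low, SAMPLE_ALL_IRAQ, POPULATION_ALL_IRAQ)
high = scale_to_population(excess_high, SAMPLE_ALL_IRAQ, POPULATION_ALL_IRAQ)

# L1 non-violent excess deaths: total central estimate minus violent deaths.
l1_nonviolent = 98_000 - 57_600

print(f"L2 in-sample non-violent excess deaths: {excess_low} to {excess_high}")
print(f"L2 non-violent excess deaths: {low:,.0f} to {high:,.0f}")
print(f"L1 non-violent excess deaths: {l1_nonviolent:,}")
```

The sign of the L2 result does not depend on the scaling assumption: the in-sample excess is negative across the whole range of 75 to 87 post-invasion deaths.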

The claim of Les Roberts and Gilbert Burnham that L1 and L2 have given “nearly identical” results is based upon conflating non-violent deaths with violent ones and, as far as they report,  ignoring confidence intervals.  It is simply irresponsible to rule out any sampling bias despite the fact that the number of violent deaths in the two studies points in the direction of bias. We have presented very plausible arguments suggesting that there might indeed be quite a substantial difference between the mortality estimates of L1 and L2.  Such a difference could then be explained by main-street bias.

1We thank Josh Dougherty of the Iraq Body Count Project for many helpful discussions on this piece.

2At best a comparison of the two studies might suggest that L2 is not biased relative to L1.  It cannot eliminate the possibility that both studies are biased.  Thus, the comparison proposed by the authors logically cannot deliver the promised result.

3See Les Roberts & Gilbert Burnham “Authors defend study that shows high Iraqi death toll” Nature 446, 611 (5 April 2007).
4In fact, in the central estimates used in this comparison by the L2 authors, about 92% of estimated excess deaths in L2 are violent whereas only about 59% of estimated excess deaths in L1 are violent.  This alone already suggests quite a strong divergence between these two estimates.
5Anbar was excluded by the L1 authors from this calculation because it was an extreme outlier in L1.
6The 95% confidence interval provided in L1 for all excess deaths is 8,000 – 194,000 with a central estimate of 98,000.  Just scaling down this interval would give an interval for violent deaths of 4,700 to 114,000.  But this is not a proper way to derive this confidence interval.
7The L2 authors refused to release the data to our team.
8Some estimation based on the L2 dataset is possible although some researchers who possess it argue that it is inadequate for really proper estimation.  See, for example,  Seppo Laaksonen, Retrospective Two-Stage Cluster Sampling for Mortality in Iraq, 2007, unpublished manuscript, University of Helsinki.
9Private communication with David Kane of Harvard’s Institute of Quantitative Social Science.  Kane has access to the released data from L2 and, although he does not have permission to share the underlying data with our group (or anyone else), he is allowed to answer questions which do not involve revealing the details of a specific death or household.  We thank him for taking the time to answer our questions.  There are 5 violent deaths in the L2 dataset that were outside Anbar and occurred in September of 2004.  No precise dates are attached to these deaths so we cannot definitely determine whether or not they were within the L1 sampling period that ended September 20, 2004.  In addition there is one violent death outside Anbar that is listed as occurring in 2004 but does not even have a month attached to it and, therefore, may or may not have occurred before September 20, 2004.
10Here we essentially subtract the one pre-invasion death from the post-invasion deaths to get excess deaths.  This is meant to be a measure of how many deaths occurred during the war that would not have occurred if the war had never started in the first place.  The only nuance is that the 1 death is scaled up by a factor of (17.8)/(14.8) to reflect the fact that the post-invasion period is slightly longer than the pre-invasion period.
11Private communication with David Kane.
12 (49.8/11,428) x 25,810,808 = 112,476 and (55.8/11,428) x 25,810,808 = 126,028.
13It is not clear exactly how the L2 authors arrived at their figure of 112,000, or produced a confidence interval for it.  In the L2 dataset there are 7 deaths in 2004 without months attached, so there is no way to tell if they fall within the L1 time-frame or outside of it.  There are also 13 deaths in September 2004 in L2 which complicate matters as well.  There is simply not enough information to determine whether these 20 deaths should be included or not in the comparison. (Private communication with David Kane).
14 The L2 data records 1 violent death and 5 nonviolent deaths in Anbar governorate during the pre-invasion period and 17 violent and 3 nonviolent deaths in Anbar during the post-invasion period to the end of September 2004 with two of these violent deaths in September of 2004.  The mid-period in-sample size was 12,380 (Private communication with David Kane).  L2 gives the population of Iraq as 27,139,584.  The calculations are (63.6/12,380) x 27,139,584 = 139,425 and (71.6/12,380) x 27,139,584 = 156,962 where pre-war violent deaths of 2x (17.8)/ (14.8) have been subtracted from postwar violent deaths to yield the numbers 63.6 and 71.6. 
15 For nonviolent deaths the L2 dataset contains 80 pre-invasion and 75-87 post-invasion and within the L1 sample period (private communication with David Kane).  Non-violent excess deaths are, therefore, between -21 and -9.
16 We get 40,400 non-violent excess deaths simply by subtracting the 57,600 violent excess deaths from the central estimate of 98,000 total excess deaths in L1.  However, the breakdown for excess deaths given by L1 co-author Richard Garfield gives only 37,600 non-violent excess deaths and, therefore, violent plus non-violent excess deaths only add up to 95,200 according to his figures.  Reducing the estimate of non-violent deaths in L1 to 37,600 would not change our analysis.