Multiple Systems Estimation: Does it Really Work?

The Kosovo Memory Book

Q15. Are there other MSE models one might use with human rights data?

Q16. Is it possible to use MSE to model non-lethal human rights violations?

Q17. I am concerned about using MSE with my data, because the datasets were gathered by opposing organizations. Victims who were reported to an NGO were very unlikely to be reported to state sources, but also very likely to be reported to religious organizations. Won’t that cause the overlaps between the NGO list and the state list to be artificially low, and the overlaps between the NGO list and the church list to be artificially high? Does this mean we can’t use the data for MSE?

Q18. Can you show me proof that MSE actually works?

Q15. Are there other MSE models one might use with human rights data?

Yes. In one paper, our team used a fully Bayesian approach that, while analogous to log-linear modeling in some ways, relies on graphical models to calculate estimates.

In addition, other estimators for the number of uncounted cases exist. Zwierzchowski and Tabeau (2010) employed an “imperfect matching” approach to MSE with log-linear modeling, using pre-war census records from Bosnia to validate cases with conflicting information across sources. And, in a 2003 investigation of the Srebrenica massacre, Brunborg et al. used a dual-system estimator after finding that multiple data systems overlapped one another almost perfectly. In this particular case, a dual-system estimate was appropriate because the error introduced by failing to model dependencies was minimal compared to the overall size of the estimate. In most cases, however (as we discussed above), dual-system estimates are inappropriate because dependencies between data systems materially affect the final estimate.
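For reference, the textbook two-list (Lincoln–Petersen) form of a dual-system estimator can be written as below. This is the standard formula, not necessarily the exact computation used by Brunborg et al.; the symbols n_1, n_2, and m are notation introduced here for illustration.

```latex
\hat{N} = \frac{n_1 \, n_2}{m}
% n_1, n_2 : number of records on list 1 and on list 2
% m        : number of records matched across both lists
```

When two lists overlap almost perfectly, m is close to both n_1 and n_2, so the estimate is close to the size of either list and the assumption of independence has little room to distort it.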

In her doctoral dissertation, HRDAG consultant Amelia Hoover Green (2011) used Chao’s (1989) method of moments, as implemented for the R statistical software by Baillargeon and Rivest (2007). The essential difference between Chao’s method and the log-linear method more frequently used by HRDAG lies in what, exactly, is parameterized. Chao’s method models capture heterogeneity (differences among individuals in the probability of being recorded), but not list dependence. Using it assumes that most list dependence results from capture heterogeneity and will therefore be accounted for by a model that considers only capture heterogeneity. In addition, Chao’s method is best suited to sparse data, in which most cases have been captured in only one or two systems. Finally, it is important to note that Chao’s method yields a lower bound on the size of the population being estimated.
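To make the lower-bound idea concrete, here is a minimal sketch of the lower-bound estimator commonly attributed to Chao, which uses only the number of cases seen on exactly one list (f1) and on exactly two lists (f2). It is an illustration under stated assumptions, not the computation performed in the dissertation or by the R implementation; the function name and the counts in the example are hypothetical.

```python
from collections import Counter


def chao_lower_bound(capture_counts):
    """Lower-bound population estimate from per-case capture counts.

    capture_counts: for each documented case, the number of lists
    (1 or more) on which that case appears.
    """
    freq = Counter(capture_counts)
    observed = sum(freq.values())   # cases documented at least once
    f1 = freq.get(1, 0)             # cases on exactly one list
    f2 = freq.get(2, 0)             # cases on exactly two lists
    if f2 == 0:
        raise ValueError("estimator is undefined when no case is on exactly two lists")
    # Estimated undocumented cases: f1^2 / (2 * f2)
    return observed + f1 ** 2 / (2 * f2)


# Hypothetical data: 300 cases on one list, 100 on two lists, 20 on three.
example_counts = [1] * 300 + [2] * 100 + [3] * 20
estimate = chao_lower_bound(example_counts)
print(estimate)
```

Note that the estimate depends almost entirely on the cases captured once or twice, which is why the approach suits sparse data.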

Q16. Is it possible to use MSE to model non-lethal human rights violations?

Maybe. Because non-lethal violations may occur more than once to the same victim, it is extremely difficult to determine overlap between lists accurately. In addition, non-lethal violations are typically reported at much lower and more variable rates than are lethal violations. In a best-case scenario, it may be possible to determine the prevalence of a non-lethal human rights violation (i.e., the number of victims, regardless of the number of times each victim has suffered the violation); however, it would be very difficult to determine incidence (i.e., the approximate number of violations per person).

Q17. I am concerned about using MSE with my data, because the datasets were gathered by opposing organizations. Victims who were reported to an NGO were very unlikely to be reported to state sources, but also very likely to be reported to religious organizations. Won’t that cause the overlaps between the NGO list and the state list to be artificially low, and the overlaps between the NGO list and the church list to be artificially high? Does this mean we can’t use the data for MSE?

If you had access to only two datasets, these types of correlations would indeed be very problematic, because the lists you describe are dependent on one another. If we performed a two-system estimate with any two of these systems, we would be unlikely to recover the correct value. However, this does not mean that we cannot use these data for MSE, because there are more than two systems available. Recall that, with three or more systems, we do not have to assume that our lists are independent; we can model dependencies between lists, as discussed above.
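The point can be illustrated with a small sketch using exact expected counts rather than sampled data. It assumes three hypothetical lists (NGO, state, church) with a negative NGO–state dependence and a positive NGO–church dependence; all parameter values are invented for illustration. The three-list estimate uses the closed form that corresponds to a log-linear model with every pairwise interaction but no three-way interaction. Real analyses involve noisy counts and model selection, so treat this as a sketch of the idea only.

```python
import math
from itertools import product

# Capture patterns are tuples (ngo, state, church), 1 = recorded on that list.


def expected_cells(intercept, main, pairwise):
    """Expected count for each capture pattern under a log-linear model
    with main effects and pairwise interactions (no three-way term)."""
    cells = {}
    for pattern in product((0, 1), repeat=3):
        log_count = intercept
        log_count += sum(m * on for m, on in zip(main, pattern))
        log_count += sum(w * pattern[i] * pattern[j]
                         for (i, j), w in pairwise.items())
        cells[pattern] = math.exp(log_count)
    return cells


def two_list_estimate(observed, i, j):
    """Two-list estimate that assumes lists i and j are independent."""
    n_i = sum(c for p, c in observed.items() if p[i])
    n_j = sum(c for p, c in observed.items() if p[j])
    both = sum(c for p, c in observed.items() if p[i] and p[j])
    return n_i * n_j / both


def three_list_estimate(observed):
    """Estimate allowing every pairwise dependence between the three lists."""
    unseen = (observed[1, 1, 1] * observed[1, 0, 0]
              * observed[0, 1, 0] * observed[0, 0, 1]) / (
        observed[1, 1, 0] * observed[1, 0, 1] * observed[0, 1, 1])
    return sum(observed.values()) + unseen


# Hypothetical parameters: NGO-state overlap suppressed, NGO-church inflated.
cells = expected_cells(
    intercept=math.log(500),              # 500 cases appear on no list
    main=(-0.5, -0.5, -0.5),
    pairwise={(0, 1): -1.5, (0, 2): 1.5},
)
truth = sum(cells.values())
observed = {p: c for p, c in cells.items() if p != (0, 0, 0)}

ngo_state = two_list_estimate(observed, 0, 1)
ngo_church = two_list_estimate(observed, 0, 2)
three_list = three_list_estimate(observed)

print(f"true total:          {truth:.0f}")
print(f"NGO + state only:    {ngo_state:.0f}")
print(f"NGO + church only:   {ngo_church:.0f}")
print(f"three-list estimate: {three_list:.0f}")
```

In this setup the NGO and state pair overestimates the total (their overlap is artificially low), the NGO and church pair underestimates it (their overlap is artificially high), and the three-list estimate recovers the true total because the pairwise dependencies are modeled rather than assumed away.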

Q18. Can you show me proof that MSE actually works?

MSE was initially developed in the context of wildlife population management, where it is known as capture-recapture, mark-recapture, or multiple-recapture analysis. It has a long intellectual pedigree, beginning with reports on fish migration by Petersen (Petersen, C. G. J., “The Yearly Immigration of Young Plaice Into the Limfjord from the German Sea,” Report of the Danish Biological Station (1895), 6:5–84, 1896). Sekar and Deming (1949) extended capture-recapture to human populations. The U.S. Census Bureau uses MSE to estimate the U.S. population more accurately; it is also used frequently in epidemiology to determine the completeness of disease registers, and it is the subject of a large statistical literature.

In addition to its extensive validation in the statistical and demographic literature, MSE has frequently been used to determine the size of rare and stigmatized human populations. Populations measured by MSE include lesbians in central Pennsylvania [1], intravenous drug users [2], and those infected with HIV [3]. Given its strong pedigree and its frequent use with rare and stigmatized human populations, we are convinced that MSE models, correctly chosen and applied, produce significantly more accurate estimates than “competing” methods such as convenience data or surveys.

In addition, our earliest analyses have now been verified by later data collection. For example, in a 2002 analysis, Patrick Ball and co-authors estimated that approximately 10,000 Kosovar Albanian civilians were killed during March–June 1999 (10,356, 95% confidence interval 9,002 – 12,122). They reached this estimate by applying the techniques described above to only 4,400 documented deaths. That is, well over half the deaths that occurred in Kosovo remained undocumented by any source three years after the close of the conflict.
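The share of deaths left undocumented follows directly from the figures above. The short calculation below uses only the numbers quoted in this answer (4,400 documented deaths, the point estimate, and the two ends of the confidence interval); the variable names are illustrative.

```python
documented = 4_400

# Estimated totals quoted in the text for March-June 1999.
estimates = {
    "point estimate": 10_356,
    "interval lower end": 9_002,
    "interval upper end": 12_122,
}

# Share of estimated deaths that no source had documented.
undocumented_share = {
    label: 1 - documented / total for label, total in estimates.items()
}

for label, share in undocumented_share.items():
    print(f"{label}: {share:.1%} undocumented")
```

Even at the lower end of the interval, documented deaths account for less than half of the estimated total.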

This analysis was controversial at the time: critics doubted that seemingly exhaustive efforts to record civilian deaths could have been so incomplete. At the International Criminal Tribunal for the Former Yugoslavia, defense attorneys for Serbian government figures accused of crimes in Kosovo claimed that the estimates could not be correct and were not scientific. However, the bulk of the statistical record shows that these estimates were correct. A survey-based estimate by Spiegel and Salama, published in 2000, suggested that 12,000 individuals (5,500–18,300) were killed in Kosovo between January 1998 and September 1999, the vast majority of them between March and June 1999.

Similarly, after well over a decade of record collection, in 2011 the Humanitarian Law Centre Kosovo released the Kosovo Memory Book (KMB), an attempt to document all violent deaths and disappearances in Kosovo in 1998–1999. The number of deaths reported in the KMB, like the survey analysis by Spiegel and Salama, closely accords with our estimates. Not only does the KMB report include approximately the same number of deaths as our 2002 estimate, but the distribution of the KMB’s documented deaths over time and space also closely follows the distribution in our estimate. KMB data record about 14,000 deaths and disappearances between January 1998 and December 1999, with the majority of these occurring in the time period for which we conducted estimates (March–June 1999). (N.B. Kosovo Memory Book records include some military personnel, which our estimate did not.)

Given these results, as well as the vast statistical literature related to this method, we believe that MSE is well-validated, and that it is as appropriate for mortality data as for any type of data (fish, rabbits, drug users, diabetes sufferers, and on and on). However, there are some forms of human rights data, such as data on non-lethal violence, that may or may not support MSE analyses. Our team continues to research other applications and refinements of MSE, both by implementing new models and by complex simulation exercises.

[1] Aaron, D.J. et al. 2003. “Estimating the lesbian population: a capture-recapture approach.” Journal of Epidemiology and Community Health 57(3): 207.

[2] Mastro, T.D. et al. 1994. “Estimating the number of HIV-infected injection drug users in Bangkok: a capture-recapture method.” American Journal of Public Health 84(7): 1094.

[3] Abeni, D.A., Brancato, G., and Perucci, C.A. 1994. “Capture-recapture to estimate the size of the population with human immunodeficiency virus type 1 infection.” Epidemiology 5: 410–414.

[Creative Commons BY-NC-SA, excluding image]