Datasets available for research

Over the last few years, we’ve tried to make the data organized in our projects publicly accessible. We have encouraged our partners to publish the data at the completion of the project. We continue to believe it is important to offer access to the data used in our projects for the sake of transparency as well as to encourage further research and analysis. However, we are increasingly concerned about how raw data are used. Data limited to what can be observed are what statisticians call a convenience sample, and a convenience sample is subject to selection bias.

We’re keeping these datasets available for researchers who want to use them for simulation or estimation projects that use statistical models to correct for biases. We strongly caution researchers not to use the raw data as measures of violence. (Note: several of the Kosovo datasets are estimates and can be used as measures; read the data dictionaries carefully.) Please contact us at info @ if you have any questions or need assistance.
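
As a rough illustration of why raw counts can mislead, here is a minimal sketch in Python. Every number in it is hypothetical and chosen only for the example; none is drawn from our datasets. It assumes two time periods in which the share of events that get observed and recorded differs, and it shows how the recorded counts can point in the opposite direction from the true counts.

```python
# Hypothetical true event counts per period (illustration only, not real data).
true_events = {"period_1": 100, "period_2": 200}

# Assumed probability that an event is observed and recorded in each period.
# In a convenience sample this probability is unknown and can vary widely.
capture_prob = {"period_1": 0.8, "period_2": 0.3}

# Expected number of recorded events: what the raw data would contain.
observed = {p: round(true_events[p] * capture_prob[p]) for p in true_events}

# Change from period 1 to period 2, in the truth and in the raw data.
true_trend = true_events["period_2"] - true_events["period_1"]
observed_trend = observed["period_2"] - observed["period_1"]

print("recorded counts:", observed)
print("true change:", true_trend, "| change in raw data:", observed_trend)
```

In this made-up case the true number of events doubles while the recorded number falls, because a smaller share of events was observed in the second period. Statistical models that estimate what was not observed are intended to correct for this kind of distortion.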

You are welcome to use these datasets for your research. If you publish with them, however, we ask that you include the following text:

“These are convenience sample data, and as such they are not a statistically representative sample of events in the referenced conflict. These data do not support conclusions about patterns, trends, or other substantive comparisons (such as over time, space, ethnicity, age, etc.).”

We would also very much appreciate a citation.

Statistical data

Our work has been used by truth commissions, international criminal tribunals, and non-governmental human rights organizations. We have worked with partners on projects on five continents.