Dictatorships create a lot of data
Structural Zero Issue 01
Part One of Our Three Part “Gathering the Data” Series
As a statistician, I spend most days trying to wrangle and analyze massive data sets. The specific data I deal with is documentation of human rights violations. My job is to make sense of data that I know is incomplete and answer questions about the past using statistical analysis and scientific reasoning.
But where does this data come from? How was it generated, and how do human rights advocates and researchers access it and secure it?
To kick off our new newsletter Structural Zero, I’ll be writing a series of articles I’m calling Gathering the Data, which will explore how this important data is created, collected, and securely stored before it is analyzed. It will shed light on how tyrannical governments work in ways that might surprise you and will serve as a foundation before we launch into posts about how to analyze data.
Different Types of Data
When my colleagues and I work in post-conflict countries like Guatemala, El Salvador, and Chad, we are typically (though not exclusively) dealing with three types of data:
- Interviews
Our local partners, often affiliated with nonprofits, truth commissions, or religious institutions, conduct extensive interviews of survivors and witnesses. The interview subject may share harrowing details about their experiences losing their homes, being detained, or being tortured. They may also describe losing friends, family members, and neighbors. Interviewers will take detailed notes. The goal is to capture details about what happened, as well as when and where it happened. If it’s possible, they’ll note the perpetrator and the identity of the victim(s). Interviews are the most important data we analyze, and the lynchpin in our understanding of these major human rights events.
- Physical evidence
When we look for physical evidence, we are often looking for objects in the real world that can corroborate our understanding of violence that occurred in the past. This often includes searching for graves of people who were killed during a conflict, including graves that were purposefully hidden, as we saw in Mexico and Colombia. Physical evidence could also be prisons, weapons and ammunition stores, instruments of torture, or any other object that sheds light on what we think we know about the past.
- Written documentation
Documentation is paper or digital reports that can include autopsy reports, photographs, morgue data, evidence gathered by local churches, diary entries from victims, and much more. Importantly, it also includes orders given by government officials to those beneath them as well as the reports written by those executing those orders, sent back up the chain of command.
All of these data types, even the physical evidence, must be rendered in a digital format for us to begin our analysis.
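To make "a digital format" a little more concrete, here is a minimal sketch of what a single digitized record might look like. This is an illustration only, not HRDAG's actual schema: the class name ViolationRecord, its field names, and the example values are all hypothetical. The fields mirror what interviewers try to capture (what happened, when, where, the perpetrator, and the victims), and most are optional because the data is often incomplete.

```python
from dataclasses import dataclass, field, fields
from datetime import date
from typing import List, Optional


@dataclass
class ViolationRecord:
    """Hypothetical digitized record; not an actual HRDAG schema."""
    source_type: str                   # "interview", "physical_evidence", or "document"
    what_happened: str                 # free-text description of the event
    when: Optional[date] = None        # unknown dates stay None
    where: Optional[str] = None        # unknown locations stay None
    perpetrator: Optional[str] = None  # noted only if it can be identified
    victims: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the names of fields that are empty or unknown."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Made-up example: an interview that establishes what and where, but not when or who.
record = ViolationRecord(
    source_type="interview",
    what_happened="detention",
    where="example town",
)
print(record.missing_fields())
```

Tracking which fields are missing, rather than guessing at them, keeps the gaps in the record visible to whoever analyzes it later.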
The Bureaucracy of Dictatorship that Solves the Principal-Agent Problem
The idea of written documentation might come as a surprise to those who haven’t worked in this area before. After all, one might wonder, wouldn’t a dictator who is committing human rights violations go to great lengths to hide their actions?
It turns out they don’t. Dictatorships require a lot of paperwork.
When a dictator delegates the power to commit violence to police and paramilitary groups, he needs a way to make sure that his agents do the violence he wants, rather than doing violence that benefits only the agents themselves. The agents may want to use their power, at least occasionally, to commit reprisals against enemies or to extort resources from vulnerable people and businesses. There’s a good reason that dictatorships tend to become corrupt over time: the police and paramilitary groups charged with maintaining social control through violence—without rule-based accountability—can also use their violence to create organized criminal networks.
The main mechanism to ensure that subordinates follow orders is bureaucracy, which is all about creating a written record of the relationship between bosses and workers. The organizational leadership creates policies or strategies, then the next staff layer creates plans to implement the policies. The middle-management layer below that writes orders to particular units, and then to individual agents, to execute the plans.
When the agents complete their orders, they write reports documenting what they did. The reports then circle back up the chain as proof that the orders fulfilled the plan.
German sociologist Max Weber, who studied social structures in the early 1900s, wrote extensively about the nature of bureaucracy. He argued that bureaucracy was both an efficient mechanism for maintaining institutional control and could pose a danger to individual liberties (see Footnote 1). His work laid the philosophical foundation for social scientists over the next 100 years who developed what is known as the principal-agent problem. This is the idea that the person who owns a project (the principal) delegates actions to someone who is supposed to act on their behalf (an agent), but the agent’s actions may not align with the goals and intentions of the principal. Instead, agents may act in their own self-interest or may simply get off-course due to poor communication through the chain of command.
Bureaucracies use paperwork to try to rein in the principal-agent problem. Written commands attempt to provide clear instructions to agents, while detailed reports offer insight into what those agents were doing. The more paperwork created, the tighter the control of the commanders further up the chain. The more people involved in executing a principal’s vision, the more bureaucratic process and paperwork are necessary to ensure things don’t go off-course.
Dictators use these same mechanisms of bureaucratic control to violate human rights on a mass scale. Orders given to national police and military units ensure the dictator’s intentions are understood down the chain, and then the officers send back reports describing those violations in detail.
Warehouses of Paperwork
I saw a chilling example of this type of documentation when I visited Guatemala in 2005.
In 1996, as part of the peace accords in Guatemala, the National Police of Guatemala was disbanded and replaced, and all their files were put into storage. Disbanding the police was necessary because the National Police had been deeply implicated in disappearances and killings of civilians over many decades. I arrived and was taken to one of the largest data troves I had dealt with: four massive, two-story warehouses filled with paperwork created by the National Police of Guatemala, detailing about 100 years of their activities.
The documents were in cardboard boxes, filing cabinets, and sacks of paper, all stacked one on top of another. After years in storage the paperwork was moldering, fetid, and infested with rats. I used mapping tools and statistical analysis to estimate the total number of documents held in the four giant warehouses before we began the process of sampling, scanning, and analyzing documents. In total, I calculated that there were about 80 million pieces of paper.
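For readers curious about how a count like this can be estimated without touching every page, here is a minimal sketch of one standard approach: count the pages in a random sample of storage units, then scale up. This is not a description of the procedure I actually used in the Archive. The function name estimate_total, the box counts, and the sample values below are all made up for illustration, and the sketch assumes a simple random sample of equally likely units.

```python
import math
import statistics


def estimate_total(sample_counts, population_units):
    """Expansion estimator for a simple random sample of storage units.

    sample_counts: pages counted in each sampled unit (hypothetical data).
    population_units: total number of units (boxes, sacks, drawers) in storage.
    Returns (estimated total pages, standard error of that estimate).
    """
    n = len(sample_counts)
    if n < 2 or n > population_units:
        raise ValueError("need 2 <= sample size <= population_units")
    mean = statistics.mean(sample_counts)
    var = statistics.variance(sample_counts)  # sample variance (divides by n - 1)
    fpc = 1 - n / population_units            # finite population correction
    total = population_units * mean
    se_total = population_units * math.sqrt(fpc * var / n)
    return total, se_total


# Made-up numbers: pages counted in 5 sampled boxes out of 1,000 boxes.
pages_per_box = [1800, 2200, 2000, 2100, 1900]
total, se = estimate_total(pages_per_box, population_units=1000)
print(f"estimated pages: {total:,.0f} (standard error about {se:,.0f})")
```

The standard error matters as much as the total: it says how far the estimate could plausibly be from the true count given how much the sampled units varied.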
This massive trove of data was written documentary evidence. As I later testified in the trial of former head of the National Police Col. Héctor Bol de la Cruz, we uncovered in those papers evidence that the members of the National Police of Guatemala who committed human rights violations against the Guatemalan people followed the typical bureaucratic processes of the time. They were not rogue agents but conduits for the intentions of their superiors. We also found the reports written by police officers, explaining their activities to their commanding officers.
Wrapped in dry administrative language and militaristic jargon, these documents showed that the officers perpetrating the disappearance were following routine procedures, receiving commands and submitting reports. Other documents in the Archive showed communications between the police and military, under de facto President of Guatemala Gen. Efraín Ríos Montt. Ríos Montt was convicted of committing acts of genocide, and I contributed testimony in his case based on analysis of truth commission testimonies and other documentation efforts. [Updated 7/3/25: These two paragraphs have been updated to clarify which cases I testified in and which documents were involved in those cases.]
These piles of papers were a form of bureaucratic control, a grim effort to counter the principal-agent problem.

Today, dictators don’t keep as many physical records. Instead of warehouses full of paper, modern dictators turn to various forms of digital communication to instruct their agents and keep tabs on things. This makes it potentially easier for future data scientists to analyze their actions, but digital data can also be fragile and can be rendered inaccessible to researchers in many ways.
But it is not the case, as one might assume, that human rights violations are well-hidden and poorly documented. In many cases, there are massive amounts of data being created in real time that connect criminal activities to the government agents that ordered them, including those at the very top of government.
This is part one of Gathering the Data, a series exploring how Human Rights Data Analysis Group scientists and our partners are able to collect and store information about human rights violations. Subscribe today to get our next installment.
And as always, thank you for supporting HRDAG’s work to bring scientific rigor to human rights work.
-PB
This article was written by Patrick Ball, Director of Research for the Human Rights Data Analysis Group (HRDAG), a nonprofit organization using scientific data analysis to shed light on human rights violations. You can also follow us on Bluesky, Mastodon, and LinkedIn.
Structural Zero is a free monthly newsletter that helps explore what scientific and mathematical concepts teach us about the past and the present. Appropriate for scientists as well as anyone who is curious about how statistics can help us understand the world, Structural Zero is written by 5 data scientists and edited by Rainey Reitman.
Footnote 1: See, for example: Weber, Max. Economy and Society: An Outline of Interpretive Sociology (1978).