The Names that Don’t Match
Structural Zero 11: A new database names 10,000+ lives lost during one part of the Sri Lankan civil war
The grimmest moments of the Sri Lankan civil war have been etched in the world’s memories: during the final months, Tamils were pushed farther and farther north until hundreds of thousands of people were trapped on a narrow strip of coastline near Mullivaikkal.
The Sri Lankan government declared parts of the area “No Fire Zones,” telling civilians they would be safe there. Instead, those zones were repeatedly shelled. Hospitals and makeshift medical sites were hit. Families dug shallow bunkers into the sand and sheltered under tarps while artillery landed around them. Food and medicine became scarce. People trying to flee were trapped between advancing government forces and the collapsing territory held by the Liberation Tigers of Tamil Eelam (LTTE).
In May 2009, the war ended on that beach. This month marks the 17th anniversary of the killing of thousands of civilians in the “No Fire Zones.”
Seeking the truth about what happened in Sri Lanka is an ongoing effort. Casualty estimates vary by tens of thousands, the Sri Lankan government continues to reject allegations of large-scale killings, and families of the disappeared are still fighting for accountability. HRDAG’s analysis of the data shows that around 500 people were disappeared after surrendering to the army and then being loaded onto buses over the course of three days at the end of the war, but they have still not been acknowledged by the Sri Lankan government. One army insider who witnessed the killings during that time said later: “I saw this shooting of surrenderees take place a number of times. A number of groups, some 50, some 75, some more than 25 would come forward and they would all be killed. That included children, small children, women and old people…This was widespread killing. If journalists were around then the civilians were allowed to surrender, but when the journalists were not around the orders were to kill everyone.”

Many bodies were never recovered and entire families disappeared completely. Indeed, to this day investigators have been denied access to the Mullivaikkal beach to review the physical evidence.
That uncertainty—the gap between what survivors know happened and what can be formally documented—is often the focus of my work at the Human Rights Data Analysis Group. I spend a lot of time trying to answer deceptively simple questions: How many people were killed? Who disappeared? Which records refer to the same person? What evidence can withstand scrutiny years later?
For the Sri Lanka project, those questions turned out to be extraordinarily difficult, and I’ve spent more than 8 years wrestling with them.
A new database to shine a light on Sri Lanka
There is no single authoritative list of the dead and disappeared in Sri Lanka. Instead, there are fragments: NGO reports, survivor testimony, newspaper archives, memorial websites, legal filings, local documentation projects, and handwritten records maintained by diaspora organizations.
So the International Truth and Justice Project (ITJP) and HRDAG began trying to assemble them. We have been working since 2018 to collect the names of the dead and disappeared throughout the entire conflict.
We focused our efforts on the Indian Peace Keeping Force period of the conflict, which led to the creation of the LKDD (Sri Lanka Dead and Disappeared), a public database documenting victims from October 1987 to March 1990 within the Northern and Eastern provinces. Our hope is to expand the database to cover more of the war over time.
Unlike many human rights projects, where partners hand us relatively standardized datasets, we received lists piece by piece, and I spent significant time searching the web for published lists and news stories that name the dead and disappeared. Testimony and witness information are still being gathered from the diaspora so that we can name as many victims as possible and corroborate other reports.
Some groups were reluctant to share records with us at all. Survivors and witnesses still fear retaliation, and many organizations are deeply protective of testimony they spent years collecting. There is also the trauma witnesses suffer in sharing the information about the violence. We have worked with ITJP to help groups use technology to scan and securely store their data on encrypted servers to preserve the stories of the violence.
Record linkage—a key tool for handling overlapping datasets
My primary role is record linkage: determining when multiple records refer to the same person. For Sri Lanka, much of this had to be done manually. I worked across databases standardizing names, dates, villages, and ages so that datasets could be compared.
This is particularly challenging in Sri Lanka because the two national languages, Sinhalese and Tamil, can result in many variations in spelling, especially when names are transcribed into English. Additionally, names may have other variations: one source may identify someone by a militant nickname, another by a partial name, another by a family relationship.
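To give a sense of what standardizing names can involve, here is a minimal Python sketch. It is an illustration only, not the procedure used for LKDD: the function name `name_key`, the substitution rules, and the example spellings are all hypothetical, and a real project would need rules reviewed by people who know the languages.

```python
import re
import unicodedata

# Assumed, non-exhaustive romanization variants to collapse (illustrative only).
VARIANTS = [("th", "t"), ("sh", "s"), ("w", "v"), ("ee", "i"), ("oo", "u")]

def name_key(name: str) -> str:
    """Reduce a romanized name to a rough key for comparing spellings."""
    # Drop accents, then lowercase and keep only letters and spaces.
    text = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    text = re.sub(r"[^a-z ]", " ", text.lower())
    # Collapse the assumed spelling variants.
    for variant, replacement in VARIANTS:
        text = text.replace(variant, replacement)
    # Collapse any run of a repeated character (for example "aa" becomes "a").
    text = re.sub(r"(.)\1+", r"\1", text)
    # Sort the name parts so that word order does not matter.
    return " ".join(sorted(text.split()))

# Two hypothetical spellings of the same name produce the same key.
same = name_key("Sivakumar Rathnam") == name_key("Ratnam, Sivakumaar")
```

A key like this only suggests that two records deserve a closer look. It cannot resolve nicknames, partial names, or records that identify someone only by a family relationship, which still need human judgment.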
Dates are often uncertain because many families fled violence and reported disappearances years later. Locations shifted because civilians were repeatedly displaced.
To ensure the highest quality records, I sort records repeatedly: by first name, last name, village, age, and date. I create representative records for likely matches and keep refining them over multiple passes.
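The idea behind repeated sorting can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not the actual LKDD workflow: the field names, the sample records, and the function `neighbor_counts` are hypothetical, and every field is assumed to be filled in. Each sort order puts similar records next to each other, and pairs that end up adjacent under several different sort orders are good candidates for manual review.

```python
from collections import Counter

def neighbor_counts(records, sort_keys):
    """Count, for each pair of record ids, how many sort orders make them adjacent."""
    counts = Counter()
    for key in sort_keys:
        # sorted() is stable, so ties keep their original relative order.
        ordered = sorted(records, key=lambda r: r[key])
        for a, b in zip(ordered, ordered[1:]):
            counts[tuple(sorted((a["id"], b["id"])))] += 1
    return counts

# Hypothetical, already-standardized records.
records = [
    {"id": 1, "first": "kumar", "last": "ratnam", "village": "village_a", "age": 30},
    {"id": 2, "first": "anton", "last": "selvam", "village": "village_b", "age": 55},
    {"id": 3, "first": "kumar", "last": "rathnam", "village": "village_a", "age": 31},
    {"id": 4, "first": "mary", "last": "joseph", "village": "village_c", "age": 12},
]

counts = neighbor_counts(records, ["first", "last", "village", "age"])
best_pair, passes = counts.most_common(1)[0]
```

In this toy example records 1 and 3 are neighbors in all four passes, which is the kind of pattern that would prompt a reviewer to compare them side by side.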
Over time, patterns begin to emerge.
One thing I’ve learned after years working on human rights data is that authentic records tend to overlap. Families speak to multiple organizations. Witnesses talk to neighbors, journalists, NGOs, churches, and memorial groups. Real violence creates intersections across systems.
Fake data usually does not.
I sometimes describe this as a “smell test.” In Sierra Leone years ago, I flagged a set of records that felt wrong before I even knew where they came from. The ages were suspiciously rounded. The narratives repeated themselves mechanically.
Eventually we discovered the interviewer had fabricated interviews that never occurred and we removed that data from our research. I’ve gotten used to spotting those anomalies because real data has texture.
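One of those signals, suspiciously rounded ages, is easy to turn into a quick check. The sketch below is a hypothetical illustration, not the test used in Sierra Leone: the function name and the sample ages are made up. It measures the share of reported ages that are multiples of five. If final digits were spread evenly, about 20 percent would be. Genuine self-reported ages also tend to cluster on round numbers, so a high share is only a reason to look closer, not proof of fabrication.

```python
def rounded_age_share(ages):
    """Return the fraction of known ages that are multiples of five."""
    known = [a for a in ages if a is not None]
    if not known:
        return 0.0
    return sum(1 for a in known if a % 5 == 0) / len(known)

# Hypothetical batches of ages from two interviewers.
batch_a = [20, 25, 30, 40, 33]   # mostly round numbers
batch_b = [21, 22, 23, 24, 25]   # one round number in five

share_a = rounded_age_share(batch_a)
share_b = rounded_age_share(batch_b)
```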
That matters because projects like LKDD will inevitably face political scrutiny. Governments often attack casualty databases by pointing to duplicate names or inconsistencies, and that kind of attack is especially likely in situations like Sri Lanka, where the government continues to deny the scale of the atrocities that occurred.
To ensure scientific rigor, I match conservatively. If there’s a possibility that two records refer to the same person, I combine them unless there is strong evidence they are separate victims. I would rather underestimate than accidentally inflate the numbers. As family and friends look through the database, they can submit additional information that can help strengthen the quality of the data. This may lead to records that were previously linked being separated as we are given more details like dates of birth and more complete names.
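The effect of matching conservatively on the final count can be shown with a small Python sketch. It is an illustration under stated assumptions, not the LKDD implementation: the function `count_distinct` and the sample ids are hypothetical, and I assume that possible matches chain together, so that if record 1 may match record 2 and record 2 may match record 3, all three are treated as one person.

```python
def count_distinct(record_ids, possible_matches):
    """Count people after merging every pair flagged as a possible match."""
    # Each record starts as its own group (a simple union-find structure).
    parent = {r: r for r in record_ids}

    def find(x):
        # Follow parent links to the group's representative, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in possible_matches:
        parent[find(a)] = find(b)  # merge the two groups

    return len({find(r) for r in record_ids})

# Five hypothetical records, with two possible matches that chain together.
merged_count = count_distinct([1, 2, 3, 4, 5], [(1, 2), (2, 3)])
unmerged_count = count_distinct([1, 2, 3, 4, 5], [])
```

Merging on possibility rather than certainty can only lower the count, which is why the resulting total is best read as a floor. When families later supply details that separate two linked records, the count rises accordingly.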
Today, the new LKDD database is publicly available for anyone to view. It lists over 10,000 names of victims. This initial launch covers the time period during which the Indian Peace Keeping Force was deployed to Sri Lanka as part of an agreement with the Sri Lankan government, which sought India’s help in disarming Tamil militants in the north east of the island. During this period, there were widespread allegations of human rights violations, including multiple massacres, unlawful killings, and enforced disappearances committed by the Indian soldiers. While not all the victims listed in the database can conclusively be proven to be victims of the Indian forces, the dataset reflects a pattern of grave atrocities impacting both civilians and military fighters.

Map by NordNordWest, License: CC BY-SA 3.0, Modified by David Peters, https://commons.wikimedia.org/w/index.php?curid=48972517
Bringing answers to the families
For many families, the new LKDD database offers a way to make sense of thousands of lives lost and to remember the victims. While the Sri Lankan government has torn down memorials to the dead and disappeared inside the country, this list attests to those who fell victim to the violence. That acknowledgment itself matters because so many of the stories of what happened in Sri Lanka have been lost. The database helps shed light on one particularly deadly period of the conflict.
Documentation cannot erase grief or bring back the families that disappeared. But it can create a historical record that tells the truth and offers consolation to the survivors.
Working to provide data-backed answers to Sri Lankan families motivates my work and helps me continue through meticulous data linkage projects year after year.
My work with Sri Lanka is far from over, but I’m so proud of our collaboration with ITJP. It means we are one step closer to the world knowing the truth of what really happened in Sri Lanka.
Michelle Dukich
This article was written by Michelle Dukich, data processing and record linkage analyst for the Human Rights Data Analysis Group (HRDAG), a nonprofit organization using scientific data analysis to shed light on human rights violations.
Structural Zero is a free monthly newsletter that helps explore what scientific and mathematical concepts teach us about the past and the present. Appropriate for scientists as well as anyone who is curious about how statistics can help us understand the world, Structural Zero is edited by Rainey Reitman and written by 4 data scientists who use their skills in support of human rights. Subscribe today to get our next installment. You can also follow us on Bluesky, Mastodon, LinkedIn, and Threads.
If you get value out of these articles, please support us by subscribing, telling your friends about the newsletter, and recommending Structural Zero to others.


