2 results for month: 01/2016


Data Archaeology for Human Rights in Central America: HRDAG Collaborates with UWCHR

Patrick Ball is kicking himself for a decision he made almost 25 years ago. “I was clever, but I wasn’t smart,” he says ruefully, as he considers the labyrinth of tables and ASCII-encoded keystrings he used to design a database of human rights violations for the pioneering Salvadoran non-governmental Human Rights Commission (CDHES). Now I’m sitting in his office in San Francisco’s Mission District watching over his shoulder, and trying to keep up, as he bangs out code to decipher the priceless data contained in these old files. Created in 1991 and 1992, during the last days of El Salvador’s internal armed conflict, the files detail ...

A geeky deep-dive: database deduplication to identify victims of human rights violations

In our work, we merge many databases to figure out how many people have been killed in violent conflict. Merging is a lot harder than you might think. Many of the database records refer to the same people; that is, the records are duplicated. We want to identify and link all the records that refer to the same victims so that each victim is counted only once, and so that we can use the structure of overlapping records to do multiple systems estimation. Merging records that refer to the same person is called entity resolution, database deduplication, or record linkage. For definitive overviews of the field, see Scheuren, Herzog, and Winkler, Data Quality ...
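
To make the idea of linking duplicated records concrete, here is a minimal sketch in Python. It is not the method described in the post: the record fields (`source`, `name`, `date`), the toy records, the string-similarity rule, and the 0.85 threshold are all assumptions chosen for illustration. It compares every pair of records, links pairs that look like the same person, and groups linked records into clusters so that each cluster is counted once.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar(a, b, threshold=0.85):
    """Crude name comparison: case-insensitive character similarity (an assumption)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def deduplicate(records, threshold=0.85):
    """Group record indices into clusters that plausibly refer to one person."""
    parent = list(range(len(records)))  # union-find forest, one node per record

    def find(i):
        # Follow parent links to the cluster root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare all pairs; link two records when the date matches exactly
    # and the names are similar enough.
    for i, j in combinations(range(len(records)), 2):
        a, b = records[i], records[j]
        if a["date"] == b["date"] and similar(a["name"], b["name"], threshold):
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(records)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical toy records from two made-up source databases.
records = [
    {"source": "A", "name": "Maria Lopez", "date": "1991-03-02"},
    {"source": "B", "name": "Maria Lopes", "date": "1991-03-02"},
    {"source": "B", "name": "Juan Perez", "date": "1991-03-02"},
]

clusters = deduplicate(records)
print(len(records), "records,", len(clusters), "distinct people")
```

Here the first two records are linked into one cluster, so three records count as two people. Comparing every pair does not scale to large databases, and a single similarity threshold is far cruder than what real record linkage requires, which is part of why merging is harder than it looks.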