2.2.2 One victim = One violation = One perpetrator
When human rights organizations begin to represent information about human rights violations, they very often start by focusing on the victim: what was she or he a victim of? The first step is therefore often to think of each victim as having suffered one particular kind of violation. The organization can then create a database that looks like the following table:
|Last name|First name|Violation|Perpetrator|Place|Date|
|Smith|John Sr.|Arb. Exec.|Nat.Guard|Littleton|26.06.94|
In this hypothetical example, John Smith, Sr., is shown to have been the victim of an arbitrary execution committed by the National Guard in Littleton on 26 June 1994. Similarly, John Smith, Jr., is shown to have been the victim of torture by the National Guard in Littleton on 3 July 1994. Notice that in this table, each person has suffered one of three kinds of violation: arbitrary execution, arbitrary detention, or torture. These three kinds of violence together are called the controlled vocabulary of types of violence.
Of course, it may have been the case that George Jones, who was assassinated by the Federal Police in Elmville on 4 July 1994, was also detained and tortured before he was killed. However, in the controlled vocabulary scheme adopted here, killing is a more serious violation than detention or torture, and so only his killing is represented. If we attempt to represent only one violation per victim, we must have a way to choose which violation to represent when more than one occurred. Is detention more or less serious than torture? Defining the relative "seriousness" of violation types (e.g., 1. Execution, 2. Torture, 3. Detention) might be easy with only three kinds of violation in the controlled vocabulary. However, with 15 or 50 kinds of violation, the ranking becomes prohibitively complicated. In fact, in my experience human rights organizations occasionally embroil themselves in lengthy and ultimately pointless debates about the hierarchy of violations. This is one case in which it is much more difficult to do this work wrong than it is to do it right.
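The effect of the "most serious violation only" rule can be sketched in a few lines. This is a minimal illustration, not a recommended design: the `SEVERITY` ranking, the `most_serious` helper, and the reported account are all hypothetical names and data introduced here.

```python
# Hypothetical severity ranking (lower number = more serious),
# following the 1. Execution, 2. Torture, 3. Detention ordering in the text.
SEVERITY = {"execution": 1, "torture": 2, "detention": 3}


def most_serious(violations):
    """Return the single violation a one-per-victim scheme would keep."""
    return min(violations, key=SEVERITY.__getitem__)


# Invented account: the victim was detained, tortured, and then killed.
reported = ["detention", "torture", "execution"]

kept = most_serious(reported)
dropped = [v for v in reported if v != kept]

print(kept)     # execution
print(dropped)  # ['detention', 'torture']
```

Everything in `dropped` is information the source reported but the database can no longer recover. Note also that the whole scheme depends on someone having agreed on `SEVERITY` for every type in the vocabulary.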
The bigger problem with this kind of oversimplification is that it can distort analyses of trends. Consider the following table counted from the example above:
Source: the example table above
From this table, it would seem that the number of victims of detention declined from 2 in June to zero in July. But given the three cases of arbitrary execution that happened in July, we cannot be sure that this decrease is real. The assassinated people may have been detained and tortured before they were killed, in which case detention and torture actually went up in July. The rule that each victim is classified as having suffered only one kind of violation has obscured the complex combination of violent acts which may have occurred in each of the events represented by a single line. By the time the data have been represented as shown in the example table above, there is no way to find out from the coded information what actually happened in each event.
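The distortion can be reproduced with a small sketch that counts the same events two ways. The records below are invented for illustration (they are not the rows of the example table), and the `SEVERITY` ranking and function names are assumptions made for this sketch.

```python
from collections import Counter

# Hypothetical severity ranking: lower number = more serious.
SEVERITY = {"execution": 1, "torture": 2, "detention": 3}

# Invented records: (victim, month, every violation reportedly suffered).
events = [
    ("A", "1994-06", {"detention"}),
    ("B", "1994-06", {"detention"}),
    ("C", "1994-07", {"detention", "torture", "execution"}),
    ("D", "1994-07", {"detention", "execution"}),
    ("E", "1994-07", {"detention", "torture", "execution"}),
]


def count_one_per_victim(events):
    """Count only the most serious violation for each victim."""
    counts = Counter()
    for _victim, month, violations in events:
        worst = min(violations, key=SEVERITY.__getitem__)
        counts[(month, worst)] += 1
    return counts


def count_all(events):
    """Count every violation each victim reportedly suffered."""
    counts = Counter()
    for _victim, month, violations in events:
        for kind in violations:
            counts[(month, kind)] += 1
    return counts


collapsed = count_one_per_victim(events)
full = count_all(events)

print(collapsed[("1994-06", "detention")], collapsed[("1994-07", "detention")])  # 2 0
print(full[("1994-06", "detention")], full[("1994-07", "detention")])            # 2 3
```

Under the one-per-victim rule, detention appears to fall from 2 to 0. Counting every reported violation shows that, in this invented data, it rose from 2 to 3.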
The basic representational strategy here is that each victim is thought to suffer one and only one violation. In this example, I'm focusing on how this problem looks in database terms, and what the ramifications are for statistical outputs. However, if the interviewers who talk to witnesses note only one violation in their questionnaires, or focus only on the violation they think is the most important during the interview, precisely the same problem can arise. This is a logical problem. It is not specific to computer databases, questionnaires, or statistics as such; it lies in the analytic framework an organization uses to remember events in a systematic fashion.
The problem is not solved by representing two violations per victim because, in a given case, if all three kinds of violence occurred, then one kind of violation will still be omitted. The only way to solve the problem is to permit every representational scheme at each step (including questionnaires, databases, etc.) to consider the possibility that any given victim may have suffered all of the violations in the controlled vocabulary set. In this example, it must be theoretically possible to represent a situation in which a person was detained, tortured, and killed. In summary,
For a given incident, the number of violations suffered by a given victim which can possibly be represented in the information management system must be greater than or equal to the number of violation types in the controlled vocabulary.
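One way to satisfy the rule above is to store a set of violations per victim rather than a single field. The following is a sketch under stated assumptions: the `VictimRecord` class, its field names, and the sample victim are hypothetical, and only the three-item vocabulary comes from the example in this section.

```python
from dataclasses import dataclass, field

# Controlled vocabulary taken from the example in this section.
CONTROLLED_VOCABULARY = frozenset(
    {"arbitrary execution", "arbitrary detention", "torture"}
)


@dataclass
class VictimRecord:
    """One victim in one incident; class and field names are illustrative."""

    name: str
    incident_date: str
    violations: set = field(default_factory=set)

    def add_violation(self, kind: str) -> None:
        # Reject anything outside the controlled vocabulary.
        if kind not in CONTROLLED_VOCABULARY:
            raise ValueError(f"{kind!r} is not in the controlled vocabulary")
        self.violations.add(kind)


# Hypothetical victim who was detained, tortured, and killed.
record = VictimRecord(name="Victim X", incident_date="1994-07-04")
for kind in ("arbitrary detention", "torture", "arbitrary execution"):
    record.add_violation(kind)

# The record can hold every type in the vocabulary, so the rule is met.
print(record.violations == CONTROLLED_VOCABULARY)  # True
```

Because the set has no fixed size, the record keeps satisfying the rule if the vocabulary later grows to 15 or 50 types. A design with a fixed number of violation columns would not.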
Another basic rule is not as easily reduced to a sentence. Whenever an organization builds an information management system to collect and record data about some kind of thing, whether violations, victims, detention periods, etc., it needs to represent all of the things of that type that its sources report. For example, an organization that designs a system to record information about detention periods might create a table with one line per record, much like the example tables in this document. However, it will need to create a record for every one of the detention periods a particular victim suffers. If instead it chooses to represent only the "main" detention period, it will arbitrarily exclude information about all the other detentions. The ramifications of excluding those detentions are analogous to the distortions discussed in this section. Any time an organization is tempted to record information about only one putative "main" thing among a group of possible things, it is almost certainly making an error of this kind. Again, though this is expressed in database terms, the problem can occur at any of the other steps of the information management system as well.
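The same point can be illustrated for detention periods. In this sketch the `DetentionPeriod` class, the victim label, and all dates are invented, and taking the longest period as the "main" one is only one possible reading of "main", assumed here for the sake of the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DetentionPeriod:
    """One detention of one victim; names and fields are illustrative."""

    victim: str
    start: date
    end: date

    @property
    def days(self) -> int:
        return (self.end - self.start).days


# Invented data: one victim detained three separate times.
periods = [
    DetentionPeriod("Victim X", date(1994, 6, 1), date(1994, 6, 4)),
    DetentionPeriod("Victim X", date(1994, 6, 20), date(1994, 7, 10)),
    DetentionPeriod("Victim X", date(1994, 8, 2), date(1994, 8, 5)),
]

# Keeping only the "main" (here: longest) period discards the other two.
main_only = max(periods, key=lambda p: p.days)

print(len(periods), sum(p.days for p in periods))  # 3 26
print(1, main_only.days)                           # 1 20
```

With one record per period, the system reports three detentions totalling 26 days. With only the "main" period it reports one detention of 20 days, and the other two cannot be recovered.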