Controlled vocabulary

What is a controlled vocabulary?

A controlled vocabulary makes it possible to transform information collected about violations, victims, and perpetrators into a countable set of data categories. This transformation must neither discard relevant information nor misrepresent what was collected.

Why is it necessary?

Data about human rights violations come from a wide range of information sources: legal case files, newspaper articles, e-mails, faxes, letters, phone conversations, testimonies, interviews, radio and television programs, video clips, and photos. These sources can vary greatly in the detail, accuracy, and verifiability of the violations they report.

Drawing on widely varying sources of information can increase the coverage of violations, but it also increases the diversity of the data that a controlled vocabulary must accommodate. This diversity is a challenge: differences in data quality, together with the complexity of coding raw information into human rights data, can be difficult to manage.

The variability in detail and accuracy of source information requires a systematic approach to managing the quality of the data and producing meaningful statistics about human rights violations, perpetrators, and victims. A controlled vocabulary provides a framework for transforming qualitative information into a countable set of data that represents the nature, scope, and intensity of the human rights violations. At the same time, a controlled vocabulary can lessen the likelihood that the coded data will be extrapolated beyond its significance.

What is the process for creating a controlled vocabulary?

To create a controlled vocabulary and ensure the quality of the data, every definition in it must satisfy the following five properties:

Mutually exclusive: No single violation (or victim or perpetrator) can fit more than one definition in the controlled vocabulary.

Exhaustive: A definition must exist for every possible violation that can occur in the situation being studied.

Distinguished: Each definition must have an explicit characteristic that distinguishes the violation/victim/perpetrator from all others in the controlled vocabulary.

Exemplified: Each definition must be accompanied by examples showing how to apply the definition in a specific situation.

Countable: Each definition must contain a counting rule explicitly stating how violations, victims, and perpetrators are enumerated.
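The five properties above can be pictured as checks on a small data structure. The following is a minimal sketch in Python, not a prescribed implementation: the `Definition` class, the record fields (`outcome`, `victims`), and the two sample categories are all hypothetical and chosen only for illustration. Each definition carries an explicit distinguishing test, worked examples, and a counting rule, and `classify` refuses any record that fits no definition or more than one.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical record: one reported act, already extracted from a source.
Record = dict


@dataclass(frozen=True)
class Definition:
    code: str                          # category label
    matches: Callable[[Record], bool]  # explicit distinguishing characteristic
    examples: tuple                    # records showing how the definition applies
    count: Callable[[Record], int]     # counting rule for this category


# Illustrative categories only; a real vocabulary is written for the
# specific situation being studied.
VOCABULARY = [
    Definition(
        code="killing",
        matches=lambda r: r["outcome"] == "death",
        examples=({"outcome": "death", "victims": 2},),
        count=lambda r: r["victims"],
    ),
    Definition(
        code="detention",
        matches=lambda r: r["outcome"] == "held",
        examples=({"outcome": "held", "victims": 1},),
        count=lambda r: r["victims"],
    ),
]


def classify(record, vocabulary=VOCABULARY):
    """Return the single definition that fits the record, or raise."""
    hits = [d for d in vocabulary if d.matches(record)]
    if not hits:
        raise ValueError("not exhaustive: no definition fits this record")
    if len(hits) > 1:
        codes = ", ".join(d.code for d in hits)
        raise ValueError(f"not mutually exclusive: record fits {codes}")
    return hits[0]


def check_examples(vocabulary=VOCABULARY):
    """Confirm every worked example is classified under its own definition."""
    for d in vocabulary:
        for example in d.examples:
            assert classify(example, vocabulary) is d
```

A check like this can only reveal a gap or an overlap for the records it is actually given; it does not prove that a vocabulary is exhaustive or mutually exclusive in general.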

How is it used?

The statistics generated through the use of a controlled vocabulary enable researchers to decipher the often-complex relationships between violation, victim, and perpetrator, and ultimately help to answer the question of “Who did what to whom?” The controlled vocabulary transforms the collected information into a countable set of data categories without discarding important information or misrepresenting what was collected. Researchers can use these statistics to produce a more systematic overview of the totality of large-scale human rights violations.
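As a rough illustration of how coded data supports the "Who did what to whom?" question, the sketch below tallies records that have already been assigned vocabulary categories. The tuple layout (perpetrator, violation, victim), the labels, and the function names are invented for this example and do not represent real data or any particular project's coding scheme.

```python
from collections import Counter

# Hypothetical coded records, one row per counted unit:
# (perpetrator category, violation category, victim category)
coded = [
    ("army", "killing", "civilian"),
    ("army", "detention", "civilian"),
    ("militia", "killing", "civilian"),
    ("army", "killing", "civilian"),
]


def tabulate(rows):
    """Count each distinct who-did-what-to-whom combination."""
    return Counter(rows)


def by_perpetrator(rows):
    """Count coded records attributed to each perpetrator category."""
    return Counter(perpetrator for perpetrator, _, _ in rows)
```

Because every record falls into exactly one category on each dimension, the tallies can be broken down by perpetrator, violation, or victim without double counting.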

