A controlled vocabulary is an organized arrangement of words and phrases used to index content and/or retrieve content through browsing or searching. It typically comprises preferred and alternative (variant) terms and has a defined scope or describes a specific domain.

Objective of Controlled Vocabulary 

The objective of controlled vocabularies is to organize information and to provide the terminology used to catalog and retrieve it. While capturing the richness of alternative terms, controlled vocabularies also promote consistency in preferred terms and the assignment of the same terms to similar content.
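
The idea of mapping many alternative terms onto one preferred term can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the sample terms and the names `VARIANT_TO_PREFERRED`, `normalize`, and `index_document` are assumptions made up for this example.

```python
# Hypothetical mini-vocabulary: every alternative (variant) term points to
# exactly one preferred term. The sample terms are illustrative only.
VARIANT_TO_PREFERRED = {
    "car": "car",
    "automobile": "car",
    "motorcar": "car",
    "motion picture": "motion picture",
    "film": "motion picture",
    "movie": "motion picture",
}


def normalize(term):
    """Return the preferred term for a variant, or None if it is not in the vocabulary."""
    return VARIANT_TO_PREFERRED.get(term.strip().lower())


def index_document(keywords):
    """Map free-text keywords to a sorted list of unique preferred terms."""
    preferred = {normalize(keyword) for keyword in keywords}
    preferred.discard(None)  # drop keywords the vocabulary does not cover
    return sorted(preferred)


print(index_document(["Movie", "film", "Automobile", "spaceship"]))
```

Two documents tagged "movie" and "film" end up indexed under the same preferred term, which is what lets a single search retrieve both.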

Types of Controlled Vocabulary 

  • Relationships in general: The term relationship means a state of connectedness or an association between two things in a database, for example between two terms in a vocabulary.
  • Subject Heading Lists: Subject headings are uniform words or phrases intended to be assigned to books, articles, or other documents to describe the subject or topic of the texts and to group them with texts having similar subjects. Subject heading lists are typically arranged in alphabetical order, with cross-references between the preferred, nonpreferred, and other related headings. 
  • Controlled Lists: A controlled list is a list of terms used to control terminology. In a well-constructed controlled list, each term is unique; terms do not overlap in meaning; terms are all members of the same class; terms are equal in granularity or specificity; and terms are arranged alphabetically or in another logical order.
  • Synonym Lists: A synonym list (also called a synonym ring list) is a set of terms that are treated as equivalent for the purpose of improving retrieval. Equivalence relationships in a controlled vocabulary should be made between terms and names that have the same meaning.
  • Authority Files: An authority file is a set of established names or headings, with cross-references to the preferred form from variant or alternate forms.
  • Taxonomy: A taxonomy is an orderly classification for a defined field. It comprises controlled vocabulary terms (generally only preferred terms) arranged into a hierarchical structure. Each term in a taxonomy is in one or more parent/child (broader/narrower) relationships to other terms in the taxonomy. There can be different types of parent/child relationships, such as whole/part, genus/species, or instance relationships. A taxonomy differs from a thesaurus in that it generally has shallower hierarchies and a less complex structure.
  • Alphanumeric Classification Schemes: Alphanumeric classification schemes are controlled codes (letters or numbers, or both letters and numbers) that represent concepts or headings. They have an implied taxonomy that can be inferred from the codes.
  • Thesauri: A thesaurus combines the characteristics of synonym ring lists and taxonomies, together with additional features. A thesaurus is a network of unique concepts, including relationships between synonyms, broader and narrower (parent/child) contexts, and other related concepts. Thesauri may be monolingual or multilingual. Thesauri may contain three types of relationships: equivalence (synonym), hierarchical (whole/part, genus/species, or instance), and associative.
  • Ontologies: An ontology is a formal, machine-readable specification of a conceptual model in which concepts, properties, relationships, functions, constraints, and axioms are defined. Such an ontology is not a controlled vocabulary, but it uses one or more controlled vocabularies for a defined domain and expresses the vocabulary in a representation language that has a grammar for using terms to convey meaningful information.
  • Folksonomies: Folksonomy is a relatively recent term referring to a collection of concepts represented by terms and names (called tags) that are assembled through social tagging. Social tagging is the decentralized practice and method by which individuals and groups create, manage, and share tags (terms, names, etc.) to annotate and categorize digital resources in an online social environment. This method is also referred to as social categorization, social indexing, mob indexing, and folk categorization.
  • Metadata: Metadata is "data that provides information about other data", as opposed to the content of the data itself, such as the text of a message or an image. The different types of metadata are:
    • Descriptive metadata – the descriptive information about a resource. It is used for discovery and identification. It includes elements such as title, abstract, author, and keywords.
    • Structural metadata – metadata about containers of data that indicates how compound objects are put together, for example, how pages are ordered to form chapters. It describes the types, versions, relationships, and other characteristics of digital resources.
    • Administrative metadata – the data that helps manage a resource, such as resource type, permissions, and when and how it was created.
    • Reference metadata – the data about the contents and quality of statistical data. 
    • Statistical metadata – also called process data, it describes the processes that collect, process, or produce statistical data.
    • Legal metadata – provides data about the creator, copyright holder, and public licensing. 
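
Returning to the thesaurus entry in the list above, its three relationship types (equivalence, hierarchical, and associative) can be sketched as a small data structure. This is a simplified illustration under stated assumptions: the `Concept` and `Thesaurus` classes, their method names, and the furniture sample terms are all hypothetical and are not taken from any standard or library.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One thesaurus entry, keyed by its preferred term (hypothetical model)."""
    preferred: str
    synonyms: set = field(default_factory=set)  # equivalence relationships
    broader: set = field(default_factory=set)   # hierarchical relationships (parents)
    related: set = field(default_factory=set)   # associative relationships


class Thesaurus:
    def __init__(self):
        self.concepts = {}

    def add(self, preferred, synonyms=(), broader=(), related=()):
        self.concepts[preferred] = Concept(
            preferred, set(synonyms), set(broader), set(related)
        )

    def resolve(self, term):
        """Equivalence: map a preferred term or synonym to its preferred term."""
        wanted = term.strip().lower()
        for concept in self.concepts.values():
            if wanted == concept.preferred or wanted in concept.synonyms:
                return concept.preferred
        return None

    def narrower(self, preferred):
        """Hierarchy: list the direct children of a term."""
        return sorted(
            c.preferred for c in self.concepts.values() if preferred in c.broader
        )

    def ancestors(self, preferred):
        """Hierarchy: collect every broader term, following parents upward."""
        seen = set()
        stack = list(self.concepts[preferred].broader)
        while stack:
            term = stack.pop()
            if term not in seen:
                seen.add(term)
                if term in self.concepts:
                    stack.extend(self.concepts[term].broader)
        return sorted(seen)


# Illustrative sample data only.
thesaurus = Thesaurus()
thesaurus.add("furniture")
thesaurus.add("sofa", synonyms={"couch", "settee"}, broader={"furniture"}, related={"cushion"})
thesaurus.add("loveseat", broader={"sofa"})
thesaurus.add("cushion", related={"sofa"})

print(thesaurus.resolve("Couch"))
print(thesaurus.narrower("furniture"))
print(thesaurus.ancestors("loveseat"))
```

Dropping the `related` field and keeping only preferred terms with `broader` links would leave something closer to a taxonomy, while keeping only the `synonyms` sets would leave a synonym list.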

