By Picturepark Communication Team • Mar 31, 2014
There’s no doubt about the value of controlled vocabularies. Time savings, ease of use and increased metadata integrity are only a few of the benefits vocabularies offer when used as metadata input filters. This post offers an overview of the value of controlled vocabularies and it describes how Picturepark supports this wonderful technology.
What is a Controlled Vocabulary?
For the purposes of this discussion, a controlled vocabulary is a set of predefined terms that serve as the only permitted values for a given metadata field. A controlled vocabulary can range in size from only a few terms to tens of thousands of terms.
- Metadata field: Gender
- Vocabulary: [male; female]
In this very simple case, the vocabulary contains only two terms. Defining these terms as a controlled vocabulary prevents users from adding values like man, woman, boy or girl, or any misspelled variation of those values or of the two permitted terms.
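As a rough illustration of the idea, a controlled vocabulary can be thought of as a set of permitted values plus a check that runs before a value is stored. The sketch below is generic Python, not Picturepark code; the names `GENDER_VOCABULARY` and `validate` are invented for this example.

```python
# Hypothetical names throughout; this is not part of any Picturepark API.
GENDER_VOCABULARY = frozenset({"male", "female"})


def validate(value: str, vocabulary: frozenset[str]) -> str:
    """Return the value unchanged if it is permitted; otherwise raise ValueError."""
    if value not in vocabulary:
        raise ValueError(f"{value!r} is not a permitted term")
    return value


# A permitted term passes through untouched.
accepted = validate("female", GENDER_VOCABULARY)

# Near-synonyms and misspellings are rejected rather than silently stored.
rejected = []
for candidate in ["man", "girl", "femal"]:
    try:
        validate(candidate, GENDER_VOCABULARY)
    except ValueError:
        rejected.append(candidate)
```

The check is an exact match on purpose: the value of the vocabulary comes from refusing anything that is not one of the predefined terms.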
In the following examples, many more terms are permitted:
- Metadata field: Country
- Vocabulary: [Afghanistan; Albania; Algeria … Zimbabwe]
- Metadata field: Product Name
- Vocabulary: [all product names]
While Country values will remain fairly static, Product Name terms will vary as product lines change. For this reason, it must be possible (and easy) for permitted users to add, edit or delete terms where and when it makes sense to do so.
Metadata Field Assignments
When used with a digital asset management system, controlled vocabularies are typically assigned as data input filters for specific metadata fields. In other words, terms that are not included in the vocabulary cannot be entered into the field or, if they are entered, they are rejected by the system when the user attempts to save the record.
If the set of allowable terms for one metadata field is (and always will be) the same as that for another field, the fields can share a single vocabulary.
- Metadata field: Author
- Vocabulary: [Rolf; Hanne; Rita]
- Metadata field: Editor
- Vocabulary: [Rolf; Hanne; Rita]
Author and Editor are two distinct metadata fields but, because they will always refer to the same set of people, they can share a vocabulary.
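The sharing described here could be sketched as two fields that point at the same vocabulary object, with the check applied when a record is saved. This is a minimal illustration only; `FIELD_VOCABULARIES`, `save_record`, the uncontrolled "Title" field and the added name "Marco" are assumptions made for the example.

```python
# Hypothetical sketch; the names below are not taken from any real DAM.
PEOPLE = {"Rolf", "Hanne", "Rita"}

# Both fields reference the same set, so one edit applies to both fields.
FIELD_VOCABULARIES = {"Author": PEOPLE, "Editor": PEOPLE}


def save_record(record: dict[str, str]) -> dict[str, str]:
    """Reject the record at save time if a controlled field holds an unknown term."""
    for field, value in record.items():
        vocabulary = FIELD_VOCABULARIES.get(field)  # None means the field is uncontrolled
        if vocabulary is not None and value not in vocabulary:
            raise ValueError(f"{value!r} is not permitted in field {field!r}")
    return record


# One addition to the shared vocabulary makes the term valid for Author and Editor.
PEOPLE.add("Marco")
saved = save_record({"Author": "Rolf", "Editor": "Marco", "Title": "Any text"})
```

If the two fields might ever need different terms, separate vocabularies are the safer choice, because a shared vocabulary cannot diverge later without reassigning one of the fields.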
When used in a digital asset management system, controlled vocabularies should also include synonyms for the primary term. This enables users who are not familiar with the vocabulary to find and apply the terms they need.
- Vocabulary term: Canine
- Synonyms: [dog; puppy]
In this case, if someone began typing “puppy” into the field, “Canine” would appear as a hint to what the permitted value is. The user can then accept the preferred term.
The preferred term in this vocabulary is Canine. But because users often enter dog or puppy, those terms have been added as synonyms to help the user choose the preferred term.
Synonym support is also valuable for abbreviations, alternate terms and common misspellings.
- IBM finds International Business Machines
- mobile finds cellphone
- aproved finds approved
If your DAM is to be used across cultures that use a common language, synonym support can help standardize your metadata. For example, English is spoken in both the United States and England, but the two countries don’t always use the same terms for the same things.
- Lorry is in England what truck is in the United States
- Motorway is in England what freeway is in the United States
Regardless of which cultural references you standardize on as preferred terms, users will be able to enter valid metadata using the terms they know.
Controlled Vocabulary Localization
Language localization must be considered in your controlled vocabularies. Building on the canine example above, dog in Italian is cane. So when a user types “cane” into a field, the DAM needs to know the language in which the user is communicating. If the user is working in English, she does not mean dogs when typing cane.
This is less of an issue during search operations because the user will likely realize her input error when she sees search results that make no sense. But if language is not considered during metadata entry, the wrong terms can be added to the field, resulting in the asset becoming “digitally lost” in the DAM.
Localization concerns extend also to synonyms. When choosing puppy and dog as synonyms for canine, you must also consider whether there are synonyms for canine in the other languages that are used in the DAM and add them too.
This system has been configured for use in English and Italian. The preferred vocabulary terms (Canine and Canino) have been supplemented with synonyms so that users will be guided rather than frustrated, and the metadata will remain consistent.
Externally Editing Vocabularies
If you think that primary term plus synonyms multiplied by localization can make for a multidimensional array of values that can become cumbersome to manage, you are correct. For this reason, you must also have some means for editing vocabularies externally and then importing them into the DAM. Though it might be possible to make all the necessary edits within the DAM, it might be more convenient to use dedicated taxonomy software or just a spreadsheet program.
Values for the English (EN) vocabulary are on the left. Values for Italian (IT) are shown on the right. Using the various data-processing functions native to a spreadsheet, you could quickly localize and assemble hierarchical paths. Note that the format required for various DAMs will differ. Shown is a slightly modified variation of the format used by Picturepark.
Being able to access external data for vocabularies also enables you to leverage vocabularies that are built and managed by others. Key here is that you are able to import those vocabularies into your DAM.
Controlled vocabularies are typically imported via CSV files, which can be created in virtually any spreadsheet application.
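To make the spreadsheet workflow concrete, here is a minimal sketch of reading a vocabulary from CSV text with Python's standard `csv` module. The column names and the semicolon-separated synonym lists are invented for this example; they are not the format Picturepark or any other DAM requires, so check your system's import specification before building files this way.

```python
import csv
import io

# Hypothetical layout: one row per term, English columns first, Italian second.
CSV_TEXT = """term_en,synonyms_en,term_it,synonyms_it
Canine,dog;puppy,Canino,cane;cucciolo
Feline,cat;kitten,Felino,gatto;gattino
"""


def load_vocabulary(csv_text: str) -> dict[str, dict[str, list[str]]]:
    """Return {language: {preferred term: [synonyms]}} parsed from the CSV text."""
    vocabulary = {"en": {}, "it": {}}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for language in vocabulary:
            term = row[f"term_{language}"].strip()
            raw_synonyms = row[f"synonyms_{language}"].split(";")
            vocabulary[language][term] = [s.strip() for s in raw_synonyms if s.strip()]
    return vocabulary


VOCABULARY = load_vocabulary(CSV_TEXT)
```

Keeping each language in its own pair of columns means a translator can fill in one side of the sheet without touching the other.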
Using Controlled Vocabularies in Picturepark
Picturepark supports all of the considerations mentioned above. The remainder of this post illustrates the steps involved with using vocabularies in Picturepark.
The Picturepark Tree
In addition to serving as a location for standard tags and classes, the Picturepark Tree is also used for controlled vocabularies. In fact, a controlled vocabulary in Picturepark is merely a reference to a branch of the Picturepark Tree. All tags below that reference point become the allowable terms for the vocabulary.
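Conceptually, treating a branch as a vocabulary amounts to picking a node in a tree and collecting everything beneath it. The sketch below models the tree as nested dictionaries; this data structure, the branch names and the functions `find_branch` and `terms_below` are assumptions for illustration and say nothing about how Picturepark stores its Tree.

```python
# A toy tree: each key is a tag, each value holds that tag's children.
TREE = {
    "Vocabularies": {
        "Animals": {"Canine": {}, "Feline": {}},
        "Status": {"Draft": {}, "Approved for Use": {}},
    }
}


def find_branch(tree: dict, path: list[str]) -> dict:
    """Walk down the tree along the given path and return the node found there."""
    node = tree
    for name in path:
        node = node[name]
    return node


def terms_below(node: dict) -> list[str]:
    """Collect every tag beneath the node, depth first."""
    terms = []
    for name, children in node.items():
        terms.append(name)
        terms.extend(terms_below(children))
    return terms


# The vocabulary for a field is just a reference to one branch.
animal_terms = terms_below(find_branch(TREE, ["Vocabularies", "Animals"]))
```

Since the vocabulary is only a reference to the branch, adding a tag under that branch makes it a permitted term without any change to the field's configuration.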
Because Picturepark manages controlled vocabularies in the Picturepark Tree, users need no retraining and it’s easier to see at a glance all terms in each vocabulary. Note that vocabulary terms in the tree are typically visible only to vocabulary editors. Users assign terms using a text input area, shown below.
There are several advantages to this approach:
- There is nothing new for vocabulary editors to learn because managing vocabularies is no different than managing tags in the Tree.
- Permissions can determine who can manage vocabularies. Picturepark’s granular permissions system is included with every system, so you can always determine which groups can edit your vocabularies all the way down to specific terms, if necessary.
- It’s easy to visualize the vocabulary just by looking at the Tree. This can help editors determine where changes are required and it can help users learn the vocabulary faster. (Note that vocabularies in the Tree can also be hidden from certain user groups, if you prefer.)
Attaching a vocabulary to a field
Once you know which branch of the Tree you want to use for a metadata field’s vocabulary, you select it in the Picturepark Management Console. This is typically a one-time task performed by a system administrator during system setup.
Using the Picturepark Management Console, it’s easy to choose an existing vocabulary from the Picturepark Tree to be used for a given field.
Once the assignment has been made, the vocabulary is active on the field.
Assigning vocabulary terms to an asset
Once a vocabulary has been assigned to a field, it becomes a data input filter for the field. In order to ensure metadata integrity on the field, no user (including the system administrator) can then enter values into the field that are not within the vocabulary.
You can assign as many terms from the vocabulary as you need.
When users type into a field that has been assigned a controlled vocabulary, suggestions appear from the vocabulary for fast auto-complete. Terms not in the vocabulary will not be accepted into the field.
Vocabulary terms can be added or removed at any time. Multiple terms can be assigned to the same field.
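The type-ahead and filtering behavior described in this section can be approximated with a prefix match for suggestions and a membership check on assignment. This is a generic sketch, not Picturepark's implementation; the term list and the functions `suggest` and `assign` are hypothetical.

```python
# Illustrative terms only.
VOCABULARY = ["Canine", "Feline", "Equine", "Bovine"]


def suggest(typed: str, vocabulary: list[str], limit: int = 5) -> list[str]:
    """Return vocabulary terms that start with what the user has typed so far."""
    prefix = typed.strip().casefold()
    if not prefix:
        return []
    return [term for term in vocabulary if term.casefold().startswith(prefix)][:limit]


def assign(field_values: list[str], term: str, vocabulary: list[str]) -> list[str]:
    """Add a term to a multi-value field, refusing anything outside the vocabulary."""
    if term not in vocabulary:
        raise ValueError(f"{term!r} is not in the vocabulary")
    if term not in field_values:  # avoid assigning the same term twice
        field_values.append(term)
    return field_values


suggestions = suggest("ca", VOCABULARY)

field = []
assign(field, "Canine", VOCABULARY)
assign(field, "Equine", VOCABULARY)
```

Suggestions make the permitted terms quick to find, while the check in `assign` is what actually keeps unknown values out of the field.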
Adding, editing and deleting vocabulary terms
Permitted users can add, edit and delete vocabulary terms in exactly the same way they would do so with other Tree items. New terms can be added directly within the Picturepark Tree or they can be added at the time a vocabulary term is assigned.
New terms can be created by permitted users either inside the Picturepark Tree or on-the-fly within the Details window.
When terms that have been previously assigned to records are edited, the affected records show the updated vocabulary terms. This enables you, for example, to change a product name or fix a typo that was found in the tag name. When assigned terms are deleted from the vocabulary, they are removed from any previously assigned records.
Collectively, these behaviors help ensure that your controlled vocabulary fields never contain values that are not currently in their assigned vocabularies.
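One way a system could achieve this is to store a stable identifier for each term on the record instead of the term's text, so that an edit changes the label everywhere at once and a deletion can be stripped from every record. How Picturepark does this internally is not stated here; the `Vocabulary` class, the product names and the file name below are hypothetical.

```python
class Vocabulary:
    """Terms keyed by a stable id, so records survive a label edit."""

    def __init__(self) -> None:
        self.labels: dict[int, str] = {}
        self._next_id = 1

    def add(self, label: str) -> int:
        term_id = self._next_id
        self.labels[term_id] = label
        self._next_id += 1
        return term_id

    def rename(self, term_id: int, label: str) -> None:
        # Records hold the id, so they show the new label automatically.
        self.labels[term_id] = label

    def delete(self, term_id: int, records: dict[str, list[int]]) -> None:
        """Remove the term and strip it from every record that used it."""
        del self.labels[term_id]
        for assigned in records.values():
            if term_id in assigned:
                assigned.remove(term_id)


def display(vocabulary: Vocabulary, assigned: list[int]) -> list[str]:
    """Translate the ids stored on a record into their current labels."""
    return [vocabulary.labels[term_id] for term_id in assigned]


products = Vocabulary()
widget = products.add("Widgit")  # contains a typo that is fixed below
gadget = products.add("Gadget")
records = {"photo-001.jpg": [widget, gadget]}

products.rename(widget, "Widget")
after_rename = display(products, records["photo-001.jpg"])

products.delete(gadget, records)
after_delete = display(products, records["photo-001.jpg"])
```

With this design a record can never display a term that no longer exists, because the record holds no text of its own to fall out of date.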
Picturepark’s granular permissions enable you to define who can use or edit each term. In most cases, you’ll make an entire vocabulary available to users and you’ll define users who can edit it. But if you need to restrict the use of some terms, or you want to limit which of your editors can edit specific terms in the vocabulary, you can.
Permissions (rights) can be explicitly set for each term in your vocabularies. Permissions can be either locked to the settings in a template or they can be modified directly on the term (category). Not shown are the extended permissions that offer more granular control over the three permissions shown.
A Status vocabulary offers an example of when you might want to restrict access to certain terms. For example, your policy might require that only users in the “Art Director” group can assign the value “Approved for Use”.
Because the deletion of a vocabulary term can result in records that have no corresponding value, you might want to reserve the “delete” permission only for those users specifically trained to handle that situation.
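The Status example can be sketched as a per-term list of groups that are allowed to assign it, with unrestricted terms open to everyone who can use the field. This is a simplified model for illustration and not Picturepark's permission system; the extra status values, the "Photographer" group and the function `can_assign` are assumptions.

```python
# Illustrative status terms; only "Approved for Use" comes from the example above.
STATUS_VOCABULARY = {"Draft", "In Review", "Approved for Use"}

# Terms absent from this mapping can be assigned by any user of the field.
RESTRICTED_TERMS = {"Approved for Use": {"Art Director"}}


def can_assign(term: str, user_groups: list[str]) -> bool:
    """Return True if a user in the given groups may assign the term."""
    if term not in STATUS_VOCABULARY:
        return False  # not a vocabulary term at all
    allowed_groups = RESTRICTED_TERMS.get(term)
    if allowed_groups is None:
        return True  # unrestricted term
    return not allowed_groups.isdisjoint(user_groups)


director_can_approve = can_assign("Approved for Use", ["Art Director"])
photographer_can_approve = can_assign("Approved for Use", ["Photographer"])
photographer_can_draft = can_assign("Draft", ["Photographer"])
```

The same pattern could be extended with a separate mapping for edit and delete rights, which would let you reserve deletion for specifically trained users.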
Adding, editing and deleting vocabulary synonyms and localizations
Each vocabulary term can be assigned an unlimited number of synonyms for each supported language. For convenience, the editing of synonyms and localized values takes place in the same location.
Localization and synonym assignments are configured in the same location. Only permitted users will see and be able to edit these fields.
If edits are needed on a wider scale, such as when first setting up your Picturepark or when later adding support for an additional language, you might prefer to use a spreadsheet program to add the new values you need. This method also offers you a convenient means for auditing and approving synonyms and localizations before they are committed to Picturepark. Keep this option in mind if you work with freelancers because it enables them to build synonym lists and perform localizations without needing access to your Picturepark.
Finding digital assets using vocabulary terms
Picturepark’s simple search option shows type-ahead suggestions based on metadata values and controlled vocabulary terms available within the system. So, when users enter search values into the search area, they will see existing metadata values and controlled vocabulary terms as suggestions.
When more specific searching is required, Picturepark’s Extended Search option enables users to choose from available vocabulary terms so there’s no guessing.
Complex searches can be created and shared with other users. If a controlled vocabulary has been assigned to a field, it will provide suggestions when the user types a value.
Learn More about Controlled Vocabularies
Here are some additional resources that can help you learn more about the value of controlled vocabularies:
ControlledVocabulary.com – This site provides many resources that are related to the creation, management and use of controlled vocabularies.
Taxonomy vs. Controlled Vocabulary – All taxonomies are controlled vocabularies but not all controlled vocabularies are taxonomies. Confused? This article aims to define the two insofar as how they relate to digital asset management software.
Taxonomies, Thesauri and Controlled Vocabularies – This article aims to differentiate between taxonomies, controlled vocabularies, thesauri and ontologies. Note that this discussion is not presented from the perspective of digital asset management software.
If you’d like to see firsthand how controlled vocabularies are created, managed and used in Picturepark, contact firstname.lastname@example.org.