Content Management White Paper: Automated Metadata

Automated metadata tagging is becoming more common in content management. The promise of such technology is alluring: send an image or video to a service and get back a set of tags that describe the content.

This can save countless hours of human tagging when used with content and use cases that are suitable for automated tagging. When automation is not suitable, what is returned from the tagging service can be humorous at best, or misleading at worst.

Three important things to consider before adding automated tagging to your content system are:

  1. Have you tested the auto-tagging using your actual content?
  2. Do you have an approval workflow in place to verify automatically added tags?
  3. Have you had the auto-tagging provider's terms of service legally reviewed?

Only by sending your own content to the tagging service can you properly evaluate the results you will get back. Start with the tags themselves: are they accurate? And do they provide the level of detail that your users will need?

With generic or stock content, it might be enough to say that an image contains a “man” and a “tractor,” perhaps throwing in the tag “green” to describe the tractor. But if your business is building green tractors, “green tractor” won’t likely be descriptive enough to suit the needs of your users.

For example, terms that describe a tractor’s purpose, engine horsepower or lift capacity would be common search criteria, but no automated service is likely to be able to provide these values.

You could decide to add these tags manually, but then you have to reconsider the value of the automated service itself. Did it save much manual effort? Did it introduce errors that had to be corrected by a human?
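One way to put numbers on those two questions is to run a sample of your own content through the service and compare what comes back with the tags a human editor ends up keeping. The following is a minimal sketch, not a feature of any particular product: the function name `compare_tags` and the sample tags (including "45 hp" and "front loader") are invented for illustration.

```python
def compare_tags(auto_tags, human_tags):
    """Compare one asset's automated tags with the tags a human editor settled on."""
    # Normalize so that "Tractor " and "tractor" count as the same tag.
    auto = {t.strip().lower() for t in auto_tags}
    human = {t.strip().lower() for t in human_tags}

    kept = auto & human      # automated tags the editor accepted (effort saved)
    removed = auto - human   # automated tags the editor deleted (errors introduced)
    added = human - auto     # tags the editor had to add (detail the service missed)

    return {
        "kept": sorted(kept),
        "removed": sorted(removed),
        "added": sorted(added),
        # Share of automated tags that were correct.
        "precision": len(kept) / len(auto) if auto else 0.0,
        # Share of the needed tags that the service supplied.
        "recall": len(kept) / len(human) if human else 0.0,
    }


# Hypothetical sample: tags returned by a service vs. tags after human editing.
report = compare_tags(
    ["man", "tractor", "green", "dog"],
    ["man", "tractor", "green", "45 hp", "front loader"],
)
print(report)
```

A low precision suggests editors spend their time cleaning up errors; a low recall suggests the service is not supplying the detail your users search for.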

Approval workflows are common during content production, but they are rarely used for metadata editing. While everyone has an opinion about the way an image looks, people usually assume someone else will be able to tag it adequately.

But to ensure metadata accuracy across your content system, you must have some means for verifying (and potentially overriding) any automated metadata service, such as auto-tagging.

Ideally, any erroneous tags users remove, along with new ones they add, will be fed back to the auto-tagging service to help it “learn.” This not only makes the service more valuable for use with your specific content, but can also prevent the same wrong tags from being assigned in the future, which will save your users time.
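Whether a given service accepts corrections at all depends on the provider. Even without that, a content system can remember which tags editors keep removing and stop suggesting them. The sketch below assumes this local approach; the function names and the `max_rejections` threshold are hypothetical and not part of any vendor's API.

```python
from collections import Counter


def record_corrections(rejections, auto_tags, final_tags):
    """Count every automated tag that the editor removed from an asset."""
    removed = set(auto_tags) - set(final_tags)
    rejections.update(removed)
    return removed


def filter_suggestions(suggested_tags, rejections, max_rejections=2):
    """Drop tags that editors have already removed max_rejections times or more."""
    return [t for t in suggested_tags if rejections[t] < max_rejections]


rejections = Counter()
# Two editing sessions in which "toy" was wrongly assigned and then removed.
record_corrections(rejections, ["tractor", "toy"], ["tractor"])
record_corrections(rejections, ["lawn mower", "toy"], ["lawn mower"])

# The next time the service suggests "toy", it is filtered out before users see it.
suggestions = filter_suggestions(["tractor", "toy", "green"], rejections)
print(suggestions)
```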

If your system can flag content that has been auto-tagged, this might be all you need. A metadata editor finds the auto-tagged content, checks the tags for accuracy, makes changes as needed, and marks the record “verified,” or some other status that lets users know they can rely on the metadata therein.
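The flag-and-verify routine described above can be sketched in a few lines. This is only an illustration of the idea under assumed names: the `Asset` record, its `auto_tagged` flag and the "unverified"/"verified" status values are invented here, and a real content system would have its own fields for them.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    name: str
    tags: list = field(default_factory=list)
    auto_tagged: bool = False      # set when tags came from the tagging service
    status: str = "unverified"     # becomes "verified" after a human check


def pending_review(assets):
    """Find auto-tagged assets whose tags nobody has checked yet."""
    return [a for a in assets if a.auto_tagged and a.status != "verified"]


def verify(asset, corrected_tags):
    """Store the editor's corrected tags and mark the record as reliable."""
    asset.tags = list(corrected_tags)
    asset.status = "verified"


assets = [
    Asset("tractor-01.jpg", ["man", "tractor", "dog"], auto_tagged=True),
    Asset("brochure.pdf", ["brochure"], auto_tagged=False),
]

queue = pending_review(assets)          # only the auto-tagged image
verify(queue[0], ["man", "tractor"])    # editor removes the wrong tag
```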

Another time saver can be using a tagging service to subgroup content that a human subject matter expert (SME) can then describe further. For example, while the tagging service might not be able to identify specific product variations, it can likely tell the difference between your tractors and your lawn mowers. By doing some high-level categorization, you can assign each subgroup to the appropriate product manager or other SME for further details.
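As a rough illustration of that hand-off, the sketch below sorts assets into work queues based on the first high-level category tag the service returned. The mapping, the role names and the fallback queue are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical mapping from a high-level category tag to the responsible SME.
SME_BY_CATEGORY = {
    "tractor": "tractor product manager",
    "lawn mower": "lawn mower product manager",
}


def route_to_smes(assets_with_tags, sme_by_category, fallback="metadata editor"):
    """Group asset names by the SME responsible for their first matching category."""
    queues = defaultdict(list)
    for name, tags in assets_with_tags.items():
        # Use the first tag that names a known category, else the fallback queue.
        owner = next(
            (sme_by_category[t] for t in tags if t in sme_by_category),
            fallback,
        )
        queues[owner].append(name)
    return dict(queues)


queues = route_to_smes(
    {
        "img-001.jpg": ["green", "tractor", "field"],
        "img-002.jpg": ["lawn mower", "grass"],
        "img-003.jpg": ["man", "hat"],
    },
    SME_BY_CATEGORY,
)
print(queues)
```

Assets with no recognizable category land in the fallback queue, so nothing silently skips human review.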

As a rule, the more generic your content or tagging requirements are, the more value you can derive from an auto-tagging service. In addition, the larger your collections, the more enticing such a service might be.

But if your content system manages medical content for veterinarians, “puppy with red ball” won’t likely be a suitable tag to describe an image intended to illustrate a breed, stage of life or visible signs of disease.
