Metadata Basics

The word "metadata" means "data about data". Metadata articulates a context for objects of interest -- "resources" such as MP3 files, library books, or satellite images -- in the form of "resource descriptions". As a tradition, resource description dates back to the earliest archives and library catalogs. The modern "metadata" field that gave rise to Dublin Core and other recent standards emerged with the Web revolution of the mid-1990s.

Background

Early Dublin Core workshops popularized the idea of "core metadata" for simple and generic resource descriptions. The fifteen-element "Dublin Core" achieved wide dissemination as part of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) and has been ratified as IETF RFC 5013, ANSI/NISO Standard Z39.85-2007, and ISO Standard 15836:2009.
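As a rough illustration of what a simple, generic description built from the fifteen elements can look like, the following Python sketch serializes a few values as XML in the Dublin Core element namespace. The `record` container element, the `build_record` helper, and the sample values are assumptions made for this example; they are not part of any DCMI or OAI-PMH specification.

```python
import xml.etree.ElementTree as ET

# Namespace of the fifteen-element Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# The fifteen elements of the "Dublin Core".
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def build_record(values):
    """Serialize a dict of element name -> string value as a small XML record.

    The <record> wrapper is a hypothetical container chosen for this sketch.
    """
    unknown = sorted(set(values) - set(DC_ELEMENTS))
    if unknown:
        raise ValueError("not a Dublin Core element: " + ", ".join(unknown))
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")
    # Emit elements in the conventional order of the element set.
    for name in DC_ELEMENTS:
        if name in values:
            child = ET.SubElement(root, "{%s}%s" % (DC_NS, name))
            child.text = values[name]
    return ET.tostring(root, encoding="unicode")


# Sample values are invented for illustration.
xml_text = build_record({"title": "An Example Book", "creator": "A. N. Author"})
print(xml_text)
```

All fifteen elements are optional and repeatable in the original element set; this sketch allows at most one value per element only to stay short.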

Starting in 2000, the Dublin Core community focused on "application profiles" -- the idea that metadata records would use Dublin Core together with other specialized vocabularies to meet particular implementation requirements. During that time, the World Wide Web Consortium's work on a generic data model for metadata, the Resource Description Framework (RDF), was maturing. As part of an extended set of DCMI Metadata Terms, Dublin Core became one of the most popular vocabularies for use with RDF, more recently in the context of the Linked Data movement.

The consolidation of RDF motivated an effort to translate the mixed-vocabulary metadata style of the Dublin Core community into an RDF-compatible DCMI Abstract Model (2005). The DCMI Abstract Model was designed to bridge the modern paradigm of unbounded, linked data graphs with the more familiar paradigm of validatable metadata records like those used in OAI-PMH. A draft Description Set Profile specification defines a language for expressing constraints in a generic, application-independent way. The Singapore Framework for Dublin Core Application Profiles defines a set of descriptive components useful for documenting an application profile for maximum reusability.

Metadata Training Resources

For an overview of DCMI Webinars and Tutorials, given at Dublin Core Conferences and other events, please see the Metadata Training Resources page.

"Levels of interoperability"

From the perspective of the Dublin Core community, the metadata landscape is currently characterized in terms of four "levels" of interoperability:

Level 1 (Shared term definitions). At Level 1, interoperability among metadata-using applications is based on shared natural-language definitions. Within an application environment such as an intranet, library system, or repository federation, participants agree what terms to use in their metadata and how those terms are defined. Terms are hard-wired into applications using specific implementation technologies, and interoperability with "the rest of the world" outside of the implementation environment is generally not a priority. Most existing metadata applications currently operate at this level of interoperability.

Level 2 (Formal semantic interoperability). At Level 2, interoperability among metadata-using applications is based on the shared formal model provided by RDF, which is used to support Linked Data. As defined in Wikipedia, the term "Linked Data" describes "a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs [Web addresses] and RDF." The properties and classes of DCMI Metadata Terms have been defined for compatibility with Linked Data principles. In recent years, vast amounts of commercial and public-sector data have been added to a growing linked data cloud. Search engines such as Google, Yahoo, Bing, and Yandex and content-management platforms such as Drupal have implemented support for RDFa, a method for exposing linked data embedded in Web pages. In effect, the founding idea of Dublin Core -- "simple metadata for resource discovery" -- is being reinvented under the banner of "structured data for search engine optimization". Of the four interoperability levels, this one appears to be growing the fastest.
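To make the Level 2 idea more concrete, the sketch below describes one resource as RDF statements, using properties from the DCMI Metadata Terms namespace and writing them out in the N-Triples syntax. The resource URI, the literal values, and the `describe` helper are hypothetical, and the escaping covers only the most common cases. A real application would more likely rely on an RDF library than on string formatting.

```python
# Namespace URI for DCMI Metadata Terms.
DCTERMS = "http://purl.org/dc/terms/"


def describe(subject_uri, properties):
    """Return N-Triples lines describing one resource.

    `properties` maps DCMI term names (e.g. "title") to plain string values.
    """
    lines = []
    for term, value in properties.items():
        # Minimal escaping for a literal: backslash, double quote, newline.
        literal = (
            value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
        )
        lines.append('<%s> <%s%s> "%s" .' % (subject_uri, DCTERMS, term, literal))
    return lines


# Hypothetical resource and values, for illustration only.
triples = describe(
    "http://example.org/book/42",
    {"title": "An Example Book", "creator": "A. N. Author", "date": "2011"},
)
print("\n".join(triples))
```

Because both the resource and the properties are identified by URIs, statements like these can be merged with statements from other sources without prior agreement on a record format, which is the basis of the interoperability claimed for this level.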

Level 3 (Description Set syntactic interoperability) and Level 4 (Description Set Profile interoperability). At Level 3, applications are compatible with the Linked Data model and, in addition, share an abstract syntax for validatable metadata records, the "description set". At Level 4, the records exchanged among metadata-using applications follow, in addition, a common set of constraints, use the same vocabularies, and reflect a shared model of the world. Levels 3 and 4 are less common in practice than Levels 1 and 2 because they are less well supported by software tools, though the problems addressed in this work are expected to grow in importance as producers of metadata records move their information into a linked-data environment.
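The notion of a "common set of constraints" at Level 4 can be pictured with a much-simplified stand-in. The sketch below is not the Description Set Profile language; it only assumes a profile that states, per property, how many values a record may carry, and it treats any property outside the profile as an error. The `PROFILE` table, the `validate` function, and the sample record are all invented for this example.

```python
# Hypothetical profile: property URI -> (minimum, maximum) number of values.
# A maximum of None means "no upper bound".
PROFILE = {
    "http://purl.org/dc/terms/title": (1, 1),
    "http://purl.org/dc/terms/creator": (1, None),
    "http://purl.org/dc/terms/subject": (0, None),
}


def validate(record, profile):
    """Return a list of constraint violations; an empty list means conformance.

    `record` maps property URIs to lists of values.
    """
    problems = []
    for prop, (low, high) in profile.items():
        count = len(record.get(prop, []))
        if count < low:
            problems.append("%s: expected at least %d, found %d" % (prop, low, count))
        if high is not None and count > high:
            problems.append("%s: expected at most %d, found %d" % (prop, high, count))
    # Assumption of this sketch: properties not named in the profile are rejected.
    for prop in record:
        if prop not in profile:
            problems.append("%s: not allowed by the profile" % prop)
    return problems


# Sample record with a title but no creator.
record = {
    "http://purl.org/dc/terms/title": ["An Example Book"],
    "http://purl.org/dc/terms/creator": [],
}
problems = validate(record, PROFILE)
print(problems)
```

Keeping the constraints in data rather than in application code is what allows them to be expressed in a generic, application-independent way and shared among the parties exchanging records.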

To the reader: If you are evaluating implementation options, it is good to start by defining your requirements:

Next steps:

Interoperability Levels