Metadata 2020 Update: Project Groups Underway

Increasingly, we are asking metadata to do more than ever before. In the digital age, there is a growing view that content which cannot be discovered, linked or acquired electronically may as well not exist. Demands are increasing for content to become more interoperable, discoverable and machine-readable, and we face a parallel challenge: managing all aspects of the underlying metadata across content creators, aggregators and consumers.

Metadata 2020 is a collaboration that advocates richer, connected, and reusable, open metadata for all research outputs, which will advance scholarly pursuits for the benefit of society.

The Metadata 2020 initiative kicked off in 2017 as a set of industry communities discussing common challenges, and while its name implies that it’s a three-year project, the outcome of its efforts won’t stop there. Working towards a shared vocabulary, set of best practices and awareness for the greater good, this multifaceted effort is designed to facilitate communication between disparate communities. Because our scope is broad, the findings of Metadata 2020 are less about being prescriptive, and more about bridging gaps in understanding, technology and workflows that impede research, publishing or re-using content.

Last year’s community groups identified six key challenges to focus on. The 2018 project groups span the lifecycle of metadata: research, metadata elements and their definitions, understanding incentives for improving metadata, and best practices each group can follow to support the larger ecosystem. Overall, each project team shares a common overarching goal: educating people on why it’s important to care about and invest in rich metadata.

CCC’s services exist at the crossroads of numerous metadata uses, including content management, discovery, rights licensing, text and data mining, open access and content delivery. This broad experience allows us to bring a unique viewpoint to the Metadata 2020 initiative, and several of our staff members are participating in Metadata 2020’s project groups during 2018.

The project groups vary in size from a few people to two dozen and include volunteers from across the industry with varying backgrounds, expertise and motivations.

Highlights from CCC’s involvement in Metadata 2020’s project groups include:

Group 3: Defining the Terms We Use About Metadata

Elizabeth Wolf, Manager, Data Quality, Data Operations

I have always been interested in the intersection between perspectives. In the past, I’ve been involved in integrated projects where one team uses a term and another team assumes a completely different meaning, causing misaligned features or requirements, missed hand-offs, and delays. The more we work in the wide world of cross-functional teams and release trains, supporting a range of customers across many disciplines, the more critical it is that we recognize and address these challenges.

While the Defining the Terms group is closely aligned with others, our mission is to come up with clarifying terminology so that we can have more meaningful global discussions. To understand what should be delivered and why anyone should care, we need a common vocabulary. We are looking to facilitate communication about metadata within and between communities. Our 16 group members represent Service Providers/Platform & Tools, Publishers, Librarians, and Researchers.

At this point, we are surveying different user groups to assess what people talk about when they talk about metadata. We think our contribution is to disambiguate and illustrate what terms mean, independent of implementation. Our anticipated outcome is a glossary, which will be released along with Group 2’s mapping project.

Group 4: Incentives for Improving Metadata Quality

John Brucker, Metadata Librarian, Data Operations

As a metadata librarian, I found this project appealing because I think it can help address metadata inconsistency issues by helping the community understand the importance of metadata quality. I believe the community needs to commit resources towards creating and maintaining good metadata.

The mission of our group is to highlight downstream applications and the value of metadata for all parts of the community by telling real stories as evidence of how better metadata will support their goals.

Through my role at CCC, I can see that the quality of metadata we receive from our publishers can vary greatly. This is especially true for publication types other than books or journals, such as reports, websites, and standards.

The way I see it, this group will impact the industry by educating its members about why they should care about metadata. Examples would include use cases where high-quality metadata positively impacts revenue, discoverability, and user experience.

Group 6: Metadata Evaluation and Guidance

Stephen Howe, Product Manager, Platform Services, Product

I was immediately drawn to this project because it aligns directly with what CCC is doing today and what I am doing at CCC. We just implemented a new works management system to help us improve the quality of our data. One of our biggest challenges is understanding exactly how to measure the quality of works’ metadata and helping our data source partners understand and measure the quality of the data that they send us.

The stated mission for this project is, “To identify and compare existing metadata evaluation tools and mechanisms for connecting the results of those evaluations to clear, cross-community guidance.” To state that in my own words, the point of this group is to define a common approach or toolset with which anyone can measure and report on the quality of metadata. Quality here is defined as completeness, accuracy, and consistency.

If we are successful, we will have a better industry understanding of how to evaluate the quality of metadata and perhaps even a shared methodology or toolset with which to measure it.

Check back in November 2018 for the next update from CCC’s members of the Metadata 2020 team, reporting on the completion of the project groups.
