Handle with Care: Metadata in Scholarly Publishing

It’s readily apparent that metadata is an essential part of scholarly publishing. So why do we let so much of this treasured commodity slip through our fingers over the course of the publication process?

Each portion of the publication lifecycle requires important metadata, but not all of this information is carried all the way through the workflow. Instead, much of it remains in the isolated silos in which it’s collected. Inera’s CEO Bruce Rosenblum notes, “There’s just form after form after form of metadata collected [in submission systems] and it’s amazing how little of that makes it through to the final XML or beyond.”

For example, did you know that ORCID iDs (persistent author identifiers) often don’t make it out of the submission system? And that when publishers produce XML from manuscripts, the Ringgold IDs collected at submission for author affiliations are often lost, effectively expunging hugely important data from publisher records?

Yet the lack of synchronization across publication phases, and the resulting loss of this important metadata, persists. Ringgold’s North American Sales Director, Christine Orr, comments, “It negatively impacts all kinds of things downstream, and results in a lack of discoverability, lack of [interoperability] between other systems, and the inability to really, truly analyze your author base.” It also leaves the publication workflow rife with inaccuracies. Rosenblum adds, “If it’s not automatically integrated into the workflow, then it’s a much more manual process, and hence a potentially inaccurate process.”

This information matters to both publishers and funders. Having unrestricted access to the complete set of metadata collected throughout the publication lifecycle would mean far better information about not only authors but also grant appropriation. It would enable better business analysis by publishers and funders alike, and would help all stakeholders identify trends in areas like open access, measure the impact of funding, and make more informed decisions. Rosenblum notes, “Publishers need to understand there’s a huge value in integrated metadata. And by integrated, I mean that it’s shareable across systems.”

So what are we, the scholarly publishing community, waiting for? We need to begin by handling our existing metadata with care. We need to invest in building out metadata-handling processes, holistically and systematically, within our own organizations to prepare for additional standards on the horizon. Finally, we need commitment from stakeholders across the scholarly publishing industry to use the standard identifiers that are lost most often: namely, grant IDs, funder names, and author and co-author affiliation IDs.

Let’s continue the conversation at this year’s SSP Meeting in Chicago. Join me and fellow industry experts (listed below) as we analyze the research workflow, identify gaps, and discuss pragmatic ways we can work together to make the publication workflow more seamless and beneficial for all stakeholders.

Hope to see you in Chicago.

SSP Session Information:

Session 1D
The Gift That Keeps on Giving: Metadata & Persistent Identifiers Through the Research and Publication Cycle

Thursday, May 31 at 10:30AM
Virtual Session

Christine Orr, Ringgold
Bruce Rosenblum, Inera
Sarah Whalen, AAAS
Mary Seligy, Canadian Science Publishing
Howard Ratner, Chorus
Jennifer Goodrich, Copyright Clearance Center