SHARE News

Rick’s MetaTips | 20 November 2015

Rick’s MetaTips: SHARE Metadata Is Stitching Together the Research Life Cycle

Rick Johnson, image courtesy of University of Notre Dame Hesburgh Libraries

When repository managers or others ask SHARE about becoming SHARE Notify data providers, we often hear a few simple questions:

1. What metadata fields are most important for SHARE?

2. Do I need to write software to plug into SHARE?

3. If I am using OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting), how does that map to SHARE?

4. How do I get started with SHARE?

To begin to answer these questions and others, let’s start with how metadata relates to SHARE’s mission of making research more accessible, discoverable, and reusable as well as tracking research across its life cycle to show its impact.

What metadata fields are most important for SHARE and why?

SHARE’s schema may seem complex at first glance. However, it can be broken down into a few basic groups of metadata: bibliographic (e.g., title, description, subjects), contributor, license, sponsorship, and organization. Furthermore, SHARE requires only four pieces of metadata: title, contributor(s), uniform resource identifier (URI), and creation date. These four elements are necessary for SHARE to fulfill its primary mission of expanding access to research. However, to allow SHARE to stitch together research life cycle events for a given institution, researcher, or research project, it is the other, associated information that is vital.

The associated metadata elements for a person or work are often what provide the extra information that highlights and disambiguates the work of an individual institution, researcher, or project. For example, in addition to a contributor’s given name and family name, fields like institutional affiliation, unique researcher identifiers such as ORCID IDs, digital object identifiers (DOIs), sponsor, and award metadata are critical to connecting related work to create a mosaic of a single research project or research conducted by a particular researcher or institution.

In my previous article, “It’s All about the Links,” I described the importance of link durability and consistency. Equally important is the consistent use of these uniquely identifying metadata fields across different systems as shared reference points. This is why connecting more and more ORCID, institution, and grant IDs to data within SHARE is one of many key strategies for the SHARE project moving forward.

Let’s jump into an example (many fields omitted for the sake of brevity):

{
    "providerUpdatedDateTime": "2015-11-13",
    "contributor": [
        {
            "name": "Richard Johnson",
            "sameAs": [
                "http://orcid.org/0000-0002-1550-6325"
            ],
            "familyName": "Johnson",
            "givenName": "Richard",
            "additionalName": "Patrick",
            "email": "rick.johnson@nd.edu",
            "affiliation": [
                {
                    "name": "University of Notre Dame"
                }
            ]
        }
    ],
    "description": "An overview of metadata elements and how they relate to SHARE.",
    "uris": {
        "canonicalUri": "http://doi.org/myDOI",
        "providerUris": [
            "http://doi.org/myDOI"
        ]
    },
    "title": "SHARE Metadata Is Stitching Together the Research Life Cycle"
}

This is a fairly complete example of a contributor metadata set. Elements such as an ORCID ID (sameAs) and institutional affiliation (affiliation:name), together with the standard name fields and creation date (providerUpdatedDateTime), provide the associated metadata needed to link both researcher- and organization-related work. Additionally, a DOI (uris:canonicalUri) for the work being described makes it possible to connect the work to other related works that reference that DOI.
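To make the distinction between required and linking metadata concrete, here is a minimal Python sketch that checks a record shaped like the example above for the four required elements and collects the identifiers that support linking. The function names `missing_required` and `linking_identifiers` are illustrative assumptions, not part of any SHARE tooling, and the key names simply follow the example record.

```python
import json

# Record shaped like the example above (key names copied from it).
RECORD_JSON = """
{
    "providerUpdatedDateTime": "2015-11-13",
    "contributor": [
        {
            "name": "Richard Johnson",
            "sameAs": ["http://orcid.org/0000-0002-1550-6325"],
            "affiliation": [{"name": "University of Notre Dame"}]
        }
    ],
    "uris": {"canonicalUri": "http://doi.org/myDOI"},
    "title": "SHARE Metadata Is Stitching Together the Research Life Cycle"
}
"""


def missing_required(record):
    """Return the names of the four required elements that are absent or empty."""
    checks = {
        "title": record.get("title"),
        "contributor": record.get("contributor"),
        "uris.canonicalUri": record.get("uris", {}).get("canonicalUri"),
        "providerUpdatedDateTime": record.get("providerUpdatedDateTime"),
    }
    return [name for name, value in checks.items() if not value]


def linking_identifiers(record):
    """Collect the optional identifiers that let related records be connected."""
    orcids, affiliations = [], []
    for person in record.get("contributor", []):
        orcids += [u for u in person.get("sameAs", []) if "orcid.org/" in u]
        affiliations += [a["name"] for a in person.get("affiliation", []) if "name" in a]
    return {"orcids": orcids, "affiliations": affiliations}


record = json.loads(RECORD_JSON)
print(missing_required(record))     # names of any required elements still missing
print(linking_identifiers(record))  # ORCID IDs and affiliations found on contributors
```

A record that passes the first check can be accepted; the second function shows how much additional linking information it carries.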

OK, that makes sense about SHARE’s preferred metadata, but how do I map my metadata schema to SHARE (e.g., from OAI-PMH)?

Because many repositories provide harvestable metadata as Dublin Core fields via OAI-PMH, SHARE already has a standard set of mappings worked out for Dublin Core:

SHARE Fields | Dublin Core | Notes
contributors | dc:creator or dc:contributor | creator(s), contributor(s)
uris:canonicalUri | dc:doi or dc:identifier | unique links or identifiers
providerUpdatedDateTime | ns0:header/ns0:datestamp or dc:dateEntry | creation or updated date
title | dc:title | title
description | dc:description | description
subjects | dc:subject | subject(s)
publisher:name | dc:publisher | publisher name
languages | dc:language | language(s)
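As a rough illustration of the mapping above, the following Python sketch converts one OAI-PMH record carrying Dublin Core into a SHARE-style dictionary. It is not the harvester the SHARE team runs: the sample record, the `to_share` function, and the choice to take the first dc:identifier as the canonical URI are all assumptions made for the example. Only the two namespace URIs are the standard OAI-PMH and Dublin Core ones.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up sample record for illustration only.
SAMPLE = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:example.org:123</identifier>
    <datestamp>2015-11-13</datestamp>
  </header>
  <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>A Sample Title</dc:title>
      <dc:creator>Johnson, Richard</dc:creator>
      <dc:contributor>Doe, Jane</dc:contributor>
      <dc:identifier>http://example.org/items/123</dc:identifier>
      <dc:subject>metadata</dc:subject>
      <dc:publisher>Example University</dc:publisher>
      <dc:language>en</dc:language>
    </oai_dc:dc>
  </metadata>
</record>"""


def texts(root, tag):
    """Return the stripped text of every dc:<tag> element in the record."""
    return [e.text.strip() for e in root.iterfind(f".//dc:{tag}", NS) if e.text]


def first(values):
    """Return the first value of a list, or None when the list is empty."""
    return values[0] if values else None


def to_share(xml_text):
    """Apply the table's Dublin Core mapping to a single OAI-PMH record."""
    root = ET.fromstring(xml_text)
    names = texts(root, "creator") + texts(root, "contributor")
    return {
        "title": first(texts(root, "title")),
        "contributors": [{"name": n} for n in names],
        # Assumption: the first dc:identifier is treated as the canonical URI.
        "uris": {"canonicalUri": first(texts(root, "identifier"))},
        "providerUpdatedDateTime": root.findtext("oai:header/oai:datestamp", namespaces=NS),
        "description": first(texts(root, "description")),
        "subjects": texts(root, "subject"),
        "publisher": {"name": first(texts(root, "publisher"))},
        "languages": texts(root, "language"),
    }


share = to_share(SAMPLE)
print(share["title"], share["providerUpdatedDateTime"])
```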

This default OAI mapping schema covers the standard bibliographic and contributor information. For an example of supplying sponsorship information, see the SHARE mapping developed for Department of Energy (DOE) metadata, which adds the following field mapping:

SHARE Fields | Dublin Core | Notes
sponsorships:sponsor:sponsorName | dcq:publisherSponsor | sponsor name

Within the DOE metadata mapping schema there are also examples, in the table below, of how any metadata that does not immediately match SHARE’s schema can be received. The SHARE team will then map any such metadata to the “otherProperties” field within the schema.

*Note: Fields listed in the table as otherProperties are not standard fields for SHARE, and are merely examples of arbitrary fields that could be created per record.

SHARE Fields | Dublin Core | Notes
otherProperties:coverage | dc:coverage | geographic reference or other spatial data
otherProperties:publisherAvailability | dcq:publisherAvailability | publication date
otherProperties:publisherCountry | dcq:publisherCountry | country of publication
otherProperties:format | dc:format | data or file format (i.e., MIME type)
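The idea of parking unmatched fields under otherProperties can be sketched in a few lines of Python. This is only an illustration: the `split_fields` function is hypothetical, the `KNOWN` table lists just a handful of the standard mappings, and the exact shape SHARE uses for otherProperties is not shown in this article, so a plain nested dictionary is assumed.

```python
# Source-field -> SHARE-field pairs taken from the standard mapping (partial list).
KNOWN = {
    "dc:title": "title",
    "dc:description": "description",
    "dc:subject": "subjects",
    "dc:language": "languages",
}


def split_fields(source):
    """Map known fields directly; park everything else under otherProperties."""
    record = {"otherProperties": {}}
    for key, value in source.items():
        if key in KNOWN:
            record[KNOWN[key]] = value
        else:
            # Drop the namespace prefix, e.g. "dcq:publisherCountry" -> "publisherCountry".
            record["otherProperties"][key.split(":", 1)[-1]] = value
    return record


result = split_fields({
    "dc:title": "A Sample Title",
    "dc:coverage": "Indiana",
    "dcq:publisherCountry": "US",
})
print(result)
```

Nothing from the source record is discarded; fields without a home in the schema are simply carried along per record.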

Do I need to write software or do any coding to get plugged into SHARE?

The quick answer is no. While it may appear on the surface that some coding is necessary to map your elements to the ones described above, a team of developers on the SHARE project through the Center for Open Science will do this work for you. They will create a harvester that points to your repository and periodically pulls updated information. In many cases where OAI-PMH is enabled, it takes little effort on their part to adapt an existing harvester for your repository. If more effort is needed, the SHARE team will do the extra work to adapt a harvester to your repository schema. Getting started really is as simple as submitting a request at https://osf.io/share/registration/ or sending a note to share-support@osf.io.

What are developing areas of metadata related to SHARE?

ORCID

As already mentioned, ORCID presents a unique opportunity to help SHARE connect related life cycle events for single research efforts, as well as the wider activity of researchers. In turn, the SHARE team has been exploring some exciting work to enhance the SHARE data set (i.e., SHARE’s record set of research events) with links to ORCID IDs. This work has focused on two strategies: querying the ORCID application programming interface (API) for potential matches, and looking at ways for people to identify records to link to ORCID IDs. Because the ORCID API work has produced inconclusive matches in many cases, and the manual curation of metadata is a large task in and of itself, the SHARE operations team is currently discussing ways to combine these two approaches. The SHARE team sees an opportunity to have the SHARE data set automatically present possible matches that are then confirmed by a metadata expert. This solution would both amplify the benefit of SHARE itself (i.e., the benefit of connecting research events) and demonstrate a case where neither option by itself would come close to a solution. We look forward to seeing how this develops.
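The combined approach described above, automated proposals followed by expert confirmation, might look roughly like the following Python sketch. It does not call the ORCID API. `CANDIDATES` is a hypothetical stand-in for search results, the second ORCID ID is made up, and name similarity via the standard library's `difflib` is just one simple scoring choice assumed for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical candidate profiles, standing in for results of an ORCID search.
CANDIDATES = [
    {"orcid": "0000-0002-1550-6325", "name": "Richard Johnson"},
    {"orcid": "0000-0000-0000-0001", "name": "Rachel Johnston"},  # made-up ID
]


def score(a, b):
    """Case-insensitive similarity between two names, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def propose_matches(contributor_name, candidates, threshold=0.8):
    """Automated step: rank candidates by name similarity, keep the plausible ones."""
    scored = [(score(contributor_name, c["name"]), c) for c in candidates]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [c for s, c in ranked if s >= threshold]


def confirm(proposals, approve):
    """Human step: return the first ORCID ID the reviewer approves, else None."""
    for candidate in proposals:
        if approve(candidate):
            return candidate["orcid"]
    return None


proposals = propose_matches("Richard Johnson", CANDIDATES)
# Here the reviewer's decision is simulated with a simple callback.
print(confirm(proposals, lambda c: c["name"] == "Richard Johnson"))
```

The automated step narrows a large task to a short list, and the reviewer only has to accept or reject each proposal.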

Community Efforts

As you may have guessed, the strategy outlined above for SHARE data providers still allows for a lot of variation in metadata schema across repository systems. Because of this, there are several international community efforts underway to develop greater alignment around the use of metadata in communities like the Confederation of Open Access Repositories (COAR), Consortia Advancing Standards in Research Administration Information (CASRAI), Digital Public Library of America (DPLA), Europeana, LA Referencia, OpenAIRE, Research Data Alliance (RDA), and SHARE.

COAR-CASRAI Working Group

The COAR-CASRAI Working Group, in which SHARE participates, is one such effort. It involves many of the communities just mentioned and is working towards common metadata sets, focused initially on two main use cases:

Use Case 1

Tracking Open Access Research Outputs. As a funding agency participating in internationally co-funded projects, I want to track all publications resulting from my funded research and to identify which ones are openly available.

  • Funding agency ID/name
  • Project/grant ID
  • Title
  • Author names/IDs
  • Output type
  • Publication date
  • Open access status
    • accessibility
    • embargo
    • license
  • Unique identifier for publication (e.g. DOI)
  • Jurisdiction information (funder country)
  • Source or provenance (repository identifiers, commercial providers, organization, or registry name, other)

Use Case 2

As a repository manager I want to know when a paper by an author from my institution has been deposited at another repository so that I can either make a copy for my repository or contact the author to ask them to deposit a version locally.

  • Institution ID/name
  • Title
  • Author names/IDs
  • Output type
  • Publication date
  • Open access status (rights/embargo, license)
  • Unique identifier for publication (e.g. DOI)
  • Funder country (jurisdiction information)
  • Source or provenance (repository identifiers, commercial providers, organization, or registry name, other)

The groups are initially defining mapping of existing elements to these fields while working towards recommendations to cover all desired metadata elements.
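To show how the fields listed for the two use cases could answer the questions they pose, here is a small Python sketch. The `Output` class, its attribute names, and the sample data are assumptions made for illustration; they are not the working group's schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Output:
    """A research output carrying a subset of the fields from the use cases."""
    title: str
    author_institution_ids: List[str]
    source_repository: str
    funder_id: Optional[str] = None
    open_access: bool = False
    identifier: Optional[str] = None  # e.g. a DOI


def funded_outputs(outputs, funder_id):
    """Use Case 1: outputs funded by funder_id, split by open access status."""
    mine = [o for o in outputs if o.funder_id == funder_id]
    return {
        "open": [o.title for o in mine if o.open_access],
        "closed": [o.title for o in mine if not o.open_access],
    }


def deposited_elsewhere(outputs, institution_id, my_repository):
    """Use Case 2: outputs by my institution's authors held in other repositories."""
    return [
        o.title
        for o in outputs
        if institution_id in o.author_institution_ids
        and o.source_repository != my_repository
    ]


# Made-up sample data.
OUTPUTS = [
    Output("Paper A", ["inst-1"], "repo-x", funder_id="funder-9", open_access=True),
    Output("Paper B", ["inst-1", "inst-2"], "repo-y", funder_id="funder-9"),
    Output("Paper C", ["inst-2"], "repo-y"),
]

print(funded_outputs(OUTPUTS, "funder-9"))
print(deposited_elsewhere(OUTPUTS, "inst-1", "repo-x"))
```

Both queries depend on funder, institution, and repository identifiers being recorded consistently across systems, which is the alignment these community efforts aim for.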

Other Areas to Watch

One other major area that consistently creates work for any repository community is determining, tracking, and maintaining an understanding of rights related to digital objects, from the perspective of content creators/owners as well as the content stewards, such as community or institutional repositories. There has been some movement in the community around developing common rights statements, including recently published Recommendations for Standardized International Rights Statements outlined by DPLA, Europeana, and Creative Commons.

OpenAIRE has also developed a set of semantic terms, controlled vocabularies, and identifiers that, when used in addition to common metadata elements, increase the ability to correlate similar values across systems.

Now that more of these standards are emerging, their impact will likely become clear as they come into wider practice.

So, how do I get started as a SHARE provider?

Simply complete the brief registration process at https://osf.io/share/registration/ or send a note to share-support@osf.io, and someone will contact you to go through the necessary steps for SHARE to start harvesting from your repository. In many cases, this process will be less about SHARE being able to harvest your data and more about confirming the metadata is cleared to be harvested by SHARE. By adding your repository’s metadata to the SHARE data set and following these simple guidelines, you will make your institution’s research more accessible, discoverable, and reusable, as well as more trackable across its life cycle.

By Rick Johnson

rick.johnson@nd.edu
Tags metadata, SHARE data providers
All content is © copyright SHARE and available under a CC-BY 4.0 license.
