DOI Implementation Procedures

As approved by the EOL Management Committee Nov 14, 2014

In an effort to lead our community to follow modern data citation practices [F, G, H, I, J, K], EOL has developed a strategy to implement Digital Object Identifiers (DOIs) for EOL-hosted Lower Atmosphere Observing Facilities (LAOF) and EOL-provided datasets. DOIs will specifically help EOL track the use of LAOF in support of field campaigns and, more importantly, the datasets collected during those projects as they are used in publications and derivative works. DOIs have the added benefit of providing recognition to the authors whose creativity and work lead to high-quality datasets. This document details the strategy being pursued by EOL to implement DOIs and the schema (i.e., metadata fields) required. It also outlines how DOIs are used within EOL to enable tracking of metrics.

Assignment of DOIs to Lower Atmosphere Observing Facilities (LAOF)

EOL has assigned DOIs to NCAR-managed Lower Atmosphere Observing Facilities. LAOF are NSF-funded facilities that can be requested by the community and are supported by the NSF Deployment Pool [A]. EOL has also assigned a DOI to the NCAR Field Catalog, which is routinely requested in support of field campaigns. At this time, DOIs will not be assigned to any other EOL instruments or platforms.

List of DOI-assigned LAOF and Services (Status as of August 2015)

  • NSF/NCAR GV (HIAPER)
  • NSF/NCAR C-130
  • NCAR S-band/Ka-band Dual Polarization, Dual Wavelength Doppler Radar (S-PolKa)
  • NCAR HIAPER Cloud Radar (HCR)
  • NCAR High Spectral Resolution Lidar (HSRL)
  • NCAR Integrated Sounding System (ISS)
  • NCAR Integrated Surface Flux System (ISFS)
  • NCAR GPS Advanced Upper-air Sounding System (GAUS)
  • NCAR Airborne Vertical Atmospheric Profiling System (AVAPS)
  • NCAR/EOL HAIS Instruments (14 separate DOIs)
    • Note: primary authorship for HAIS datasets is given to the HAIS PI, not EOL
    • 3V-CPI
    • Autonomous Airborne Ozone Photometer
    • AWAS
    • FO3
    • HARP
    • HSRL
    • GISMOS
    • GT CIMS
    • MTP
    • QCLS
    • SID-II
    • ToF-AMS
    • TOGA
    • VCSEL
  • NCAR Field Catalog

Although the following platforms and instruments are also part of the NSF LAOF, they are not hosted by NCAR and are not included in the EOL DOI Implementation Strategy:

  • University of Wyoming King Air (UWKA)
  • Wyoming Cloud Radar (WCR)
  • Wyoming Cloud Lidar (WCL)
  • Center for Severe Weather Research Doppler on Wheels (DOW)
  • Center for Severe Weather Research Pods
  • Center for Severe Weather Research Mobile Mesonets (MM)

Assignment of DOIs to EOL Datasets

A DOI will be assigned to every final EOL dataset that is entered into the EOL Metadata Database and Cyberinfrastructure (EMDAC) system once an internal metadata review has been completed. DOIs are assigned to each dataset before public release and only to final datasets. A DOI remains in effect for the lifetime of a dataset, including any revisions due to reprocessing or recalibration. In case of a revision, a new version number will be assigned and included in the citation. All data files will be maintained within the EMDAC system; however, older versions will be hidden from viewing and ordering.
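The versioning rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the EMDAC implementation: the class names, the `revise` method, and the DOI suffix `EXAMPLE` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetVersion:
    number: str
    visible: bool = True  # older versions are hidden from viewing and ordering


@dataclass
class Dataset:
    doi: str  # assigned once and kept for the lifetime of the dataset
    title: str
    versions: List[DatasetVersion] = field(default_factory=list)

    def revise(self, number: str) -> None:
        """Add a new version: the DOI is untouched, older versions are hidden."""
        for old in self.versions:
            old.visible = False
        self.versions.append(DatasetVersion(number))

    def citation_version(self) -> str:
        """Version number to include in the citation (the latest one)."""
        return self.versions[-1].number


# Hypothetical dataset: the DOI suffix and version numbers are placeholders.
dataset = Dataset(doi="10.5065/EXAMPLE", title="Example dataset")
dataset.revise("1.0")  # initial final release
dataset.revise("2.0")  # reprocessed or recalibrated release
```

After the second call to `revise`, the DOI is unchanged, the citation carries version 2.0, and version 1.0 is retained but no longer visible.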

An example citation will be created automatically by EMDAC and displayed on each dataset description page along with a link to interactively reformat the data citation based on a specific journal requirement (e.g., AMS, AGU, BAMS). The citation can be copied and pasted and should be used in publications and posters. This approach complies with NCAR Technical Note NCAR/TN-492+STR [B], which describes procedures for creating DOIs within NCAR.

EOL is committed to assigning DOIs to all datasets collected by EOL-managed LAOF and instruments during field campaigns that were carried out in FY 2005 and later. While EOL will assign DOIs to earlier datasets when they are requested, it is not feasible to implement DOIs for all existing EOL datasets at this time. Given the effort involved in completing metadata, including researching authors, this is especially true for past datasets that are hosted by EOL but were collected by PI-provided instrumentation. 

Schema/Metadata fields

The goal of a DOI is to provide a permanent digital identifier, replacing the more commonly used URL citations, which can change routinely. The EMDAC database provides searchable citations that include DOI identifiers for EOL-hosted data, allowing mining of EOL DOIs to find LAOF (S-PolKa, GV, HSRL, etc.), field project IDs, instrument configurations, and software used. Since EOL DOIs are all registered with DataCite, they can also be mined through the DataCite database.

For EOL-assigned DOIs, users will be able to follow the DOI from the DataCite or EMDAC database to the EOL LAOF or EMDAC dataset landing page. There they will get information related to the instrument used, its configuration, operating parameters, filtering methods, software versioning, the associated field project, the responsible lead person, and quality control procedures.

The content associated with each DOI is predetermined by the DataCite Metadata Schema [C], which defines a set of mandatory properties, i.e., metadata, that are necessary to create a DOI.
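As a rough illustration of what "mandatory" means in practice, the sketch below checks a candidate record for the five properties that version 3.0 of the DataCite schema requires, as we understand it (Identifier, Creator, Title, Publisher, PublicationYear). The dictionary layout, the function name `missing_mandatory`, and the DOI suffix `EXAMPLE` are assumptions made for this example; they are not part of EMDAC or DataCite.

```python
# Mandatory properties in DataCite Metadata Schema 3.0 (assumed key spelling).
MANDATORY = ("identifier", "creator", "title", "publisher", "publicationYear")


def missing_mandatory(record):
    """Return the mandatory property names that are absent or empty."""
    return [name for name in MANDATORY if not record.get(name)]


# Hypothetical dataset record; the DOI suffix is a placeholder.
record = {
    "identifier": "10.5065/EXAMPLE",
    "creator": "UCAR/NCAR - Earth Observing Laboratory",
    "title": "Low Rate Navigation, State Parameter, and Microphysics Flight-Level Data",
    "publisher": "UCAR/NCAR - Earth Observing Laboratory",
    # publicationYear deliberately left out to show the check
}

print(missing_mandatory(record))
```

A record would only be submitted for DOI creation once this list comes back empty.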

An NCAR-wide committee met in May 2014 and laid out how the DataCite metadata schema can and should be used for NCAR data [D]. The following recommendations conform to these NCAR guidelines.

Please note that for each metadata field, we list the metadata to be used for EOL-hosted LAOF and EOL-hosted datasets separately.

Creator/Author [E]

LAOF: UCAR/NCAR - Earth Observing Laboratory (from UCAR committee)

Datasets: UCAR/NCAR - Earth Observing Laboratory (from EMC)

LAOF and Dataset Example: “UCAR/NCAR - Earth Observing Laboratory”

Primary authorship is given to the PI and co-PIs when they are from another institution and were instrumental in the development.

 

Title

LAOF: Facility name (names given above)

Datasets: Read directly from EMDAC dataset title

LAOF Example: "NSF/NCAR C-130"

Dataset Example: “Low Rate Navigation, State Parameter, and Microphysics Flight-Level Data”

Publisher

LAOF: UCAR/NCAR - Earth Observing Laboratory

Datasets: UCAR/NCAR - Earth Observing Laboratory

LAOF and Dataset Example: “UCAR/NCAR - Earth Observing Laboratory”

Publication Year 

LAOF: Year the facility became available (in square brackets following names above)

Datasets: The year in which the DOI is created.

Landing Pages 

LAOF: Facility landing page in Drupal

Datasets: EMDAC dataset landing page

LAOF Example: https://www.eol.ucar.edu/instrumentation/aircraft/C-130

Dataset Example: http://data.eol.ucar.edu/codiac/dss/id=232.001

Subject 

LAOF: ISO keywords, GCMD keywords, if applicable

Dataset: ISO keywords, keywords from codiac (category, platform, project name)

Contributor 

LAOF: Facility listed as ResearchGroup, NSF listed as Funder

Dataset: all other contacts we have for a dataset, other than Author and internal contact. For example, instrument PI, scientist requesting instrument, person who performed data processing and quality assurance. Facility is listed as HostingInstitution.

Contributor type must be listed in square brackets after the contributor name. The comprehensive list of possible values for contributor type is: ContactPerson, DataCollector, DataManager, Distributor, Editor, Funder, HostingInstitution, Producer, ProjectLeader, ProjectManager, ProjectMember, RegistrationAgency, RegistrationAuthority, RelatedPerson, Researcher, ResearchGroup, RightsHolder, Sponsor, Supervisor, WorkPackageLeader, Other

Example: “University Corporation For Atmospheric Research (UCAR):National Center for Atmospheric Research (NCAR):Earth Observing Laboratory (EOL):Data Management & Services Facility (DMS) [HostingInstitution]”
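The bracket convention above is simple enough to check mechanically. The following sketch formats and parses a contributor string and rejects contributor types outside the list given above. The function names are hypothetical and the code is not taken from EMDAC; `RegistrationAuthority` is spelled with a leading capital, as in the DataCite schema.

```python
# Contributor types as listed in this document.
CONTRIBUTOR_TYPES = frozenset({
    "ContactPerson", "DataCollector", "DataManager", "Distributor", "Editor",
    "Funder", "HostingInstitution", "Producer", "ProjectLeader",
    "ProjectManager", "ProjectMember", "RegistrationAgency",
    "RegistrationAuthority", "RelatedPerson", "Researcher", "ResearchGroup",
    "RightsHolder", "Sponsor", "Supervisor", "WorkPackageLeader", "Other",
})


def format_contributor(name, contributor_type):
    """Return 'name [type]', refusing types that are not in the list."""
    if contributor_type not in CONTRIBUTOR_TYPES:
        raise ValueError(f"unknown contributor type: {contributor_type!r}")
    return f"{name} [{contributor_type}]"


def parse_contributor(text):
    """Split 'name [type]' back into (name, type), validating the type."""
    name, sep, rest = text.strip().rpartition(" [")
    if not sep or not rest.endswith("]"):
        raise ValueError(f"no bracketed contributor type in: {text!r}")
    contributor_type = rest[:-1]
    if contributor_type not in CONTRIBUTOR_TYPES:
        raise ValueError(f"unknown contributor type: {contributor_type!r}")
    return name, contributor_type


print(format_contributor("NSF", "Funder"))
```

Splitting on the last " [" means a contributor name may itself contain parentheses or colons, as in the hosting-institution example above.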

Date

LAOF: N/A

Dataset: data begin and end dates

Language 

LAOF: English

Dataset: English

ResourceType

LAOF: PhysicalObject

Dataset: Dataset

AlternateIdentifier 

LAOF: Manufacturer or serial number. For aircraft use tail number

Dataset: local archive id (codiac id)

RelatedIdentifier 

LAOF: N/A

Dataset: documentation, xlinks (can be related DOIs), could put link to project page

Size 

LAOF: N/A

Dataset: size of dataset from codiac

Format

LAOF: N/A

Dataset: format of data files from codiac

Version

LAOF: generally N/A

Dataset: from codiac version field

Rights

LAOF: decided by UCAR DOI committee

Dataset: defaults in codiac

Description

LAOF: cull from facility landing page

Dataset: dataset summary

LAOF Example: https://www.eol.ucar.edu/instrumentation/aircraft/C-130

Dataset Example: http://data.eol.ucar.edu/codiac/dss/id=232.001

 
GeoLocation 

LAOF: N/A for aircraft, could put site of fixed site facilities

Dataset: codiac bounding box

Additional information about EOL data are stored at EOL in the EMDAC data system, the project websites, and the instrumentation pages.

Metrics

Metrics can be mined starting with the dataset, the LAOF, or the EOL project pages. An author citing a dataset references the LAOF used to collect the data via the dataset metadata. Conversely, for a survey paper, a reference to the LAOF deployed, or the project page, will reference every single DOI within the project. To reference a project, one can use a standard web page reference to the EOL project page.

A list of every field project an instrument was deployed for is available via EMDAC (e.g. http://data.eol.ucar.edu/codiac/dss/plat=350 for platforms) or the Drupal instrument pages (e.g. https://www.eol.ucar.edu/instruments/microwave-temperature-profiler - click on the data tab).

It is important to note that it is impossible to predict every scenario. Non-conforming datasets are handled on a case-by-case basis as they arise. While the DOI itself is expected to be persistent (permanent), its metadata can be changed, and lessons will be learned as we proceed.

References

  A. Request Lower Atmosphere Observing Facilities. Accessed Nov 2014. Available online from

    https://www.eol.ucar.edu/deployment/request-info/forms/request-forms-for-nsf-lower-atmospheric-observing-facilities

  B. NCAR Technical Note NCAR/TN-492+STR. Available online from

    http://dx.doi.org/10.5065/D6ZC80VN

  C. DataCite schema V3.0. Accessed Nov 2014. Available online from

    http://schema.datacite.org/meta/kernel-3/doc/DataCite-MetadataKernel_v3.0.pdf

  D. NCAR’s implementation doc for the DataCite schema. Available online from

    https://drive.google.com/a/ucar.edu/?tab=mo#folders/0BzpAHQJfWJ0TWFpnUmtoYUNvcGc

  E. EMDAC DOI Creator/Contributor implementation details. Accessed Nov 2014. Available on intranet at http://data.dev.eol.ucar.edu/doi-cons.htm

  F. Reproducible Research: Addressing the Need for Data and Code Sharing in Computational Science, 17 June 2010. Accessed Nov 2014. Available online at http://www.stanford.edu/~vcs/Conferences/RoundtableNov212009/RoundtableOutputDeclaration.pdf

  G. Data Citation Guidelines for Data Providers and Archives, 1 Mar 2012. Available online at http://commons.esipfed.org/node/308

    1. These were used as the basis for the UCAR/NCAR Technical Note (Reference B in this list)

    2. AGU gives these as suggested guidelines to follow the OSTP public access memo "Increasing Access to the Results of Federally Funded Scientific Research", 22 Feb 2013. Available online at https://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf

  H. OSTI, http://www.osti.gov/

  I. DRAFT NOAA Data Citation Procedural Directive, Private Communication [Now available online at https://www.nosc.noaa.gov/EDMC/PD.DC.php]

  J. Matthew S. Mayernik, Mohan K. Ramamurthy, and Robert M. Rauber, 2015: Data Archiving and Citation with AMS Journals. DRAFT 4 [Now available online at J. Atmos. Sci., 72, 1281-1282, doi: http://dx.doi.org/10.1175/2015JAS2222.1]

  K. Federation of Earth Science Information Partners (ESIP) data citation recommendations, 19 Sep 2012. Available online from http://bit.ly/data_citation

Appendix A | Examples from across the community

General information on DOIs:

  • http://www.doi.org/factsheets/DOIKeyFacts.html
  • http://ezid.cdlib.org/home/understanding
  • Carl Drews: How to Archive Data with a Journal Publication
  • https://github.com/blog/1840-improving-github-for-science
  • GitHub repositories can now be assigned DOIs, so data and code can be cited. GitHub also provides a free plan with private repos for individual researchers and groups

Example citations can be seen at the DOI Citation Formatter:

  • http://crosscite.org/citeproc/

    enter the portion after “doi:” or “dx.doi.org/” e.g. 10.5065/D60Z7187
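A formatted citation can also be requested programmatically. The sketch below does not use the web form above; it uses DOI content negotiation instead, sending an `Accept: text/x-bibliography` header to the DOI resolver, which Crossref and DataCite DOIs generally support. The function names are hypothetical, and `apa` is only one example of a citation style name; we make no claim about which style names correspond to AMS or AGU requirements.

```python
import urllib.request

# Prefixes that may precede the bare DOI in a pasted citation or link.
PREFIXES = (
    "doi:",
    "http://dx.doi.org/", "https://dx.doi.org/", "dx.doi.org/",
    "http://doi.org/", "https://doi.org/", "doi.org/",
)


def bare_doi(text):
    """Strip a leading 'doi:' or resolver URL, e.g. -> '10.5065/D60Z7187'."""
    text = text.strip()
    lowered = text.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix):
            return text[len(prefix):].strip()
    return text


def citation_request(doi, style="apa"):
    """Build (but do not send) a content-negotiation request for a citation."""
    return urllib.request.Request(
        "https://doi.org/" + bare_doi(doi),
        headers={"Accept": f"text/x-bibliography; style={style}"},
    )


def fetch_citation(doi, style="apa"):
    """Send the request and return the formatted citation (needs network access)."""
    with urllib.request.urlopen(citation_request(doi, style), timeout=30) as resp:
        return resp.read().decode("utf-8").strip()
```

Calling `fetch_citation("doi:10.5065/D60Z7187")` would then return a single formatted reference as text, provided the resolver is reachable.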

Physical Object Examples (such as LAOFs)

As of February 2015, the only DOIs with the resourceType metadata value “PhysicalObject” registered with DataCite/EZID belonged to EOL. However, CDL, LTER, NEON, and CA Natural Reserves, to name a few, are also thinking about how to cite physical objects and places (see this post about field stations). They are looking for feedback, looking to partner, and hope to implement in February 2015. This is an area of active community development.
