Assistant System (Objectives)#

In the course of the project we are developing a web-based software system that supports the publication process for environmental data. Essentially, the system will provide the following functionality:
  • Select an experiment to publish
  • Review, change and add metadata
  • Release metadata and primary data for publication
  • Double-check the metadata (by the publication agent) and perform technical quality assurance
  • Register DOI and URN by calling DataCite web services

The assistant system has been named Atarrabi (Basque: good weather spirit).

To present the objectives of our project in this field, this page is divided into three subsections: Overview, Switch from TIB to DataCite, and System context diagram.

Overview#

The Atarrabi system is intended to provide a key part of the STD-DOI (Publication and Citation of Scientific Primary Data) publication process at the World Data Center for Climate (WDCC). The procedure for publishing data is divided into four basic steps:
  1. Granting a scientist special permission to publish an entity
  2. Performing a Scientific Quality Assurance (SQA) on data and metadata, supervised by the publication agent
  3. Performing and supervising a Technical Quality Assurance (TQA) of the data, with the publication agent double-checking the metadata at the WDCC
  4. Preparing and registering the data and metadata for publication at TIB, with DOI and URN assignment
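
The four steps above can be pictured as a strictly ordered workflow. The following is a minimal sketch, not the Atarrabi implementation: the names `Step` and `Publication` and the identifier `"example-entity"` are assumptions made for illustration only.

```python
from enum import Enum
from typing import List, Optional


class Step(Enum):
    """The four publication steps, in the order listed above."""
    PERMISSION = 1    # scientist is permitted to publish the entity
    SQA = 2           # scientific quality assurance
    TQA = 3           # technical quality assurance and metadata double-check
    REGISTRATION = 4  # DOI and URN registration


class Publication:
    """Tracks how far one entity has progressed (hypothetical class)."""

    def __init__(self, entity_id: str) -> None:
        self.entity_id = entity_id
        self.completed = []  # type: List[Step]

    def next_step(self) -> Optional[Step]:
        """Return the next open step, or None once all four are done."""
        remaining = [s for s in Step if s not in self.completed]
        return remaining[0] if remaining else None

    def complete(self, step: Step) -> None:
        """Mark a step as done; steps may only be completed in order."""
        if step is not self.next_step():
            raise ValueError(f"{step.name} cannot be completed yet")
        self.completed.append(step)

    @property
    def is_published(self) -> bool:
        """True once every step has been completed."""
        return self.next_step() is None


pub = Publication("example-entity")  # hypothetical entity identifier
pub.complete(Step.PERMISSION)
pub.complete(Step.SQA)
```

Modelling the order explicitly means that, in this sketch, registration cannot be triggered before both quality assurance steps have been completed.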


After publication, the data are protected against changes. Two persistent identifiers, a DOI (Digital Object Identifier) and a URN (Uniform Resource Name), are assigned. They can be used to cite the data, giving credit to the data originator, and to provide provenance information for derived data products.

The preconditions for starting the WDCC STD-DOI publication process are:

  • a personal CERA account
  • long term availability of primary data at WDCC
  • long term availability of metadata at WDCC
  • open access to primary data and metadata
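
As an illustration, the four preconditions could be checked before the process starts. This is only a sketch under our own assumptions: the class `Preconditions` and its field names are hypothetical and do not reflect how CERA or Atarrabi store this information.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class Preconditions:
    """One flag per precondition listed above (hypothetical field names)."""
    has_cera_account: bool
    primary_data_long_term: bool
    metadata_long_term: bool
    open_access: bool

    def missing(self) -> List[str]:
        """Return the names of all preconditions that are not yet met."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def may_start(self) -> bool:
        """The publication process may start only if nothing is missing."""
        return not self.missing()


check = Preconditions(
    has_cera_account=True,
    primary_data_long_term=True,
    metadata_long_term=False,
    open_access=True,
)
print(check.missing())
```

Reporting the unmet preconditions by name, rather than a single yes or no, tells the researcher what still has to be arranged.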

Switch from TIB to DataCite#

During the project, data registration moved from STD-DOI (TIB, German National Library of Science and Technology) to DataCite. Unlike TIB or the California Digital Library, the DataCite consortium is itself a registration agency of the IDF (International DOI Foundation). Because DataCite is not a funded institution, its members are the national institutions that act as registration agencies. WDCC in Germany therefore has a contract with TIB, which is our registration agency. For a US university or research institute, the contract partner would be a US registration agency that is a member of DataCite, such as the California Digital Library.

System context diagram#

The following context diagram shows the involved parties and the basic procedures of the publication process, which should be governed by the Atarrabi system:
  1. The system reads all necessary metadata of the meteorological entity to be published from the CERA2 database at WDCC Hamburg. These data and some other lists (such as persons, institutes and locations) are imported into the Atarrabi system and are used to pre-fill the fields.
  2. The researcher thoroughly checks, changes and augments the metadata, and performs and documents the scientific quality assurance (SQA). Correct and complete metadata are important for correct registration in catalogues and search engines. In contrast to the traditional approach of presenting all metadata fields on one single page, we decided to split the numerous metadata fields into several logical units and present them page by page, an approach generally known from install wizards. Where applicable, we introduce lists of values and controlled vocabularies to ease and unify inputs. We use a map view to visualize geospatial coordinates. The researcher can leave the assistant at any time, come back later and continue the process.
  3. A publication agent at WDCC Hamburg double-checks the metadata and performs the technical quality assurance (TQA).
  4. If all checks are passed successfully, the metadata are sent to the DataCite registration service, which registers the Digital Object Identifier (DOI) and Uniform Resource Name (URN). The changed metadata are also saved in the CERA2 database.
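
The page-by-page entry described in step 2, including leaving the assistant and resuming later, can be sketched as follows. This is a simplified illustration, not the Atarrabi code: the page names, the field names and the use of JSON for the saved session are all assumptions.

```python
import json
from typing import Dict, List, Optional, Tuple

# Hypothetical grouping of metadata fields into wizard pages.
PAGES: List[Tuple[str, List[str]]] = [
    ("general", ["title", "summary"]),
    ("contacts", ["author", "institute"]),
    ("coverage", ["latitude", "longitude"]),
]


def is_filled(value: object) -> bool:
    """A field counts as filled unless it is missing or an empty string."""
    return value is not None and value != ""


def first_incomplete_page(values: Dict[str, object]) -> Optional[str]:
    """Return the first page with an unfilled field, or None if all are done."""
    for page, page_fields in PAGES:
        if not all(is_filled(values.get(name)) for name in page_fields):
            return page
    return None


def save_session(values: Dict[str, object]) -> str:
    """Serialize the entered values so the researcher can leave the wizard."""
    return json.dumps(values)


def resume_session(saved: str) -> Optional[str]:
    """Restore saved values and return the page at which to continue."""
    return first_incomplete_page(json.loads(saved))


saved = save_session({"title": "Example title", "summary": "Example", "author": "A. Author"})
print(resume_session(saved))
```

In this sketch the resume point is derived from the stored values rather than stored separately, so it always matches what has actually been entered.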
Cite this version:  10490/inf.ah.wikidora-345520521