G1.06 Lack of a common effort in metadata harmonization
Gap detailed description
Metadata is an increasingly central tool in the current web environment, enabling large-scale, distributed management of resources. Recent years have seen growing interaction between previously isolated communities, driven by a need for cross-domain collaboration and exchange. However, metadata standards have not been able to meet the needs of interoperability between independent standardization communities. Observations without metadata are of very limited use: only when accompanied by adequate metadata (data describing the data) can the full potential of the observations be realized. Several efforts have been undertaken to improve the harmonization of metadata across networks and international programs, but these remain insufficient. Harmonization efforts in the atmospheric science community are linked to the WIGOS (https://www.wmo.int/wigos) standard, currently under development and awaiting subsequent implementation at the WMO, and to the ESA Climate Change Initiative (CCI [1]).
Activities within GAIA-CLIM related to this gap
An attempt to address this gap will be undertaken within GAIA-CLIM as part of Task 1.2, whereby metadata will be collected and then delivered in a usable set of formats for a selected set of networks that may plausibly contribute to the Virtual Observatory activity.
GAIA-CLIM Task 1.2, deliverable D1.7 – Report on the collection of metadata from existing networks and on the proposed protocol for a common metadata format (CNR; M18).
Gap remedy(s)
Remedy
Specific remedy proposed
Within GAIA-CLIM Task 1.2 / WP5, and in synergy with WIGOS and ESA-CCI representatives, we will provide a unified metadata format (UMDF) that aims to extend the WIGOS format by integrating additional important elements (while keeping the various formats interoperable). The UMDF will be developed once the relevant initial candidate metadata formats (ISO, WIGOS, ESA-CCI) have been selected. The UMDF shall hold the relevant parameters common to all original data formats. In addition, it will preserve the original metadata («original MD») in an additional field «original xml» at the end of each database document in a metadata (MD) collection, so that records can still be exported in their original data formats on request.
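The document layout described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the GAIA-CLIM implementation: the function names `build_umdf_document` and `export_original`, the common field names (`station_name`, `latitude`, `longitude`), and the sample XML record are all hypothetical. Only the idea of a trailing «original xml» field that keeps the source record verbatim comes from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical parameters assumed to be common to all source formats.
COMMON_FIELDS = ("station_name", "latitude", "longitude")


def build_umdf_document(original_xml: str, source_format: str) -> dict:
    """Map one source metadata record to a UMDF-like document (illustrative)."""
    root = ET.fromstring(original_xml)
    doc = {"source_format": source_format}
    for name in COMMON_FIELDS:
        element = root.find(name)
        # A parameter missing from the source is stored as None, not guessed.
        doc[name] = element.text if element is not None else None
    # Inserted last so the verbatim source record sits at the end of the document.
    doc["original xml"] = original_xml
    return doc


def export_original(doc: dict) -> str:
    """Return the untouched source record for export in its original format."""
    return doc["original xml"]


# Made-up record; real WIGOS, ISO or ESA-CCI metadata is structured differently.
sample = (
    "<record><station_name>Example Site</station_name>"
    "<latitude>10.0</latitude><longitude>20.0</longitude></record>"
)
umdf_doc = build_umdf_document(sample, "WIGOS")
```

Because the source record is stored unchanged, exporting in the original format needs no reverse mapping: `export_original(umdf_doc)` simply returns the stored string.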
Measurable outcome of success
Use of the proposed UMDF within GAIA-CLIM and by downstream users of the Virtual Observatory.
Achievable outcomes
Technological / organizational viability: medium. This is a demonstration activity, but it is not especially challenging to implement efficiently.
Indicative cost estimate: low (<1 million)
Relevance
The proposed UMDF is a significant attempt to improve metadata harmonization at the international level. Its benefit may be expected to be large and to affect many (primarily expert) data users.
Timebound
This UMDF will be defined by September 2016 (deliverable D1.7) and finalized by the end of the project within WP5. It will build on the dialogue established with WIGOS and ESA-CCI (making use of the CF convention). The GAIA-CLIM metadata will be used for all the reviewed networks, and also for all the data records that will be available on the GAIA-CLIM virtual observatory (WP5).
A final version of the UMDF will be released and implemented by the end of the GAIA-CLIM project (Feb. 2018).
Gap risks of non-resolution
Identified future risk / impact | Probability of occurrence if gap not remedied | Downstream impacts on ability to deliver high quality services to science / industry / society
Missing interoperability between independent metadata standardization communities | Medium | Limited cross-domain collaboration and exchange for the users. Limits the ability to appropriately use and derive value from the data