G1.08 Evaluation of the effect of missing data or missing temporal coverage of fully traceable data provided by ground-based networks
Gap detailed description
Missing data are a common problem for geophysical data sets. In current instrumental data sets, uneven spatio-temporal coverage arises for many reasons that depend on the type of instrumentation. For example, remote sensing is influenced by atmospheric conditions and can be hampered by clouds, aerosols, heavy precipitation, or extreme weather. Operation may also be limited to night-time, to periods when the relevant staff are on-site, or by similar factors.
Missing data are a particular source of problems in climate research, e.g., in the analysis and modelling of spatio-temporal variability. This is especially so when the data are not missing entirely at random, so that a geophysical difference may arise between the measured period and the potential fully sampled period. Analyzing the full extent of a climate time series, with the missing points filled in, allows for greater accuracy and better significance testing in spectral analysis. The full record can also improve our knowledge of the evolution of the oscillatory modes within the gaps and provide new information on changes in climate. Spatio-temporal filling techniques have been developed (Kondrashov et al., 2006), but there have been only a few efforts to quantify the effect of temporal sampling on the determination of atmospheric variability. This prevents full traceability of both the model/assimilation quantity and the observational dataset.
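The difference between random and non-random gaps can be illustrated with a minimal sketch. It assumes a purely synthetic hourly series with an idealized diurnal cycle; the mean, amplitude, peak hour, and the 20:00-04:00 "night-only" window are illustrative choices, not values taken from any network discussed here.

```python
import math
import random

def synthetic_series(n_days=30, mean=10.0, amplitude=5.0):
    """Hourly series with a pure diurnal cycle peaking at 14:00 (illustrative values)."""
    series = []
    for hour in range(n_days * 24):
        hour_of_day = hour % 24
        value = mean + amplitude * math.cos(2 * math.pi * (hour_of_day - 14) / 24)
        series.append((hour_of_day, value))
    return series

def mean_of(values):
    return sum(values) / len(values)

series = synthetic_series()
full_mean = mean_of([v for _, v in series])

# Random gaps: drop roughly half of the samples regardless of time of day.
rng = random.Random(0)
random_subset = [v for _, v in series if rng.random() < 0.5]

# Systematic gaps: keep night-time samples only (hypothetical 20:00-04:00 window).
night_subset = [v for h, v in series if h >= 20 or h < 4]

random_bias = mean_of(random_subset) - full_mean
night_bias = mean_of(night_subset) - full_mean
print(f"bias of the mean with random gaps:     {random_bias:+.2f}")
print(f"bias of the mean with night-only data: {night_bias:+.2f}")
```

Under these assumptions, random gaps mostly add noise to the estimated mean, whereas the night-only record is systematically offset from the fully sampled one, because the sampling is correlated with the signal itself.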
Activities within GAIA-CLIM related to this gap
GAIA-CLIM will initiate limited work relevant to this gap within Task 1.4, but more research is needed to fully address the topic.
Gap remedy(s)
Remedy
Specific remedy proposed
Geo-statistical approaches for time series with missing data allow the effect of missing data or missing temporal coverage to be evaluated. Research should characterize model-observation differences, with a focus on improving the representation of “observation operators”.
Measurable outcome of success
Tests that compare complete records with simulated incomplete records derived from them can demonstrate the validity of the approach. This should also build confidence in the applicability of the approach to climate studies.
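Such a test can be sketched as follows. This is a minimal example under stated assumptions: the "complete" record is synthetic (an annual cycle plus a weak trend), the outage is a single contiguous block, and simple linear interpolation stands in for whichever geo-statistical filling method is actually under evaluation. All function names are hypothetical.

```python
import math

def complete_record(n=365):
    """Synthetic daily record: annual cycle plus a weak linear trend (illustrative)."""
    return [10.0 + 8.0 * math.sin(2 * math.pi * t / 365.0) + 0.01 * t for t in range(n)]

def remove_block(record, start, length):
    """Simulate an outage by replacing a contiguous block with None."""
    degraded = list(record)
    for t in range(start, start + length):
        degraded[t] = None
    return degraded

def fill_linear(degraded):
    """Fill interior gaps by linear interpolation between the nearest valid neighbours.

    Placeholder for the method being tested, not a geo-statistical technique.
    """
    filled = list(degraded)
    valid = [t for t, v in enumerate(degraded) if v is not None]
    for left, right in zip(valid, valid[1:]):
        for t in range(left + 1, right):
            w = (t - left) / (right - left)
            filled[t] = (1 - w) * degraded[left] + w * degraded[right]
    return filled

def rmse_on_gaps(truth, degraded, filled):
    """RMSE of the filled values, evaluated only where data were withheld."""
    errors = [(filled[t] - truth[t]) ** 2 for t, v in enumerate(degraded) if v is None]
    return math.sqrt(sum(errors) / len(errors))

truth = complete_record()
results = {}
for gap_length in (5, 30, 90):
    degraded = remove_block(truth, start=100, length=gap_length)
    results[gap_length] = rmse_on_gaps(truth, degraded, fill_linear(degraded))
    print(f"gap of {gap_length:3d} days: RMSE of filled values = {results[gap_length]:.3f}")
```

Because the withheld values are known, the error of the filled record can be measured directly as a function of gap length. Replacing `fill_linear` with the candidate method and the synthetic record with a real complete record would give the kind of evidence described above.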
Achievable outcomes
Technological / organizational viability: medium; the challenges relate to the scarce literature available on this topic.
Indicative cost estimate: low (<1 million)
Relevance
Robust geo-statistical approaches are needed to analyze time series affected by missing data; their misuse or poor exploitation will lead to sub-optimal use of the data and incorrect decision making.
Timebound
GAIA-CLIM will provide work relevant to this gap by the end of the project (Feb. 2018). A complete remedy depends on future studies and projects working on it.
Gap risks to non-resolution
Identified future risk / impact | Probability of occurrence if gap not remedied | Downstream impacts on ability to deliver high quality services to science / industry / society |
Less efficient exploitation of the available data with impact on the design of global observing system and climate monitoring. | High | Under-exploitation of datasets in the presence of missing data or temporal gaps. |
Reduced potential of using observational data for addressing current and future science questions | Medium | Under-exploitation of datasets collected using technologies considered obsolete or superseded |