G1.08

G1.08 Evaluation of the effect of missing data or missing temporal coverage of fully traceable data provided by ground-based networks

Gap detailed description

Missing data are a common problem in geophysical data sets. For instrumental data sets currently being collected, uneven spatio-temporal coverage arises for many reasons, depending on the type of instrumentation. For example, remote sensing is influenced by atmospheric conditions and can be hampered by clouds, aerosols, heavy precipitation, or extreme weather. Alternatively, measurements may be limited to night-time, to periods when the relevant staff are on-site, or by similar factors.

Missing data are a particular source of problems in climate research, e.g., in the analysis and modelling of spatio-temporal variability. This is especially so when the missing data are not entirely random, so that a geophysical difference may arise between the measured period and the fully sampled period that could have been observed. Analyzing the full extent of a climate time series, with the missing points filled in, allows for greater accuracy and better significance testing in spectral analysis. The full record can also improve our knowledge of the evolution of the oscillatory modes within the gaps and provide new information on changes in climate. Spatio-temporal filling techniques have been developed (Kondrashov et al., 2006), but there have been only a few efforts to quantify the effect of temporal sampling on the determination of atmospheric variability. This prevents full traceability of both the model/assimilation quantity and the observational dataset.
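
The difference between random and non-random missingness can be illustrated with a minimal sketch. It assumes a purely synthetic hourly record made of a constant mean plus a diurnal cycle; the function names, the amplitude, and the night-time window (18:00 to 06:00) are illustrative assumptions and do not come from any GAIA-CLIM dataset or tool.

```python
import math
import random

def synthetic_series(n_days=30, mean=10.0, amplitude=5.0):
    """Hypothetical hourly record: constant mean plus a diurnal cycle peaking at 12:00."""
    return [
        (h, mean - amplitude * math.cos(2.0 * math.pi * (h % 24) / 24.0))
        for h in range(n_days * 24)
    ]

def sample_mean(series, keep):
    """Mean of the values whose hour index passes the 'keep' predicate."""
    kept = [v for h, v in series if keep(h)]
    return sum(kept) / len(kept)

series = synthetic_series()
rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Fully sampled record.
full_mean = sample_mean(series, lambda h: True)
# Random missingness: each hour is dropped with probability 0.5.
random_mean = sample_mean(series, lambda h: rng.random() < 0.5)
# Non-random missingness: only night-time hours (18:00 to 06:00) are observed.
night_mean = sample_mean(series, lambda h: (h % 24) < 6 or (h % 24) >= 18)

print(f"full: {full_mean:.2f}  random gaps: {random_mean:.2f}  night only: {night_mean:.2f}")
```

In this toy setting, random gaps mainly add noise to the estimated mean, whereas night-only sampling shifts it systematically, which is the kind of geophysical difference between the measured and the fully sampled period described above.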

Activities within GAIA-CLIM related to this gap

GAIA-CLIM will initiate limited work relevant to addressing this gap within Task 1.4, but more research is needed to address the topic fully.

Gap remedy(s)

Remedy

Specific remedy proposed

Geo-statistical approaches for time series with missing data allow the effect of missing data or missing temporal coverage to be evaluated. Research should characterize model-observation differences, with a focus on enhancing the representation of “observation operators”.

Measurable outcome of success

Tests using complete records and simulated incomplete records derived from them can show the validity of the approach. This should also build confidence in the applicability of the approach for climate studies.
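
Such a test could be sketched as follows. This is only an illustration under stated assumptions: the complete record is synthetic, gaps are withheld at random, and simple linear interpolation stands in for whatever geo-statistical filling method is actually being evaluated. All function names and parameters are hypothetical.

```python
import math
import random

def make_complete_record(n=240):
    """Synthetic 'complete' record: slow oscillation plus a weak linear trend."""
    return [math.sin(2.0 * math.pi * t / 48.0) + 0.002 * t for t in range(n)]

def simulate_gaps(record, fraction, rng):
    """Copy the record and set a random fraction of interior points to None."""
    gappy = list(record)
    interior = list(range(1, len(record) - 1))  # endpoints are always kept
    for t in rng.sample(interior, int(fraction * len(interior))):
        gappy[t] = None
    return gappy

def fill_linear(gappy):
    """Stand-in filling method: linear interpolation between valid neighbours."""
    filled = list(gappy)
    valid = [t for t, v in enumerate(gappy) if v is not None]
    for left, right in zip(valid, valid[1:]):
        for t in range(left + 1, right):
            w = (t - left) / (right - left)
            filled[t] = (1.0 - w) * gappy[left] + w * gappy[right]
    return filled

def rmse_at_gaps(truth, gappy, filled):
    """Root-mean-square error of the filled values, evaluated only at withheld points."""
    errors = [(filled[t] - truth[t]) ** 2 for t, v in enumerate(gappy) if v is None]
    return math.sqrt(sum(errors) / len(errors))

rng = random.Random(42)  # fixed seed so the experiment is repeatable
truth = make_complete_record()
gappy = simulate_gaps(truth, fraction=0.3, rng=rng)
filled = fill_linear(gappy)
rmse = rmse_at_gaps(truth, gappy, filled)
print(f"withheld points: {sum(v is None for v in gappy)}, fill RMSE: {rmse:.4f}")
```

Because the withheld values are known, the error of any filling or sampling-correction method can be measured directly, and the experiment can be repeated for different gap fractions or for non-random gap patterns.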

Achievable outcomes

Technological / organizational viability: medium; the challenges relate to the scarce literature available on this topic.

Indicative cost estimate: low (<1 million)

Relevance

Robust geo-statistical approaches are needed to analyze time series affected by missing data; their misuse or poor exploitation will lead to sub-optimal usage and incorrect decision making.

Timebound

GAIA-CLIM will provide work relevant to addressing this gap by the end of the project (Feb. 2018). A complete remedy depends on future studies and projects working on it.

 

Gap risks to non-resolution

 

Identified future risk / impact: Less efficient exploitation of the available data, with impact on the design of the global observing system and climate monitoring.

Probability of occurrence if gap not remedied: High

Downstream impacts on ability to deliver high quality services to science / industry / society: Under-exploitation of datasets in the presence of missing data or temporal gaps.

Identified future risk / impact: Reduced potential of using observational data for addressing current and future science questions.

Probability of occurrence if gap not remedied: Medium

Downstream impacts on ability to deliver high quality services to science / industry / society: Under-exploitation of datasets collected using technologies considered obsolete or superseded.


Work package: WP6