Elsevier

Computers & Geosciences

Volume 28, Issue 7, August 2002, Pages 789-798

Management of (pale-)oceanographic data sets using the PANGAEA information system: the SINOPS example

https://doi.org/10.1016/S0098-3004(01)00112-1

Abstract

During the SINOPS project, a state-of-the-art simulation of the marine silicon cycle is attempted, employing a biogeochemical ocean general circulation model (BOGCM) for three particular time steps relevant to global (paleo-)climate. To tune the model, simulation results are compared with a comprehensive data set of ‘real’ observations. SINOPS’ scientific data management ensures that the data structure is homogeneous throughout the project. The practical work routine progresses systematically from data acquisition, through preparation, processing, quality check and archiving, up to the presentation of data to the scientific community. Meta-information and analytical data are mapped by an n-dimensional catalogue in order to itemize the analytical value and to serve as an unambiguous identifier. In practice, data management is carried out by means of the online-accessible information system PANGAEA, which offers a tool set comprising a data warehouse, Geographic Information System (GIS), 2-D plot, cross-section plot, etc., and whose multidimensional data model promotes scientific data mining. Beyond the scientific and technical aspects, this alliance between the scientific project team and the data management crew serves to integrate the participants and allows them to gain mutual respect and appreciation.

Introduction

Marine data management evolved from descriptive cataloguing to a relational digital record of climate expertise spanning a period of about one century. After the famous British H.M.S. Challenger deep-sea expedition (1872–1876) returned, an international team of investigators analyzed the staggering body of observations and converted them into records of qualitative and quantitative data. By 1895 the study was completed and a 50-volume report published (e.g., Murray and Renard, 1891). Since the middle of the last century, large and heterogeneous numeric data loads were generated during marine large-scale projects (e.g., DSDP, WOCE, JGOFS). At this time, a data management strategy termed ‘the box of floppies’ approach was developed. The data sets were supplied to the data center as discrete entities (usually on floppy disks) where they were checked, catalogued and stored. On demand, clients were supplied with the data sets necessary to satisfy their requirements. Data management philosophy in this scenario was firmly focussed on data archiving (Lowry, in press). Today, the challenge of data management is to provide standardized import and export routines to support the scientific community with convenient and uniform retrieval functions and efficient tools for the graphical visualization of their analytical data and meta-data (e.g., Dittert et al., 2001b).

Parallel to the evolution of data acquisition and data management, progress has been made in modeling the oceanic silicon cycle. Until recently, model approaches to reproduce the distribution of biogenic silica in marine sediments were limited to one-dimensional (1-D) studies or simple reservoir models (e.g., Boudreau, 1990; Wong and Grosch, 1978). However, the available 1-D biogenic silica and multi-component sediment models cannot satisfactorily reproduce the regionally changing distribution of biogenic silica in sediments with simple parameterizations of early diagenetic processes (e.g., Archer et al., 1993). Lately, a coupled 3-D ocean-sediment model was developed that includes biogenic silica on the basis of a biogeochemical ocean general circulation model (BOGCM) velocity field, which allows good regional resolution of large-scale oceanic tracer fields (Heinze et al., 1999). In order to forecast realistic future scenarios, however, such a model must be sensitive enough to reproduce plausible past scenarios. In turn, the validity of this reproduction depends on the quality of the observational data sets. During SINOPS (“silicon cycling in the world ocean: the controls for opal preservation in the sediment as derived from observations and modeling”), scientific data management links the demands of sensitive climate models to high-quality observational data.

This publication demonstrates how (pale-)oceanographic observational data were managed to become a useful, innovative tool for modelers, presented as the SINOPS example in Section 2. There, we outline SINOPS’ scientific approach and describe the requirements for appropriate data sets. Section 3 illustrates the prerequisites for compiling data sets by means of the information system PANGAEA (“Network for Geological and Environmental Data”). Section 4 describes in detail how data management is accomplished during SINOPS, demonstrating the work routines from data acquisition, through preparation, processing, quality check and archiving, up to the presentation of data to the scientific community. In the conclusion, problems and prospects of projects like SINOPS are discussed.

SINOPS’ scientific approach

Next to water vapor, CO2 is the second most important greenhouse gas. Records from the South Pole and Mauna Loa, Hawaii, show a 50 ppm CO2 increase since 1958, which is attributed to fossil fuel emissions and other human activities (e.g., Keeling et al., 1996). Besides these anthropogenic short-term variations, there are long-term atmospheric CO2 variations as recorded by readings of gas bubbles trapped in polar ice

Prerequisites to compile SINOPS suitable data sets

Given these requirements for appropriate data sets, a minimum data management profile can be defined: we seek an information system that (1) represents our n-dimensional parameter-catalogue and the accompanying meta-information catalogue by a suitable data model; (2) archives the inhomogeneous data collection in such a way that any ‘datum’ is (I) thoroughly described at every stage and (II) can be traced back to its origin in order to protect copyright. Simultaneously, (3) all interfaces must be

Applied data management during SINOPS

In order to illustrate how data management can be accomplished using PANGAEA, practical work routines are explained with respect to the following questions: What data were available prior to SINOPS? What kinds of data were acquired during SINOPS, and how? How much effort has to be invested in order to prepare, process and archive a single data series using PANGAEA? Can data quality be checked thoroughly? And finally, what are the different ways to present SINOPS data?
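
As a rough illustration of the routine outlined above (check, archive, trace back to origin), the following Python sketch keys each analytical value by a tuple of meta-information, in the spirit of the n-dimensional catalogue. All class names, fields, sample values and the range check are assumptions made for illustration only; they do not represent PANGAEA's actual data model or interface.

```python
from dataclasses import dataclass


# Hypothetical names throughout: this is NOT PANGAEA's schema or API.
@dataclass(frozen=True)
class DatumKey:
    """One coordinate per catalogue dimension; together they identify a value."""
    parameter: str   # measured quantity, e.g. "biogenic silica"
    unit: str        # unit of the value, e.g. "wt%"
    method: str      # analytical method
    site: str        # sampling site label
    depth_m: float   # depth in the sediment or water column
    source: str      # originating publication or investigator


class Catalogue:
    """Maps each key to exactly one analytical value."""

    def __init__(self):
        self._values = {}

    def archive(self, key, value):
        # Quality check: refuse ambiguous identifiers and implausible values.
        if key in self._values:
            raise ValueError(f"duplicate key: {key}")
        if key.unit == "wt%" and not 0.0 <= value <= 100.0:
            raise ValueError(f"value out of range for wt%: {value}")
        self._values[key] = value

    def lookup(self, key):
        return self._values[key]

    def by_source(self, source):
        # Trace archived values back to their origin.
        return {k: v for k, v in self._values.items() if k.source == source}


# Usage with made-up placeholder values (not real SINOPS data).
catalogue = Catalogue()
key = DatumKey("biogenic silica", "wt%", "placeholder method",
               "SITE-1", 0.05, "placeholder source A")
catalogue.archive(key, 12.5)
print(catalogue.lookup(key))
print(len(catalogue.by_source("placeholder source A")))
```

Because the key is an immutable tuple of all dimensions, two values can never share an identifier, and filtering on the `source` field is enough to trace any value back to its origin.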

Discussion and conclusion

Increasingly, proper data management has gained importance in all domains of scientific research. During the SINOPS project, a biogeochemical ocean general circulation model and an extensive, high-quality data collection allow for comprehensive studies of the silicon cycle in the world ocean. Climate modeling is carried out by HAMOCC2s. Data are mapped adopting the n-dimensional parameter-catalogue approach, which is converted to a multidimensional data model. In practice, data management is

Acknowledgments

The authors wish to express their sincere appreciation for successful cooperation to all principal investigators during SINOPS, the PANGAEA data management group at AWI, Bremerhaven, and the WDC-MARE at MARUM, Bremen. We acknowledge comments from M.-A. Gutscher, C.-D. Hillenbrand and an anonymous reviewer. This research was supported by the European Commission (MAS3-CT97–0141 “SINOPS”).

References (26)

  • D. Archer et al. What controls opal preservation in tropical deep-sea sediments? Paleoceanography (1993)
  • D. Archer et al. What caused the glacial/interglacial atmospheric PCO2 cycles? Reviews of Geophysics (2000)
  • E. Bard. Ice age temperatures and geochemistry. Science (1999)
  • J.M. Barnola et al. CO2 evolution during the last millennium as recorded in Antarctic and Greenland ice. Tellus (1995)
  • W.A. Berggren et al. A revised Cenozoic geochronology and chronostratigraphy
  • B.P. Boudreau. Asymptotic forms and solutions of the model for silica-opal diagenesis in bioturbated sediments. Journal of Geophysical Research (1990)
  • Conkright, M.E., Levitus, S., Boyer, T.P., 1994. World Ocean Atlas 1994, Vol. 1: Nutrients. NOAA Atlas NESDIS, 1, US...
  • R. Dalton. Young, worldly and unhelpful all miss out on data sharing. Nature (2000)
  • D.J. DeMaster et al. Preservation efficiencies and accumulation rates for biogenic silica and organic C, N, and P in high-latitude sediments: the Ross Sea. Journal of Geophysical Research (1996)
  • M. Diepenbroek et al. Data management of proxy parameters with PANGAEA
  • Dittert, N., Diepenbroek, M., Grobe, H., 2001a. Scientific data must be made available to all. Nature, 414,...
  • N. Dittert et al. Hunting and gathering silicon data to tackle climate forecasting. EOS Transactions, American Geophysical Union (2001)
  • Fayyad, U.M., Piatetsky-Shapiro, G., Smyth, P., Uthurusamy, R., 1996. Advances in knowledge discovery and data mining....
System available at http://www.pangaea.de