Data Ingestion Portal

How can you join the NRT data exchange?

Organisations that operate monitoring stations and are not yet connected are invited to make their near-real-time (NRT) data streams part of the NRT data exchange and available through the EMODnet Physics portal.

The following cases can exist:

  • If the data provider can set up the data flow according to the defined standards, the regional coordinator only has to link and include the new catalogue and data stream.
  • If the data provider cannot set up the data flow (because of lack of experience, technical capacity, etc.), the regional coordinator has to harvest the data from the provider, harmonize and format them, and make them available from the regional catalogue.

At regional level, the following principles apply, depending on the platform type and parameters:

Data acquisition:
Data are collected through direct links with the institutions

  • A direct connection is established (usually via FTP) between the RDAC and the data provider.
  • Information is provided about the metadata that must be supplied together with the data (e.g. station position, date, frequency of measurement, platform name, depth of each sensor, contact person, PI, etc.).
  • Guidance is also provided on how the required daily and monthly files should be created.
  • Information is exchanged about the QC procedures.
  • Data are provided in the originator’s native format; the provider does not need to convert them to NetCDF. The conversion is performed by the RDAC staff.
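
The metadata requirement above can be illustrated with a minimal Python sketch of a completeness check. The field names and the function `missing_metadata` are hypothetical, chosen only to mirror the items listed above; they are not an EMODnet Physics or RDAC schema.

```python
# Hypothetical field names mirroring the metadata items listed above;
# this is not an official EMODnet Physics or RDAC schema.
REQUIRED_FIELDS = (
    "platform_name",
    "station_position",
    "date",
    "measurement_frequency",
    "sensor_depths",
    "contact_person",
    "pi",
)


def missing_metadata(record):
    """Return the required fields that are absent or empty in `record`."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        # Treat None, empty strings and empty lists as "not supplied".
        if value is None or value == "" or value == []:
            missing.append(field)
    return missing


example = {
    "platform_name": "EXAMPLE-BUOY-01",  # made-up platform
    "station_position": (43.5, 7.9),     # (latitude, longitude), made up
    "date": "2020-01-01T00:00:00Z",
    "measurement_frequency": "PT1H",
    "sensor_depths": [],                 # not yet supplied
    "contact_person": "",                # not yet supplied
}
print(missing_metadata(example))
```

A check of this kind could be run before a new data stream is linked, so that gaps are reported back to the provider early.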

Data are converted into a single common format (NetCDF).
Extended guidance is given to new partners so that they can provide all the necessary information.

Quality Control:

  • Automatic quality control procedures are applied to each parameter. They are developed in line with international agreements (in particular SeaDataNet).
  • Procedures are applied after agreement with the data originators in order to avoid conflicts and duplication of effort.
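
One common kind of automatic per-parameter check is a global range test. The sketch below is illustrative only: the parameter names, the limits in `LIMITS` and the function `range_test` are assumptions, and the flag values (1 = good, 4 = bad, 9 = missing) follow the SeaDataNet quality flag scale as commonly documented and should be verified against the official vocabulary. It does not reproduce the actual procedures agreed with the data originators.

```python
# Flag values assumed from the SeaDataNet quality flag scale
# (1 = good, 4 = bad, 9 = missing); verify against the official vocabulary.
GOOD, BAD, MISSING = 1, 4, 9

# Hypothetical plausibility limits per parameter (illustrative only).
LIMITS = {
    "sea_temperature_degC": (-2.5, 40.0),
    "salinity_psu": (0.0, 42.0),
}


def range_test(parameter, values):
    """Assign one quality flag per value using a simple global range check."""
    low, high = LIMITS[parameter]
    flags = []
    for value in values:
        if value is None:
            flags.append(MISSING)
        elif low <= value <= high:
            flags.append(GOOD)
        else:
            flags.append(BAD)
    return flags


print(range_test("sea_temperature_degC", [12.3, None, 99.9, -1.0]))
# prints [1, 9, 4, 1]
```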

Validation/Assessment:

  • Assess the consistency of the data over a period of time in an area. The aim is to detect possible inconsistencies with nearby data that could not be detected by automatic QC.
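
As a rough illustration of a consistency check against nearby data, the sketch below compares each station with the median of the other stations in the same area and time window. The function `inconsistent_stations`, the station names, the values and the tolerance are all hypothetical; an actual assessment would rely on expert judgement and region-specific criteria.

```python
from statistics import median


def inconsistent_stations(readings, tolerance):
    """Return stations whose value differs from the median of the
    other stations by more than `tolerance`."""
    suspects = []
    for station, value in readings.items():
        others = [v for name, v in readings.items() if name != station]
        if not others:
            continue  # nothing to compare against
        if abs(value - median(others)) > tolerance:
            suspects.append(station)
    return sorted(suspects)


# Made-up sea temperature values (degrees Celsius) for four nearby stations.
readings = {"A": 14.1, "B": 14.4, "C": 13.9, "D": 19.8}
print(inconsistent_stations(readings, tolerance=3.0))
# prints ['D']
```

Each of these values would pass a plain range test, which is why a comparison with neighbouring stations can reveal problems that automatic QC misses.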

In practice, connecting a new data provider involves the following steps:

  1. Meeting with data producer (PU) to present data flow, infrastructures, common standards, vocabularies (usually the meeting is joined by an EMODnet Physics, a EuroGOOS and an RDAC representative)
  2. Identification of PU technical capabilities and needs
  3. Identification and collection of the metadata that must be supplied together with the data (e.g. station position, date, frequency of measurement, platform name, depth of each sensor, contact person, PI, etc.) according to harmonized vocabularies (SeaDataNet, EDMO for institutions, a unique platform name such as a WMO number or ICES platform Id, etc.)
  4. Setting up of a permanent data collection channel between the PU and the RDAC (most commonly FTP for fixed stations, drifting buoys and Argo, and THREDDS for HF radar)
  5. If needed, guidance on how the required daily and monthly files should be created. In any case, data are provided in the originator’s native format and the provider does not need to convert them to NetCDF; the conversion is performed by EMODnet Physics and RDAC staff.
  6. Information exchange about the QC procedures.
  7. Quality control. QC procedures are applied after agreement with the data originators in order to avoid conflicts and duplication of effort. Quality control procedures are automatic and are applied to each parameter. These procedures have been defined by the EU MyOcean project, adopted by EuroGOOS and documented in a EuroGOOS DATAMEQ report.
  8. RDAC procedures for data indexing are updated and data are stored in an FTP repository (folders are used to separate the latest data from older data and to split operational data from research opportunity data)
  9. Routinely (three times a day), EMODnet Physics collects new data files from all RDACs and makes them available for discovery, previewing, download (NetCDF and ASCII CSV), and machine-to-machine interoperability (WMS, WFS and web services).
  10. Data flow monitoring. EMODnet Physics applies both automatic data flow monitoring (controls if data is available, if any connection/data flow failure occurred, etc.) and periodic manual controls.
  11. Periodically, RDAC, INSTAC, and EMODnet Physics assess the consistency of the data in order to identify possible inconsistencies in the data, the metadata and the data flow. If any actor in the pipeline identifies an error, the notification is sent both downstream and upstream so that the error can be tracked and the right actor can correct it.
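
The automatic data flow monitoring in step 10 can be sketched as a staleness check on the time each feed last delivered a file. This is a minimal illustration, not the monitoring actually run by EMODnet Physics: the function `stale_feeds`, the feed names, the timestamps and the 24-hour threshold are all assumptions.

```python
from datetime import datetime, timedelta, timezone


def stale_feeds(last_received, now, max_age):
    """Return the feeds whose most recent file is older than `max_age`."""
    return sorted(
        feed for feed, received in last_received.items()
        if now - received > max_age
    )


# Made-up feeds and timestamps, all in UTC.
now = datetime(2020, 1, 2, 12, 0, tzinfo=timezone.utc)
last_received = {
    "rdac-a/buoy-01": datetime(2020, 1, 2, 9, 0, tzinfo=timezone.utc),
    "rdac-b/mooring-07": datetime(2019, 12, 31, 6, 0, tzinfo=timezone.utc),
}
print(stale_feeds(last_received, now, max_age=timedelta(hours=24)))
# prints ['rdac-b/mooring-07']
```

Feeds reported as stale would then be candidates for the periodic manual controls and for notification upstream to the RDAC or the data provider.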

Note: In addition, arrangements can be made with a SeaDataNet data centre for further validation of the collected datasets and their inclusion in the data management infrastructure for long-term stewardship. Alternatively, the data provider can decide to ingest the datasets by means of the Data Submission service at this portal, whereby they will be received by a SeaDataNet data centre for further processing.