The Scientific Quality Assurance (SQA)#
Main Processes#
The main processes of an SQA are:
- The researcher performs and documents the scientific quality assurance (SQA)
- The researcher checks, changes and augments the metadata (SQA)
Quality assurance of the content of observational data is mainly the responsibility of the data authors. It also has to be documented in the metadata. The quality flag assigned in the database system is 'approved by author'.
Important aspects#
Important aspects of observational data checks are:
- Granularity of SQA data checks and documentation
- The integration of the SQA documentation into WDC-Climate
- Which credit can be assigned to the persons who perform the data checks?
- Additional descriptive metadata for observational data
Granularity of SQA data checks and documentation#
On the one hand, the documentation should be as detailed as possible, so that the complete quality assurance is understandable and repeatable by someone who has only the QA documentation and the data. On the other hand, we acknowledge the problem that an SQA which takes too much effort to perform would not be filled in properly by most users.
An ideal SQA#
The ideal conditions for an SQA are:
Information on the quality of the data should be available for every dataset. The quality checks have to be performed on whole datasets and not on parts of them. The complete shape of a netCDF file has to be checked. Each quality check needs its own complete documentation, including the algorithm, the parameters used, the date on which it was performed, possibly the result and, most importantly, a comment by the user on how they assess the results.
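The documentation items listed for each quality check can be pictured as one record per check. The following is a minimal Python sketch under stated assumptions, not the structure used by WDC-Climate: the class name `QualityCheckRecord`, all field names and the example values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Dict, Optional


@dataclass
class QualityCheckRecord:
    """Documentation of a single quality check (hypothetical field names)."""

    algorithm: str                  # name or description of the check
    parameters: Dict[str, Any]      # parameters the check was run with
    performed_on: date              # date on which the check was performed
    comment: str                    # the user's assessment of the results
    result: Optional[str] = None    # outcome summary, optional

    def is_complete(self) -> bool:
        """A record needs at least an algorithm name and a user comment."""
        return bool(self.algorithm.strip()) and bool(self.comment.strip())


# Made-up example values, for illustration only.
record = QualityCheckRecord(
    algorithm="limit test",
    parameters={"lower": -40.0, "upper": 50.0},
    performed_on=date(2011, 4, 1),
    comment="One value exceeds the upper limit; a sensor fault is suspected.",
    result="1 of 5 values flagged",
)
```

Making the result optional while requiring the comment mirrors the weighting in the text, where the result is listed as "possibly" present and the user's comment as the most important item.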
Too much information: Our solution#
This would lead to a lot of information that has to be gathered from the user. We therefore developed a concept around an R package that assists the user. The package consists of several tests connected in a framework, which allows the user to perform a reasonable SQA without much knowledge of R. R itself has several advantages: it is free, open source and platform independent. It also gives access to the large amount of statistical functionality already available for the language. To transfer information from the user's computer to our web-based software, we agreed that an exchange format is needed, which we subsequently defined in XML.
The current focus of work in this field is the development of the R package, which is usable as a Quality Assurance Toolkit. The first version uses the quality checks described by Meek and Hatfield (1994), which are simple tests on limits and changes of data values. With this version the technical foundation of the R package is being developed and further checking methods are being examined. A first release of the package has been available on CRAN, the central repository of R extensions (http://cran.r-project.org/), since April 2011.
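To give an idea of what simple tests on limits and changes of data values can look like, and how their outcome could be carried in an XML exchange file, here is a minimal Python sketch. It is not the R package and not its exchange format. The function names, the thresholds, the sample values and the XML element names (`sqa`, `check`, `parameter`, `flagged`) are all assumptions made for illustration, and the two checks are generic versions rather than the exact rules of Meek and Hatfield (1994).

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Sequence


def limit_test(values: Sequence[float], lower: float, upper: float) -> List[int]:
    """Return the indices of values outside the interval [lower, upper]."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]


def change_test(values: Sequence[float], max_step: float) -> List[int]:
    """Return indices i where |values[i] - values[i-1]| exceeds max_step."""
    return [
        i for i in range(1, len(values))
        if abs(values[i] - values[i - 1]) > max_step
    ]


def results_to_xml(checks: Dict[str, dict]) -> str:
    """Serialize check parameters and flagged indices (hypothetical schema)."""
    root = ET.Element("sqa")
    for name, info in checks.items():
        check = ET.SubElement(root, "check", name=name)
        for key, value in info["parameters"].items():
            ET.SubElement(check, "parameter", name=key).text = str(value)
        flagged = ET.SubElement(check, "flagged")
        flagged.text = " ".join(str(i) for i in info["flagged"])
    return ET.tostring(root, encoding="unicode")


# Made-up temperature series with one implausible spike at index 2.
temperatures = [12.1, 12.4, 55.0, 12.9, 13.0]

limit_flags = limit_test(temperatures, lower=-40.0, upper=50.0)
change_flags = change_test(temperatures, max_step=5.0)

xml_text = results_to_xml({
    "limit": {"parameters": {"lower": -40.0, "upper": 50.0}, "flagged": limit_flags},
    "change": {"parameters": {"max_step": 5.0}, "flagged": change_flags},
})
```

In this sketch the change test flags both the jump into the spike and the jump back out of it, so a single bad value produces two flagged indices. This is one reason the user's own comment on the results remains important.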
Additional descriptive metadata for observational data#
Additional descriptive metadata are integrated into the WDC-Climate:
Instrument and Platform#
The instrument and platform metadata consist of three levels. The level structure and the lists of values are adopted from the NASA DIF (http://gcmd.nasa.gov/User/difguide/source_name.html and http://gcmd.nasa.gov/User/difguide/sensor_name.html).
Quality Level#
The quality level is adopted from the Global Ocean Data Assimilation Experiment (GODAE, http://www.godae.org/Data-definition.html).



