|Data Quality Control Links|
Data Quality Control Applications.
If you have any questions, please contact Ben Weiger or Matt Strahan.
We want to share with you information about the various data quality control applications and tools available in AWIPS.
1. There is an AWIPS local application called SHEF_CHEK that checks for SHEF syntax errors in incoming SHEF-encoded products. This application and its associated documentation are available on the AWIPS Local Application Database web site. The URL for this web site is:
2. The LDAD Quality Control and Monitoring System (QCMS) currently provides data quality control checking for certain hydrometeorological parameters contained in local meso-networks, ASOS observations, automated METAR observations from non-ASOS sources, manual METAR observations, buoy reports, and the NOAA Profiler network. Further information about the LDAD QCMS is contained at the following web sites:
(the link above provides an excellent overview about the LDAD QCMS)
(the website above is the official AWIPS LDAD System Manager Manual written by FSL)
(these links are to various sections of information about the LDAD QCMS. Section 8.4 provides you with the PILs to use to access the hourly, weekly, daily, and monthly QCMS text products on AWIPS)
3. There are also some data QC tools within the WFO Hydrologic Forecast System (WHFS). In Hydrobase, there is a range check window that you can access via the root window by clicking on Data Ingest, Range Check. The default range checks can be changed for individual stations and SHEF physical element codes via Hydrobase. Data that fall outside these limits can be displayed via HYDROVIEW using the LIVE DATA menu bar and selecting "Out-of-Range Data". Documentation about current WHFS data QC functionality is contained in the WHFS Users Guide. WHFS Version 3.0 in AWIPS Build 5.0 will provide additional flexibility and features associated with data QC. Further information about the data QC features in the Build 5.0 version of WHFS is contained in release notes available on the OH's WHFS support group web site. The URL for the website is:
http://www.nws.noaa.gov/oh/hod_whfs/documentation/whfs30_bld50rn.pdf
4. The information below from FSL contains some updated documentation that reflects the modifications made to the AWIPS LDAD data quality control (QC) tools in AWIPS Build 5.0.
Note: Included are sections from the AWIPS Users Guide on the QCMS and MSAS systems. Both sections are from Build 5.0. The differences between 4.3 and 5.0 are as follows:
QCMS - Build 5.0 includes SHEF encoding; Build 4.3 does not. (Further information about this subject will be included in OH's detailed document about Build 5.0 WHFS data QC tools)
MSAS - Build 5.0 includes extra analyzed (NWS SLP and altimeter) and derived (potential temperature advection and equivalent potential temperature advection) fields. Also, Build 5.0 has the ability to display the (QCed) observations used in each MSAS analysis (see "Accessing MSAS Observations" section below).
********************* Sections from the AWIPS Users Guide *************************
10. The AWIPS Observation Quality Control and Monitoring System
NOTE: The AWIPS QCMS deals with both NOAAPORT data and data from the LDAD System.
10.1 Overview of QCMS
The AWIPS Observation Quality Control and Monitoring System (QCMS) is being developed to supply forecasters with readily-available quality control information and statistics. Two types of quality control checks are considered: static checks, which are single-station and single-time checks, such as internal consistency checks and validity checks; and dynamic checks, which take advantage of other hydrometeorological information, such as temporal and spatial consistency checks.
Other requirements for the QCMS include the use of "data descriptors" which give an overall rating of the quality of each observation, development of a QC database for storage of QC results, and the ability for forecasters to override objective QC decisions.
The current QCMS is a partial implementation of the requirements for AWIPS quality control procedures. The implementation includes both subhourly and hourly QC processing. The subhourly processing consists of the application of validity, internal consistency, and temporal consistency checks to LDAD mesonet observations of sea-level pressure, temperature, dewpoint temperature, wind, station pressure, altimeter setting, pressure change, relative humidity, visibility, and precipitation. The hourly processing consists of the application of validity, internal consistency, temporal consistency, and spatial consistency checks to LDAD mesonet and NOAAPORT observations of sea-level pressure, temperature, wind, and dewpoint temperature.
With the subhourly processing, the QCMS checks every 5 minutes for newly arrived observations. Observations not previously checked are then immediately quality controlled. The QCMS also calculates hourly, daily, weekly, and monthly statistics on the frequency and magnitude of the observational errors encountered for sea-level pressure, temperature, dewpoint, and surface winds.
Enhancements are under development for future versions of the QCMS system.
10.1.1 QCMS Automated Checks
QCMS automated quality control (QC) procedures consist of validity, temporal consistency, internal consistency, and spatial consistency checks.
The validity checks restrict each observation to falling within a specified set of tolerance limits. The temporal consistency checks restrict the temporal rate-of-change of observations at each station to a set of (other) specified tolerance limits. In both cases, observations not falling within the limits are flagged as failing the respective QC check.
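The two checks just described can be sketched as follows. This is only an illustration: the function names, the tolerance limits, and the use of a per-hour rate are assumptions made for the example, not the limits or code used by the QCMS.

```python
def validity_check(value, lower, upper):
    """Return True if the observation lies within the tolerance limits."""
    return lower <= value <= upper


def temporal_check(prev_value, curr_value, hours_elapsed, max_rate_per_hour):
    """Return True if the rate of change at one station is within the limit.

    The per-hour rate and its limit are hypothetical; the QCMS limits are
    not given in this document.
    """
    if hours_elapsed <= 0:
        raise ValueError("hours_elapsed must be positive")
    rate = abs(curr_value - prev_value) / hours_elapsed
    return rate <= max_rate_per_hour


# Example with made-up temperature limits (degrees F).
passed_validity = validity_check(72.0, -60.0, 130.0)
passed_temporal = temporal_check(70.0, 75.0, 1.0, 35.0)
```

An observation for which either function returns False would be flagged as failing that check.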
The internal consistency checks enforce reasonable meteorological relationships among observations measured at a single station. For example, a dewpoint temperature observation must not exceed the temperature observation made at the same station. If it does, both the dewpoint and temperature observations are flagged as failing the internal consistency check. Pressure internal consistency checks include a comparison of pressure change observations at each station with the difference between the current station pressure and the station pressure three hours earlier, and a comparison of the reported sea-level pressure with a sea-level pressure estimated from the station pressure and the 12-hour mean surface temperature. In the former check, if the reported 3-hour pressure change observation does not match the calculated value, then only the reported observation is flagged as bad. In the latter check, however, if the reported sea-level pressure does not match the calculated value, then both the sea-level and station pressure observations are flagged as failing.
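As a rough sketch of the flagging rules in this paragraph, the dewpoint check and the 3-hour pressure change check might look like the code below. The function names, the variable names in the returned sets, and the matching tolerance are assumptions for illustration; the document does not say how closely the reported and calculated values must agree.

```python
def dewpoint_check(temperature, dewpoint):
    """Return the set of variables flagged by the dewpoint check.

    A dewpoint above the temperature flags both observations.
    """
    if dewpoint > temperature:
        return {"temperature", "dewpoint"}
    return set()


def pressure_change_check(reported_change, pressure_now, pressure_3h_ago,
                          tolerance):
    """Return the set of variables flagged by the 3-hour pressure change check.

    Only the reported pressure change is flagged on a mismatch.
    `tolerance` is a hypothetical parameter.
    """
    calculated_change = pressure_now - pressure_3h_ago
    if abs(reported_change - calculated_change) > tolerance:
        return {"pressure_change"}
    return set()
```

Returning a set of flagged variable names makes the asymmetry explicit: the dewpoint check flags two variables, while the pressure change check flags only one.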
The spatial consistency checks compare observations to values estimated from neighboring data using meteorological analysis techniques. The error threshold, to which the absolute value of the difference between estimated and observed values is compared, is a function of the expected analysis error. This helps account for differences in observed and estimated values, which may be acceptable due to estimation errors. The threshold also takes into account the distance of the surrounding stations, as well as the differences in elevation.
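The comparison described above can be illustrated with a small sketch. Note the substitutions: inverse-distance weighting stands in for the meteorological analysis technique actually used (which this document does not specify), and the threshold is assumed to be a simple multiple of the expected analysis error. Elevation differences are ignored here.

```python
import math


def idw_estimate(target_xy, neighbors, power=2.0):
    """Estimate a value at target_xy from neighbors [((x, y), value), ...].

    Inverse-distance weighting is a stand-in for the real analysis scheme.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in neighbors:
        dist = math.hypot(x - target_xy[0], y - target_xy[1])
        if dist == 0.0:
            return value  # a neighbor at the same location decides the estimate
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def spatial_check(observed, estimated, expected_analysis_error, factor=3.0):
    """Return True if |estimated - observed| is within the error threshold.

    `factor` is a hypothetical scaling of the expected analysis error.
    """
    threshold = factor * expected_analysis_error
    return abs(estimated - observed) <= threshold
```

A larger expected analysis error (for example, where neighboring stations are distant) loosens the threshold, which matches the intent described in the paragraph.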
10.1.2 QCMS Subjective Intervention
Two text files, a "reject" and an "accept" list, are provided to allow the site to override the results of the automated QC checks. The reject list is a list of stations and associated input observations that are labeled as bad, regardless of the outcome of the QC checks; the accept list is the corresponding list for stations that are labeled as good, regardless of the outcome of the QC checks. Applications reading the lists (e.g., MSAS) reject or accept the stations specified. In both cases, observations associated with the stations in the lists can be either flagged individually or in groups.
Note that the QCMS statistical procedures (and summary files) are not affected by the intervention lists. This allows you to continue to monitor the performance of the stations contained in the reject and accept lists. For example, you may notice a station with wind observations that fail the QC checks a large percentage of the time, and choose to have that station added to the reject list. However, once the observation failure rate at the station falls back to near zero (possibly due to an anemometer repair), you can recommend that the station be deleted from the list.
The Science and Operations Officer (SOO), ESA, or other focal point can change entries in the reject and accept lists. Information on how to edit these lists is contained in the System Manager's Guide.
10.1.3 QCMS Observation Files
In addition to the output described in Subsection 10.2, the QCMS writes netCDF and comma-separated-value (CSV) observation files for use by AWIPS applications programs. The netCDF files contain raw observations and the results of the automated and subjective QC procedures. Also included are single-character "data descriptors." These are data structures intended to define an overall opinion of the quality of each observation by combining the information from the various QC checks.
Table * provides a complete list of the netCDF data descriptors.
|Data Descriptor Definitions|
|Preliminary (Z)||No QC Applied|
|Coarse Pass (C)||Passed stage 1|
|Screened (S)||Passed stages 1 & 2|
|Verified (V)||Passed stages 1, 2, & 3|
|Erroneous (X)||Failed stage 1|
|Questionable (Q)||Passed stage 1, but failed stages 2 or 3|
|Subjective Good (G)||Included in accept list|
|Subjective Bad (B)||Included in reject list|
Table *. NetCDF data descriptor definitions. Stage 1 QC consists of observation validity checks; stage 2, temporal and internal consistency checks; and stage 3 spatial consistency checks.
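The descriptor table can be read as a small decision procedure. The sketch below is one possible reading, assuming each stage result is recorded as True (passed), False (failed), or None (not applied), and that a subjective override takes precedence over the automated results. The function name and argument encoding are hypothetical.

```python
def data_descriptor(stage1=None, stage2=None, stage3=None, override=None):
    """Return the single-character data descriptor for one observation.

    override is "accept", "reject", or None (assumed encoding of the
    accept and reject lists).
    """
    if override == "accept":
        return "G"  # Subjective Good
    if override == "reject":
        return "B"  # Subjective Bad
    if stage1 is None:
        return "Z"  # Preliminary: no QC applied
    if not stage1:
        return "X"  # Erroneous: failed stage 1
    if stage2 is False or stage3 is False:
        return "Q"  # Questionable: passed stage 1, failed stage 2 or 3
    if stage2 and stage3:
        return "V"  # Verified: passed stages 1, 2, and 3
    if stage2:
        return "S"  # Screened: passed stages 1 and 2
    return "C"      # Coarse Pass: passed stage 1 only
```

For example, `data_descriptor(True, True, False)` yields "Q", and an observation on the reject list yields "B" whatever the stage results are.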
Raw observations and data descriptors are also included in the CSV files, which are used as input to the LDAD SHEF encoder. The SHEF descriptors relate to the netCDF descriptors as follows (netCDF descriptor on the left, SHEF descriptor on the right):
|Z - no QC||Z|
|X - failed stage 1||R|
|Q - passed stage 1, failed 2 or 3||Q|
|C - passed stage 1||S|
|S - passed stages 1 and 2||V|
|V - passed stages 1, 2, and 3||P|
|G - subjective override - good||G|
|B - subjective override - bad||B|
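The correspondence in this table amounts to a lookup, sketched below. The dictionary and function names are made up for the example; only the letter pairs come from the table.

```python
# netCDF data descriptor -> SHEF descriptor, copied from the table above.
NETCDF_TO_SHEF = {
    "Z": "Z",  # no QC
    "X": "R",  # failed stage 1
    "Q": "Q",  # passed stage 1, failed 2 or 3
    "C": "S",  # passed stage 1
    "S": "V",  # passed stages 1 and 2
    "V": "P",  # passed stages 1, 2, and 3
    "G": "G",  # subjective override - good
    "B": "B",  # subjective override - bad
}


def to_shef_descriptor(netcdf_descriptor):
    """Translate a netCDF data descriptor to its SHEF counterpart."""
    try:
        return NETCDF_TO_SHEF[netcdf_descriptor]
    except KeyError:
        raise ValueError(f"unknown netCDF descriptor: {netcdf_descriptor!r}")
```

Note that the letters S and V mean different things on the two sides of the table, so the translation should always go through the mapping rather than being copied across.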
10.2 QCMS Text Output and Displays
10.2.1 Accessing QCMS Summary Files
You can access the text QCMS files via the Text Display (refer to Section 4 on use of the Text Display). The nine-character descriptor name for QC messages is as follows:
CCCNNNXXX
Here, "CCC" is the number assigned to the data provider. For example, the first five numbers are assigned to national data sets:
001 = SAO (METAR manual)
002 = Buoy
003 = NPN (NOAA Profiler Network)
004 = AUTO (automated, non-ASOS)
005 = ASOS
Slots 006 - 020 are assigned to local data networks, ingested into the LDAD system. For example, the NWS office in Denver currently has two local data sets:
006 = Colorado Department
007 = ALERT Weather
The "NNN" identifies the desired QC summary file, for example:
QCH = hourly
QCD = daily
QCW = weekly
QCM = monthly (4 week)
Finally, the "XXX" is the three-character NWS field office name, such as SLC, SEA, DEN, OKC, etc. For example, in the entry box in a Text Window, the descriptor "003QCDSLC" generates a daily summary of NPN quality control statistics.
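To make the descriptor layout concrete, the sketch below splits a nine-character descriptor into its three parts. The function and class names are hypothetical, and the validation rules (numeric provider slot, known summary code) are assumptions based on the examples given here.

```python
from typing import NamedTuple

# Summary codes and national provider numbers listed in this section.
SUMMARY_TYPES = {
    "QCH": "hourly",
    "QCD": "daily",
    "QCW": "weekly",
    "QCM": "monthly (4 week)",
}
NATIONAL_PROVIDERS = {
    "001": "SAO (METAR manual)",
    "002": "Buoy",
    "003": "NPN (NOAA Profiler Network)",
    "004": "AUTO (automated, non-ASOS)",
    "005": "ASOS",
}


class QcDescriptor(NamedTuple):
    provider: str  # CCC
    summary: str   # NNN
    site: str      # XXX


def parse_descriptor(text):
    """Split a nine-character QC descriptor into provider, summary, and site."""
    if len(text) != 9:
        raise ValueError("descriptor must be nine characters")
    provider, summary, site = text[:3], text[3:6], text[6:]
    if not provider.isdigit():
        raise ValueError("provider slot (CCC) must be numeric")
    if summary not in SUMMARY_TYPES:
        raise ValueError(f"unknown summary code: {summary}")
    return QcDescriptor(provider, summary, site)


parsed = parse_descriptor("003QCDSLC")
```

Here `parsed` holds provider "003" (NPN), summary "QCD" (daily), and site "SLC", matching the example in the text.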
QC statistics for national data sets, such as the NPN, are generated at each WFO, as are those for the local data sets. Statistics are maintained separately for each given network, such as ASOS, METARs, buoy, etc. In addition, ASOS network statistics are subdivided by individual NWS Regions.
10.2.2 QCMS Summary File Descriptions
As previously mentioned, the QCMS collects statistics on observational errors of sea-level pressure, temperature, wind, and dewpoint. Hourly, daily, weekly, and monthly summaries are then made available. The following statistics are generated, although some are not available in every summary, as noted:
* Total number of observations for each variable;
* Number of observations which failed any QC check; station identifiers of failed observations;
* Error and threshold values (from the spatial consistency check) for each failed observation (hourly only);
* Root-mean-square (RMS) error/mean error/percentage failure for failed observations at each station (daily/weekly/monthly).
Time interval of summary data, shown in upper left corner of page.
* SLP (MB) - Mean-sea level pressure, in millibars.
* POT TEMP (DEG F) - Potential temperature, in degrees Fahrenheit.
* DEW PNT (DEG F) - Dewpoint temperature, in degrees Fahrenheit.
* DD (DEG) - Wind direction, in degrees.
* FF (KNTS) - Wind speed, in knots.
Number of observations for each variable for 00 UTC 22 January 1998 for the ASOS network. Statistics are calculated for the entire country, but are grouped by region.
Number of questionable (failed) observations for each variable.
Percentage of failure for each variable.
Name of failed stations, given in column 1. These names vary depending on the data set.
Amount of error (defined as QC estimation minus observation) for each variable.
The difference allowed between the estimated and observed values, given in parentheses.
Root-mean-square error of failed observations for each station during the prescribed time period.
Mean error of failed observations for each station during the prescribed time period.
Percentage of failed observations, or failure rate, for each station for the prescribed time period.
Daily, weekly, and monthly summaries include only those stations with observations that have failed more than 25% of the time.
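The per-station statistics described above can be sketched as follows, assuming the errors (QC estimation minus observation) of the failed observations have already been collected for one station and one variable. The function names and the returned dictionary layout are illustrative only.

```python
import math


def station_summary(failed_errors, total_obs):
    """Compute RMS error, mean error, and failure rate for one station.

    failed_errors: errors (estimate minus observation) of failed observations.
    total_obs: total number of observations in the period.
    """
    if total_obs <= 0:
        raise ValueError("total_obs must be positive")
    n_failed = len(failed_errors)
    pct_failed = 100.0 * n_failed / total_obs
    if n_failed == 0:
        return {"rms": None, "mean": None, "pct_failed": pct_failed}
    rms = math.sqrt(sum(e * e for e in failed_errors) / n_failed)
    mean = sum(failed_errors) / n_failed
    return {"rms": rms, "mean": mean, "pct_failed": pct_failed}


def include_in_summary(pct_failed, cutoff=25.0):
    """Daily, weekly, and monthly summaries list stations above the cutoff."""
    return pct_failed > cutoff


summary = station_summary([3.0, -4.0], total_obs=4)
```

With two failures out of four observations, the failure rate is 50%, so this station would appear in a daily, weekly, or monthly summary.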
*Module 30: Working with QCMS Summary Files
Objective 1 - Retrieve a Weekly QC Summary File of METAR Manual Data from a Local WFO
This objective describes how to display a weekly summary of QC statistics from METAR manual data.
1. In a Text Window, type "001QCWXXX", where XXX is your local site ID. Observe the statistical summary as it is displayed.
2. Determine the overall failure rate for each of the five variables. These are located in the "PERCENT QST" row.
3. Examine the stations in the report and identify the RMS error, mean error, and failure rate for different variables.
4. Familiarize yourself with the other statistics in this report.
5. Try loading other monthly, weekly, daily, or hourly reports.
10.2.4 QCMS LDAD Mesonet Displays
In addition to the text QC output, AWIPS can display LDAD QC information along with the raw Mesonet observations. (Mesonet data includes observations from the Department of Transportation, Alert Weather, RAWS, cooperative schools, and other cooperative participants. These observations will vary among forecast offices.) The QC displays consist of color-coded station plots. Stations with observations found bad by the QCMS are distinctly colored to indicate possible problems with their reported data. Pointing and clicking on any station invokes the display of a small QC table indicating which QC checks have been applied at the time of the display, which ones have been passed, and which ones have been failed. Plots are automatically updated as new data arrives and is quality controlled.
Blanks in the table indicate that the associated QC check was not applied. In cases where the check was applied, the observation either passed (P) or failed (F) the automated checks, or was labeled good (G) or bad (B) through the subjective intervention procedures.
To access the QC plots, you can be on the Regional, State(s), or WFO Scales. Then select the Local Data - Other Plots Cascading Menu under the Surface Pull-Down Menu.
2. The AWIPS Workstation
2.1.6 The Menu Bar
MSAS Surface Analysis
The Mesoscale Analysis and Prediction System (MAPS) Surface Assimilation System (MSAS) was built to exploit the spatial density and temporal frequency of surface data by providing timely and detailed surface analyses. It currently provides hourly analyses on a 40-km grid covering the 48 contiguous States and neighboring areas of Canada and Mexico and uses persistence (the previous hourly analysis) as the background for the current analysis.
One-hour persistence provides an accurate forecast and allows the incorporation of previous surface observations into the analysis. More important, it ensures continuity between analyses, especially near stations that report less frequently than hourly. Persistence, however, cannot be used in data-void or data-sparse areas such as oceans. In these regions, gridded data from NCEP's Eta model are used as a background to ensure that the analysis does not stray far from reality. The Eta grids are linearly combined with 1-h persistence, using weights calculated to produce a persistence forecast over data-dense areas, a model forecast over data-sparse areas, and a smooth transition between the two.
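The linear combination described in this paragraph can be illustrated with a simple weighted blend. How MSAS computes its weights is not stated here, so the sketch assumes a weight grid is already available, with 1.0 meaning fully data-dense and 0.0 meaning data-void. All names are hypothetical.

```python
def blend_background(persistence, eta, weight):
    """Blend one grid point: weight 1.0 gives persistence, 0.0 gives Eta."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return weight * persistence + (1.0 - weight) * eta


def blend_grids(persistence_grid, eta_grid, weight_grid):
    """Apply the blend point by point to grids stored as nested lists."""
    return [
        [blend_background(p, e, w) for p, e, w in zip(p_row, e_row, w_row)]
        for p_row, e_row, w_row in zip(persistence_grid, eta_grid, weight_grid)
    ]


# One row with a data-dense point, a transition point, and a data-void point.
background = blend_grids([[10.0, 10.0, 10.0]],
                         [[20.0, 20.0, 20.0]],
                         [[1.0, 0.5, 0.0]])
```

Intermediate weights give the smooth transition between the persistence and model backgrounds.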
Since rough terrain can complicate the analysis of surface variables, MSAS attempts to obtain analyses with improved spatial continuity from mountainous observations through careful choice of analysis methods and variables.
MSAS incorporates elevation and potential temperature differences in the correlation functions used to model the spatial correlation of the surface observations. The resulting functions help to take into account physical blocking by mountainous terrain, and improve the representation of surface gradients.
In addition, MSAS analysis variables were chosen, whenever possible, in such a way as to minimize the effects of varying terrain. Potential temperature, for instance, is analyzed instead of surface temperature because it varies more smoothly over mountainous terrain when the boundary layer is relatively deep and well mixed.
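For reference, potential temperature is conventionally obtained from temperature and pressure with Poisson's equation. The sketch below uses the standard dry-air exponent and a 1000-mb reference pressure; whether MSAS uses exactly these constants is an assumption. It works in kelvins, whereas the QCMS summaries report potential temperature in degrees Fahrenheit.

```python
KAPPA = 0.2857        # approximate R_d / c_p for dry air
REFERENCE_MB = 1000.0  # conventional reference pressure, in millibars


def potential_temperature(temp_k, pressure_mb):
    """Return potential temperature (K) from temperature (K) and pressure (mb)."""
    if temp_k <= 0.0 or pressure_mb <= 0.0:
        raise ValueError("temperature and pressure must be positive")
    return temp_k * (REFERENCE_MB / pressure_mb) ** KAPPA
```

Because the formula removes most of the pressure (and therefore elevation) dependence of temperature, two well-mixed stations at different elevations tend to report similar potential temperatures, which is why the field varies more smoothly over mountainous terrain.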
The major MSAS pressure variable is a sea level pressure computed at each station from altimeter setting observations. Station pressures calculated from the altimeter settings are reduced to sea level using the 700-mb Eta temperature to estimate an effective surface temperature. This reduction generally provides smoother regional, diurnal, and seasonal variation since it avoids the use of actual surface temperatures, which are often unrepresentative of the surrounding conditions. Moreover, more data are available for analysis of the MSAS reduction because more stations report altimeter setting than report sea level pressure.
Enhancements are under development for future versions of the MSAS system.
Observations Ingested into MSAS
MSAS utilizes most surface observations contained in its domain. These include standard METARs, surface reports from fixed buoys and the NOAA Profiler Network, as well as surface observations from any local mesonets ingested through the LDAD system (refer to Section 9).
Observations failing the automated quality control checks implemented by the QCMS (refer to Section 10), or listed in the QCMS subjective reject list, are not ingested or analyzed by MSAS.
Analyzed Grids Produced by MSAS
* MSAS Mean Sea Level (MSL) Pressure
* NWS Mean Sea Level (MSL) Pressure
* 3 Hour Pressure Change
* Wind Barbs
* Dewpoint Temperature
* Dewpoint Depression
* Potential Temperature
Derived Fields Produced from MSAS Grids
* Lifted Index
* Moisture Convergence
* Equivalent Potential Temperature
* Temperature Advection
* Relative Vorticity
* Potential Temperature Advection
* Equivalent Potential Temperature Advection
Accessing MSAS Grids
Select the MSAS option on the D2D Surface Menu.
Accessing MSAS Observations
In addition to MSAS gridded output, AWIPS has the ability to display the observations used in each MSAS analysis.
The displays consist of color-coded observation plots. Pointing and clicking on any observation gives the station ID associated with the observation. Observations ingested by MSAS, but not used due to QC failures, are distinctly colored. Pointing and clicking on these observations invokes the display of a small QC table indicating which QC checks have failed.
Figure * shows an example of an MSAS QC table. The observation either failed (F) the automated checks, or was labeled bad (B) through the subjective intervention procedures.
See Section 10 on the AWIPS QCMS for more information on the QC procedures.
Figure * - Example of an MSAS QC table accessible through the D2D Surface Menu.
To access MSAS observations, select the MSAS option on the D2D Surface Menu. See Figure 2.1.6-25.
Solutions to Potential MSAS Problems
Refer to System Manager's