SR SSD 98-39
INTERCOMPARISONS AMONG THE NGM-MOS, AVN-MOS, AND CONSENSUS TEMPERATURE FORECASTS FOR WEST TEXAS
Gregory E. Wilk
NWSO Corpus Christi, TX
In the formulation of National Weather Service zone forecast products, statistical guidance is available from two main sources. First, numerical guidance (MOS - Model Output Statistics) based on output from the Nested Grid Model (NGM) provides the forecaster with air and dew point temperature, wind speed and direction, cloud amount and intervals of height, and other variables every three hours, along with 12-hour forecasts of temperature extremes (maximum and minimum), probability and type of precipitation, and thunderstorm probability, out to 60 hours. Second, MOS guidance is derived from the Aviation (AVN) or Spectral model. This guidance provides 12-hour forecasts of maximum and minimum temperature, probability of precipitation, mean opaque cloudiness, and conditional probability of snow, out to 72 hours.
A study by Vislocky and Fritsch (1995) compared the NGM-based and LFM-based MOS temperature, wind, cloud and precipitation probability forecasts with a "consensus" forecast, formulated by averaging the two MOS forecasts. Their study, comprising 250-350 stations and spanning a three-year period (1990-1992), found consensus forecasts superior to both the NGM and LFM forecasts. Also, when the NGM and LFM MOS temperature forecasts diverged, they found that the consensus forecast should be skewed closer to the LFM.
The purposes of this study were to determine: 1) whether a consensus temperature forecast (henceforth known as "CON") would provide the best forecast for West Texas stations, especially when the NGM and AVN temperature MOS guidance diverged; and 2) what forecast errors can be expected among the NGM, AVN, and CON forecasts. This paper will mainly focus on results from the cool season (October 1 - March 31), since the AVN and NGM guidance diverges more frequently during this period. However, a brief summary of the results during the warm season (April 1 - September 30) will also be presented.
Surface observations were obtained, and maximum and minimum temperatures were determined from October 1, 1995, through September 30, 1996, for five West Texas stations: Amarillo, Lubbock, Midland, San Angelo and El Paso. Since the NGM and AVN MOS guidance attempts to predict minimum temperatures between 7 p.m. and 8 a.m. LST, and maximum temperatures between 7 a.m. and 7 p.m. LST (NWS, 1992 and 1994), these windows were preserved as much as possible. That is, if the minimum temperature occurred shortly after 8 a.m. and was not caused by a weather phenomenon (e.g., rain-cooled temperatures or the passage of a front), that temperature was used. Computer programs were written to extract the NGM and AVN 24-hr, 36-hr, 48-hr, and 60-hr maximum and minimum temperature forecasts for each station, and to combine them with the observed temperatures in data tables. MOS performance for each model was then evaluated via several tests, including simple statistical analysis.
CON forecasts were obtained by taking the arithmetic mean of the NGM and AVN forecasts. When the CON was not a whole number, it was rounded to the nearest even whole number: up when the whole-number part was odd (e.g., 67.5 becomes 68) and down when it was even (e.g., 66.5 becomes 66). In theory, this method splits the non-whole CON forecasts evenly between rounding up and rounding down (comparisons between the "pure" CON and "whole" CON mean forecast errors showed differences of less than 0.05F).
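The averaging and rounding rule described above can be sketched in Python as follows. This is a minimal illustration, not the program used in the study; the function name `consensus` is hypothetical, and the inputs are assumed to be whole-degree Fahrenheit MOS forecasts.

```python
def consensus(ngm: int, avn: int) -> int:
    """Average two whole-degree forecasts, rounding a .5 mean to the even neighbor."""
    total = ngm + avn
    lower = total // 2  # floor of the arithmetic mean
    if total % 2 == 0:
        return lower  # the mean is already a whole number
    # The mean ends in .5: choose whichever neighbor (lower or lower + 1) is even.
    return lower if lower % 2 == 0 else lower + 1


print(consensus(67, 68))  # mean 67.5 -> 68
print(consensus(66, 67))  # mean 66.5 -> 66
```

Integer arithmetic is used here so that the rule is explicit and does not depend on floating-point behavior.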
3.0 DERIVATION OF MOS TEMPERATURE FORECASTS
MOS temperature forecasts are derived from NGM and AVN model output using least squares linear regression equations (NWS, 1992 and 1994). Parameters (predictors) used in the regression equations include observed surface air temperatures near model initialization time (if available), day-of-year considerations, and model forecasts of temperatures, humidity, thickness, and wind at various atmospheric levels or layers. The forecast hour used for each predictor in a particular equation varies; the same forecast hour may be used (e.g., the NGM 24-hr 850mb dew point temperature forecast may be used to obtain the 24-hr minimum temperature forecast), or a different forecast hour may be used (e.g., the NGM 30-hr 850mb dew point temperature forecast may be used to obtain the 24-hr minimum temperature forecast). The predictors vary from station to station, season to season, forecast interval to forecast interval, and temperature extreme being forecast.
The temperature forecast (T) is obtained by multiplying each predictor (x_i) by a coefficient (a_i, determined from least-squares analysis), summing it with the other predictor-coefficient couplets, then adding the sum to a constant (a_0), such that:
T = a_0 + a_1 x_1 + a_2 x_2 + a_3 x_3 + ... + a_n x_n
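As a rough illustration of how such an equation is evaluated, the following Python sketch applies a constant and a set of coefficients to predictor values. The function name `mos_temperature` and all numbers are hypothetical and chosen only to make the arithmetic easy to follow; they are not actual MOS coefficients or predictors.

```python
def mos_temperature(constant, coefficients, predictors):
    """Evaluate T = a_0 + a_1*x_1 + ... + a_n*x_n for one forecast."""
    if len(coefficients) != len(predictors):
        raise ValueError("each predictor needs exactly one coefficient")
    return constant + sum(a * x for a, x in zip(coefficients, predictors))


# Hypothetical example: a_0 = 10.0, two predictors with made-up coefficients.
forecast = mos_temperature(10.0, [0.5, 0.25], [60.0, 40.0])
print(forecast)  # 10.0 + 0.5*60.0 + 0.25*40.0 = 50.0
```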
4.0 SUMMARY OF WEST TEXAS TEMPERATURES, OCTOBER 1995 - SEPTEMBER 1996
Above normal temperatures were observed for all five stations for the period October 1995 - March 1996, with mean departures ranging from 0.6F at San Angelo to 2.9F at El Paso. The slightly above normal departures (less than 1.0F) at Amarillo, Lubbock, and San Angelo were the result of maximum temperatures of around 2.5F above normal being offset by minimum temperatures of around 1.0F below normal. El Paso was the only station to have both temperature extremes above normal (Midland had near normal minima).
February was the warmest month (3.6F to 6.4F above normal), while March was the coolest (0.5F to 4.5F below normal). Above normal temperatures were also observed during the period April 1996 to September 1996 at all stations except Amarillo (a -0.4F mean departure). Mean departures ranged from 0.8F at Lubbock to 1.9F at Midland, with minimum temperature departures higher than maximum temperature departures.
Table 1 shows the frequency of temperature forecast differences between the NGM and AVN MOS forecasts. Note that the NGM and AVN were more likely to have similar temperature forecasts for minima than for maxima. For the 24-hr forecasts, the NGM and AVN differed by 3F or less at least 55% and 65% of the time for maximum and minimum temperatures, respectively. By 60-hr, the percentages decreased to less than 50% for maxima (except at El Paso) and 63% for minima. Thus (as would be expected), the longer the forecast period, the more likely a CON forecast was to differ from the AVN and NGM forecasts.
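A frequency of the kind summarized in Table 1 could be computed along the following lines. This is only a sketch under stated assumptions: the function name `percent_within` is hypothetical, the forecasts are assumed to be paired day by day, and the sample values are invented rather than taken from the study.

```python
def percent_within(ngm_forecasts, avn_forecasts, threshold=3):
    """Percentage of paired forecasts that differ by `threshold` degrees F or less."""
    pairs = list(zip(ngm_forecasts, avn_forecasts))
    if not pairs:
        raise ValueError("no forecast pairs supplied")
    close = sum(1 for ngm, avn in pairs if abs(ngm - avn) <= threshold)
    return 100.0 * close / len(pairs)


# Invented sample: the differences are 2, 5, 0 and 4 degrees F.
ngm = [60, 65, 70, 55]
avn = [62, 70, 70, 51]
print(percent_within(ngm, avn))  # 50.0
```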
5.1 AVERAGE ERRORS
Table 2 shows the average errors for the six-month period. As a rule, as the six-month average became more positive (negative), the frequency of forecasts that were too warm (cold) increased. For example, the AVN and NGM overall were too warm at Lubbock, and consequently the frequency of Lubbock forecasts that were too warm exceeded the frequency of forecasts that were too cool by about 2:1 by the 60-hr forecast.
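The relationship between the average error and the warm/cold frequencies can be illustrated with a short Python sketch. It assumes the error is defined as forecast minus observed, so that a positive value means the forecast was too warm, which matches the sign convention described above. The function name `bias_summary` and the sample temperatures are hypothetical.

```python
def bias_summary(forecasts, observed):
    """Return (mean error, count too warm, count too cold) for paired values."""
    errors = [f - o for f, o in zip(forecasts, observed)]
    if not errors:
        raise ValueError("no forecast/observation pairs supplied")
    mean_error = sum(errors) / len(errors)
    too_warm = sum(1 for e in errors if e > 0)  # forecast above observed
    too_cold = sum(1 for e in errors if e < 0)  # forecast below observed
    return mean_error, too_warm, too_cold


# Invented sample: the errors are +2, -1, +4 and 0 degrees F.
print(bias_summary([52, 48, 55, 50], [50, 49, 51, 50]))  # (1.25, 2, 1)
```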
It is also worth noting that guidance generally became warmer with increasing forecast lead time. By the 60-hr forecast, nearly every model averaged too warm. This seems to suggest that the NGM and AVN over-compensated for a warmer than normal weather pattern. This trend for warmer forecasts with time was more apparent for minimum temperatures than for maximum temperatures, mainly at Amarillo, Lubbock, and San Angelo. Since minimum temperature departures were below normal at Amarillo, Lubbock, and San Angelo, this suggests that more (or different) predictors may be needed in the MOS equations at these stations.
5.2 ABSOLUTE ERRORS
Table 3 shows the mean absolute errors during the cool season. Note the CON had the lowest errors in all but a few cases (the main exception being San Angelo for maximum temperatures). When the NGM and AVN differed by more than 3F (table not shown), the CON still provided the lowest errors, and at times these errors were even lower than the corresponding CON errors shown in Table 3. The CON tended to reduce maximum temperature errors more than minimum temperature errors, improving the next-best forecast model by as much as 1.1F when the NGM and AVN differed by more than 3F (and as much as 0.5F for all data).
In most cases, the AVN tended to perform a little better than the NGM for maximum temperatures, while the opposite was true for minimum temperatures. Finally, it should be noted that the NGM errors were similar to those found in a previous study by Dagostaro and Dallavalle (1997).
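The mean absolute error comparison can be sketched in Python as below. The sample forecasts are invented and deliberately chosen so that the NGM and AVN errors fall on opposite sides of the observation, which is the situation in which averaging helps most. The function name `mean_absolute_error` is hypothetical, and Python's built-in `round`, which rounds halves to the even neighbor, stands in for the study's rounding rule.

```python
def mean_absolute_error(forecasts, observed):
    """Average of |forecast - observed| over paired values."""
    errors = [abs(f - o) for f, o in zip(forecasts, observed)]
    if not errors:
        raise ValueError("no forecast/observation pairs supplied")
    return sum(errors) / len(errors)


# Invented sample data in degrees F.
observed = [52, 58, 71]
ngm = [50, 60, 70]
avn = [54, 56, 74]
# Consensus: arithmetic mean, with .5 cases rounded to the even whole number.
con = [round((n + a) / 2) for n, a in zip(ngm, avn)]

for name, forecasts in (("NGM", ngm), ("AVN", avn), ("CON", con)):
    print(name, round(mean_absolute_error(forecasts, observed), 2))
```

In this made-up sample the CON error is smaller than either model's error because the individual errors partly cancel; real data will not always behave this way.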
5.3 FREQUENCY OF BEST FORECAST
Table 4 shows the percentage of time each model gave the best (closest to observed temperature) forecast(1). As can be seen, the CON is NOT the best forecast method on a day-by-day basis (even if one included the "OTH" with the CON numbers). For the most part, neither the AVN nor the NGM had a clear advantage for maximum temperatures. However, the AVN performed better at San Angelo and for the 60-hr forecast at all stations, while the NGM was preferred at El Paso (mainly the 0000 GMT cycle). For minimum temperatures, the NGM showed a slight advantage over the AVN for the 36-hr and 48-hr forecasts.
Even when the NGM and AVN MOS forecasts differed by more than 3F, the CON still did not provide the best forecast on a day-to-day basis. In these cases, the AVN performed better than the NGM for maximum temperatures (especially for the 60-hr forecast). For minimum temperatures, the NGM was better at Midland, while the AVN was much better at San Angelo after 24-hr (elsewhere no clear advantage was found). Thus, using a CON forecast on a day-to-day basis is not the optimum approach to operational forecasting; one must determine the best MOS forecast (if any) for that forecast package.
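A day-by-day "best forecast" tally like the one discussed in this section might be computed as in the Python sketch below. It is an illustration only: the names `best_model` and `best_forecast_percentages` are hypothetical, the sample cases are invented, and any tie for the smallest absolute error is labeled "OTH", which is a slightly broader reading of the footnote's definition than the study may have used.

```python
from collections import Counter


def best_model(ngm, avn, con, observed):
    """Name the forecast closest to the observation, or 'OTH' when there is a tie."""
    errors = {
        "NGM": abs(ngm - observed),
        "AVN": abs(avn - observed),
        "CON": abs(con - observed),
    }
    smallest = min(errors.values())
    winners = [name for name, err in errors.items() if err == smallest]
    return winners[0] if len(winners) == 1 else "OTH"


def best_forecast_percentages(cases):
    """Percentage of cases won by each category; cases are (ngm, avn, con, obs)."""
    if not cases:
        raise ValueError("no cases supplied")
    counts = Counter(best_model(*case) for case in cases)
    return {name: 100.0 * counts[name] / len(cases)
            for name in ("NGM", "AVN", "CON", "OTH")}


# Invented cases in degrees F: (NGM, AVN, CON, observed).
cases = [(60, 64, 62, 59), (60, 64, 62, 66), (60, 64, 62, 62), (67, 68, 68, 70)]
print(best_forecast_percentages(cases))
```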
5.4 FORECAST ERROR RANGES
Table 5 shows the percentage of time each model's forecast error fell into the 0F-3F, 0F-5F, and 10F or greater ranges for maximum temperatures. In most instances, the CON had the most forecasts in the 0F-3F and 0F-5F ranges. Although the CON produced forecasts within 3F only about 50% of the time or less after 36-hr (except at El Paso), it did provide forecasts within 5F about two-thirds of the time or more through 60-hr. Also, note that the differences between the CON and AVN results were usually small. It is clear, however, that the NGM was the least preferred model (again, except at El Paso).
As Table 6 shows, the CON forecast did slightly better for minimum temperatures at most stations, especially after the 24-hr forecast. Here, the CON produced forecasts within 5F at least two-thirds of the time through 60-hr (except at San Angelo, since the NGM did poorly there). Comparing Table 5 with Table 6, it appears that better forecasts were produced for maximum temperatures at Lubbock, San Angelo, and El Paso, and for minimum temperatures at Amarillo and Midland. Also, all three models forecast fewer busts (10F or greater) for minimum temperatures. Finally, the NGM was a little better than the AVN in producing minimum temperature forecasts within 5F at most stations.
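The error-range percentages reported in Tables 5 and 6 can be illustrated with the following Python sketch. The function name `error_range_percentages`, the dictionary keys, and the sample temperatures are all hypothetical; the thresholds (3F, 5F, and 10F or greater) follow the ranges named in the text.

```python
def error_range_percentages(forecasts, observed):
    """Percentage of forecasts within 3F, within 5F, and off by 10F or more."""
    abs_errors = [abs(f - o) for f, o in zip(forecasts, observed)]
    if not abs_errors:
        raise ValueError("no forecast/observation pairs supplied")
    n = len(abs_errors)
    return {
        "within_3F": 100.0 * sum(1 for e in abs_errors if e <= 3) / n,
        "within_5F": 100.0 * sum(1 for e in abs_errors if e <= 5) / n,
        "bust_10F_or_more": 100.0 * sum(1 for e in abs_errors if e >= 10) / n,
    }


# Invented sample: the absolute errors are 1, 4, 0, 12 and 6 degrees F.
print(error_range_percentages([50, 55, 60, 70, 65], [51, 59, 60, 58, 71]))
```

Note that the 0F-3F range is a subset of the 0F-5F range, so the first percentage can never exceed the second.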
When differences between the NGM and AVN MOS forecasts exceeded 3F, the CON still provided the best results most of the time. Notable exceptions to this were maximum temperatures at San Angelo after 36-hr (AVN better), and minimum temperatures at El Paso through 36-hr (NGM better). For maximum temperatures, the NGM was slightly preferred over the AVN at most stations through 36-hr; after 36-hr the AVN was much better than the NGM (even at El Paso). This suggests that a CON forecast might be improved by skewing it toward the AVN after 36-hr. For minimum temperatures, the NGM was better than the AVN, especially after 36-hr (except at San Angelo). This suggests that the CON forecast could be improved by skewing it toward the NGM after 36-hr.
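One simple way to skew a consensus toward a preferred model is a weighted average, sketched below in Python. The study does not specify a weighting, so the 0.75 weight, the function name `skewed_consensus`, and the sample forecasts are purely illustrative assumptions.

```python
def skewed_consensus(preferred, other, weight=0.75):
    """Weighted mean that leans toward `preferred`; weight=0.5 gives the plain CON."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    # Built-in round() sends exact .5 cases to the even whole number.
    return round(weight * preferred + (1.0 - weight) * other)


# Hypothetical 60-hr maximum temperature case: AVN preferred over NGM.
avn, ngm = 68, 60
print(skewed_consensus(avn, ngm, weight=0.5))   # plain consensus: 64
print(skewed_consensus(avn, ngm, weight=0.75))  # skewed toward the AVN: 66
```

Any operational weight would need to be derived from a longer verification record than the one used here.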
6.0 RESULTS FOR THE WARM SEASON, APRIL THROUGH SEPTEMBER 1996
During the warm season (April through September 1996), the CON forecast did not always provide the lowest forecast errors, as Table 7 shows. Although the absolute error differences between the best and worst model forecasts were never greater than 0.8F for maximum temperatures and 0.5F for minimum temperatures, the NGM tended to produce the largest errors, especially after 36-hr. Exceptions to this rule were the 24-hr minimum temperature forecasts at all stations, and at El Paso for maximum temperatures. At Lubbock and El Paso, the AVN (and to a lesser degree, the NGM) tended to forecast maximum temperatures that were too warm (by more than 2:1 at times). This suggests that, in the long term, a forecaster's absolute errors may be decreased if he/she goes slightly below AVN guidance at these stations.
For minimum temperatures, model forecasts were usually too cold at San Angelo, Midland and El Paso, with mean errors of near -1.0F, -2.2F, and -3.2F for the six-month period, respectively. Thus, long-term minimum temperature forecast errors may be decreased at these stations if forecasters go slightly above guidance. However, tests using more data will be needed to confirm these results.
For maximum temperatures, the NGM was least likely to fall in the 0F-3F and 0F-5F ranges (except at El Paso), while little difference among the three forecasts was seen for minimum temperatures. As would be expected, MOS from both models along with CON provided more forecasts within 3F and 5F of observed temperatures than they did during the cool season, and had fewer forecast busts (again, El Paso was the exception). By the 60-hr forecast, 66-85% of CON maximum and 73-88% of CON minimum forecasts were within 5F of observed temperatures. Overall, guidance was better for minimum temperatures at Amarillo, Lubbock, and San Angelo, and better for maximum temperatures at El Paso and Midland.
Finally, as would be expected, there were far fewer times when the NGM and AVN MOS guidance differed by more than 3F during the six-month period. The number of times this occurred ranged from as few as 9 and 27 days for the 24-hr minimum and maximum temperature forecasts, respectively, to no more than 35 and 75 times for the 60-hr maximum and minimum temperature forecasts, respectively. When considering results where 30 or more days were available (mainly maximum temperatures beyond 24-hr), the CON or AVN MOS forecast was preferred (except at El Paso), with the AVN best for the 60-hr forecast. However, these results may be questionable due to the limited amount of data.
Results from the cool season (October 1995 through March 1996) clearly showed that the CON forecast provided the lowest mean forecast errors for the six-month period. Also, the CON produced more forecasts within 5F of observed temperatures than either the NGM or AVN. However, on a day-to-day basis, the CON clearly is not the best forecast method. Additionally, the NGM and AVN MOS forecasts (and therefore the CON) tended to become warmer as the forecast lead time increased, thus seemingly over-compensating for the warmer than normal temperatures observed during the period.
Results also showed that as model guidance diverges and forecast hour increases, a CON forecast could be improved even more by skewing it toward the preferred model at that station; in most cases the AVN for maximum temperatures and NGM for minimum temperatures.
As with any investigation, results using more data (including times when observed temperature departures were different from this study) would help substantiate the findings in this West Texas study. Hopefully, this study will help West Texas forecasters realize the errors they may encounter when using temperature guidance at face value. Forecasters should always keep in mind that MOS guidance is largely based on model forecast parameters. Thus, if the model forecast parameters are poor, MOS guidance will be poor.
The author wishes to thank the Techniques Development Lab at NWS Headquarters for the NGM and AVN MOS equations they graciously provided for this study. Also, thanks to Loren Phillips (SOO, NWSFO Lubbock) for his input, comments and suggestions.
Dagostaro, V.J. and J.P. Dallavalle, 1997: AFOS-Era Verification of Guidance and Local Aviation/Public Weather Forecasts--No. 23 (October 1994-March 1995). TDL Office Note 97-3, 53pp.
National Weather Service, 1992: NGM-Based MOS Guidance - The FOUS14/FWC Message. NWS Technical Procedures Bulletin No. 408. Techniques Development Laboratory, NOAA, U.S. Department of Commerce, 7pp.
National Weather Service, 1994: The AVN-Based Statistical Guidance Message. Technical Procedures Bulletin No. 415. Techniques Development Laboratory, NOAA, U.S. Department of Commerce, 5pp.
Vislocky, R.L. and J.M. Fritsch, 1995: Improved model output statistics forecasts through model consensus. Bull. Amer. Meteor. Soc., 76, No. 7, 1157-1164.
1. The term "OTH" in the table includes instances when either 1) the observed minus NGM and observed minus CON temperature differences were the same, or 2) the observed minus AVN and observed minus CON temperature differences were the same. In most instances, neither the NGM/CON nor the AVN/CON clearly dominated the "OTH" category.