ADDS - Aviation Digital Data Service

Methodology for Computing GTG Performance Statistics
The GTG is the operational version of the Integrated Turbulence Forecasting Algorithm (ITFA), which has been verified over several years. Long-term statistics on ITFA's performance are available on the Real-Time Verification System (RTVS). The statistics presented on the GTG performance page were obtained from the RTVS.
   ITFA diagnoses are verified using Yes and No pilot reports (PIREPs) of turbulence conditions. Only forecasts and PIREPs located at altitudes of 20,000 ft and above are considered in the performance statistics, since the current version of GTG is only intended to forecast Clear-Air Turbulence (CAT) at these altitudes. These reports and the GTG forecasts are used to compute three basic statistics: PODy (probability of detection of Yes PIREPs), PODn (probability of detection of No PIREPs), and the % Volume covered by a Yes forecast. Only PIREPs indicating moderate or greater turbulence severity are used to compute PODy and only PIREPs that explicitly state "turbulence negative" or smooth are used to compute PODn.
   PODy can be interpreted as the proportion of Yes reports that are correctly classified as having turbulence conditions. PODn is the proportion of No reports that are correctly classified as not having turbulence conditions. Thus, 1-PODn can be interpreted as the proportion of negative reports that are incorrectly classified. The % Volume is the percentage of the airspace at 20,000 ft and above covered with a Yes turbulence forecast.
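In code, the three statistics reduce to simple ratios of counts. The sketch below uses made-up counts purely for illustration; the operational values come from matching PIREPs at or above 20,000 ft against the GTG forecast grid:

```python
# Hypothetical counts, for illustration only.
yes_pireps_in_yes_forecast = 80   # moderate-or-greater PIREPs hit by a Yes forecast
yes_pireps_total = 100            # all moderate-or-greater PIREPs
no_pireps_in_no_forecast = 140    # smooth/negative PIREPs hit by a No forecast
no_pireps_total = 200             # all smooth/negative PIREPs
yes_grid_points = 3_000           # grid points carrying a Yes forecast
total_grid_points = 20_000        # all grid points at 20,000 ft and above

pod_y = yes_pireps_in_yes_forecast / yes_pireps_total   # 0.80
pod_n = no_pireps_in_no_forecast / no_pireps_total      # 0.70
false_detection = 1 - pod_n       # proportion of No reports misclassified
pct_volume = 100 * yes_grid_points / total_grid_points  # 15.0

print(pod_y, pod_n, round(false_detection, 2), pct_volume)
```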
   To create the discrimination and airspace coverage plots, GTG is converted from a turbulence severity indicator to a Yes/No forecast using a variety of threshold values (the threshold value for each point is shown on the plots). Grid points with GTG values greater than the threshold are classified as "Yes" forecasts; smaller GTG values are classified as "No" forecasts. The pairs of statistics are then computed for each threshold and plotted to create the diagrams. The discrimination plot shows PODy versus 1-PODn, while the airspace coverage plot shows PODy versus % Volume.
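A minimal sketch of this thresholding step, with invented GTG values and PIREP observations (1 for a moderate-or-greater report, 0 for smooth/negative):

```python
def roc_points(gtg_values, observations, thresholds):
    """For each threshold, compute one (1-PODn, PODy) pair.

    gtg_values: GTG severity values at the PIREP locations (hypothetical).
    observations: 1 for a moderate-or-greater PIREP, 0 for smooth/negative.
    """
    points = []
    for t in thresholds:
        # Yes forecast wherever GTG exceeds the threshold, No otherwise.
        forecasts = [v > t for v in gtg_values]
        yes = [f for f, o in zip(forecasts, observations) if o == 1]
        no = [f for f, o in zip(forecasts, observations) if o == 0]
        pod_y = sum(yes) / len(yes)          # Yes reports correctly flagged
        pod_n = sum(not f for f in no) / len(no)  # No reports correctly cleared
        points.append((1 - pod_n, pod_y))
    return points
```

Lowering the threshold moves a point toward the upper right of the diagram (more Yes forecasts catch more Yes reports but also misclassify more No reports); raising it moves the point toward the lower left.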
   The discrimination diagram is essentially a "relative operating characteristic" (ROC) plot, based on an area of research called signal detection theory (SDT; Mason 1982). This plot measures the ability of a forecasting system to discriminate between Yes and No observations, i.e., the trade-off between correctly classifying Yes observations and incorrectly classifying No observations. For skillful forecasts, the ROC curve lies above the 45-degree line (curves for better forecasts lie farther toward the upper left corner of the diagram). In fact, the area under the ROC curve is a measure of skill, called the skill index (SI) on the GTG performance plots. The index ranges from 0 to 100; SI values greater than 50 indicate that the forecasts have some skill, and larger values indicate greater skill.
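One common way to estimate such an area-under-the-curve index is the trapezoidal rule over the ROC points. The function below is a sketch under that assumption, not the RTVS implementation:

```python
def skill_index(points):
    """Area under a ROC curve, scaled to 0-100.

    points: (x, y) pairs with x = 1-PODn and y = PODy.  The (0, 0) and
    (1, 1) endpoints are added automatically; a no-skill forecast lying
    on the 45-degree diagonal scores 50.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    # Trapezoidal rule over consecutive points along the 1-PODn axis.
    area = sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))
    return 100 * area
```

For example, a single point on the diagonal, such as (0.5, 0.5), yields an SI of 50, while a point well into the upper left, such as (0.2, 0.8), yields 80.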
   The airspace coverage plot measures the trade-off between correctly classifying Yes observations and covering a large amount of airspace with a Yes forecast. Unfortunately, due to the nature of PIREPs, it is inappropriate to compute standard measures of over-warning such as the False Alarm Ratio (FAR; Brown and Young 2000), so the airspace coverage plot provides an alternative measure of over-warning. Better forecasts are indicated by curves that are closer to the upper left corner. Together, the discrimination plot and the airspace coverage plot provide a relatively complete picture of ITFA performance.
   For more information about GTG performance, see Brown et al. (2002), the quality assessment report for GTG. Additional information about the verification approach is given in Brown et al. (1997), and further results on GTG performance are presented in Brown et al. (2000).

Brown, B.G., G. Thompson, R.T. Bruintjes, R. Bullock, and T. Kane, 1997: Intercomparison of in-flight icing algorithms. Part II: Statistical verification results. Wea. Forecasting, 12, 890-914.
Brown, B.G., and G.S. Young, 2000: Verification of icing and turbulence forecasts: Why some verification statistics can't be computed using PIREPs. Preprints, 9th Conference on Aviation, Range, and Aerospace Meteorology, Orlando, FL, 11-15 September, American Meteorological Society (Boston), 393-398.
Brown, B.G., J.L. Mahoney, J. Henderson, T.L. Kane, R. Bullock, and J.E. Hart, 2000: The turbulence algorithm intercomparison exercise: Statistical verification results. Preprints, 9th Conference on Aviation, Range, and Aerospace Meteorology, Orlando, FL, 11-15 September, American Meteorological Society (Boston), 466-471.
Brown, B.G., J.L. Mahoney, R. Bullock, M.B. Chapman, C. Fischer, T.L. Fowler, J.E. Hart, and J.K. Henderson, 2002: Integrated Turbulence Forecasting Algorithm (ITFA): Quality Assessment Report. Report to the FAA Aviation Weather Research Program.
Mason, I., 1982: A model for assessment of weather forecasts. Australian Meteorological Magazine, 30, 291-303.