TY - JOUR
T1 - A machine learning approach to estimating the error in satellite sea surface temperature retrievals
AU - Kumar, Chirag
AU - Podestá, Guillermo
AU - Kilpatrick, Katherine
AU - Minnett, Peter
N1 - Funding Information:
The authors acknowledge grant 80NSSC18K0534 from the U.S. National Aeronautics and Space Administration (NASA).
Publisher Copyright:
© 2020 Elsevier Inc.
Copyright:
Copyright 2021 Elsevier B.V., All rights reserved.
PY - 2021/3/15
Y1 - 2021/3/15
N2 - Global, repeated, and accurate measurements of Sea Surface Temperature (SST) are critical for weather and climate projections. While thermometers on buoys measure SST relatively accurately, only sensors aboard satellites give global and repeated SST measurements necessary for many applications, including climate modeling. For satellite-based thermal infrared sensors, an atmospheric correction converts calibrated brightness temperatures measured at orbital height into an SST estimate, but imperfect assumptions in the correction algorithm coupled with variability in atmospheric conditions and viewing geometries can lead to a wide range of errors and uncertainties in the satellite-derived SST retrievals. Estimates of the resulting errors are imperative for satellite-derived SST assimilation into climate models. This paper evaluates the use of machine learning Decision Tree algorithms to predict the central tendency and dispersion of errors in satellite-derived SST retrievals. First, using records from the NASA R2014.1 MODIS Aqua SST Matchup Database, which includes matched-up satellite and in situ SST measurements, a set of seven variables was derived that addresses the assumptions and known issues in the satellite SST retrieval process. Then, both Random Forest and Cubist Decision Trees were used to predict the SST residual (satellite SST minus skin-corrected buoy SST) for each matchup. While both Decision Tree methods performed similarly well, the Cubist model is more easily interpreted and yields better predictions of errors for relatively infrequent conditions. Various characteristics of the groups of matchups identified by Cubist were explored, and uncertainty values for the error estimates were derived for each group. 
Overall, the Cubist model predicted the skin SST residual with a root-mean-squared-error of 0.380 °C across nighttime cloud-filtered domains, demonstrating that a Cubist model is viable for quantitatively and accurately predicting Single Sensor Error Statistics per pixel as required by the Group for High Resolution SST. In this paper, we present the training and testing of both Decision Tree models. Because of its interpretability, we explore in detail the characteristics of the Cubist-derived groups to gain new geophysical insight into the satellite-derived SST retrieval error across different measurement conditions.
KW - Cubist
KW - Decision trees
KW - MODIS
KW - Machine learning
KW - Random forests
KW - Sea surface temperature
KW - Single sensor error statistics
UR - http://www.scopus.com/inward/record.url?scp=85099484913&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85099484913&partnerID=8YFLogxK
U2 - 10.1016/j.rse.2020.112227
DO - 10.1016/j.rse.2020.112227
M3 - Article
AN - SCOPUS:85099484913
VL - 255
JO - Remote Sensing of Environment
JF - Remote Sensing of Environment
SN - 0034-4257
M1 - 112227
ER -