A machine learning approach to estimating the error in satellite sea surface temperature retrievals

Chirag Kumar, Guillermo Podestá, Katherine Kilpatrick, Peter Minnett

Research output: Contribution to journal › Article › peer-review


Global, repeated, and accurate measurements of Sea Surface Temperature (SST) are critical for weather and climate projections. While thermometers on buoys measure SST relatively accurately, only sensors aboard satellites give global and repeated SST measurements necessary for many applications, including climate modeling. For satellite-based thermal infrared sensors, an atmospheric correction converts calibrated brightness temperatures measured at orbital height into an SST estimate, but imperfect assumptions in the correction algorithm coupled with variability in atmospheric conditions and viewing geometries can lead to a wide range of errors and uncertainties in the satellite-derived SST retrievals. Estimates of the resulting errors are imperative for satellite-derived SST assimilation into climate models. This paper evaluates the use of machine learning Decision Tree algorithms to predict the central tendency and dispersion of errors in satellite-derived SST retrievals. First, using records from the NASA R2014.1 MODIS Aqua SST Matchup Database, which includes matched-up satellite and in situ SST measurements, a set of seven variables was derived that addresses the assumptions and known issues in the satellite SST retrieval process. Then, both Random Forest and Cubist Decision Trees were used to predict the SST residual (satellite SST minus skin-corrected buoy SST) for each matchup. While both Decision Tree methods performed similarly well, the Cubist model is more easily interpreted and yields better predictions of errors for relatively infrequent conditions. Various characteristics of the groups of matchups identified by Cubist were explored, and uncertainty values for the error estimates were derived for each group. 
Overall, the Cubist model predicted the skin SST residual with a root-mean-squared-error of 0.380 °C across nighttime cloud-filtered domains, demonstrating that a Cubist model is viable for quantitatively and accurately predicting Single Sensor Error Statistics per pixel as required by the Group for High Resolution SST. In this paper, we present the training and testing of both Decision Tree models. Because of its interpretability, we explore in detail the characteristics of the Cubist-derived groups to gain new geophysical insight into the satellite-derived SST retrieval error across different measurement conditions.
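The quantities named in the abstract can be illustrated with a minimal sketch: the SST residual (satellite SST minus skin-corrected buoy SST), a central tendency and dispersion of that residual for each group of matchups, and a root-mean-squared error. The matchup values, group names, and function names below are hypothetical and are not taken from the MODIS Aqua Matchup Database. Predicting each residual by its group mean is a deliberately simple stand-in, not the Cubist or Random Forest models evaluated in the paper.

```python
import math
import statistics
from collections import defaultdict

# Hypothetical matchups: (group label, satellite SST, skin-corrected buoy SST) in °C.
# All values and group names are invented for illustration only.
matchups = [
    ("dry_atmosphere", 20.10, 20.30),
    ("dry_atmosphere", 18.40, 18.50),
    ("dry_atmosphere", 22.00, 22.30),
    ("moist_atmosphere", 28.00, 28.60),
    ("moist_atmosphere", 29.10, 29.50),
    ("moist_atmosphere", 27.50, 28.30),
]


def sst_residual(satellite_sst, buoy_sst):
    """Residual as defined in the abstract: satellite SST minus buoy SST."""
    return satellite_sst - buoy_sst


def group_statistics(records):
    """Return {group: (mean residual, sample standard deviation)}."""
    residuals_by_group = defaultdict(list)
    for group, satellite_sst, buoy_sst in records:
        residuals_by_group[group].append(sst_residual(satellite_sst, buoy_sst))
    return {
        group: (statistics.mean(values), statistics.stdev(values))
        for group, values in residuals_by_group.items()
    }


def rmse(predicted, observed):
    """Root-mean-squared error between predicted and observed residuals."""
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


group_stats = group_statistics(matchups)

# Simplified predictor (an assumption): each matchup's residual is
# predicted by the mean residual of the group it belongs to.
observed = [sst_residual(sat, buoy) for _, sat, buoy in matchups]
predicted = [group_stats[group][0] for group, _, _ in matchups]
prediction_rmse = rmse(predicted, observed)

for group, (bias, spread) in group_stats.items():
    print(f"{group}: mean residual {bias:.3f} °C, standard deviation {spread:.3f} °C")
print(f"RMSE of group-mean predictor: {prediction_rmse:.3f} °C")
```

In this toy setup the per-group mean plays the role of the predicted error and the per-group standard deviation plays the role of its uncertainty. A rule-based model such as Cubist would instead learn the groups from the predictor variables and fit a model within each one.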

Original language: English (US)
Article number: 112227
Journal: Remote Sensing of Environment
State: Published - Mar 15 2021


Keywords

  • Cubist
  • Decision trees
  • Machine learning
  • Random forests
  • Sea surface temperature
  • Single sensor error statistics

ASJC Scopus subject areas

  • Soil Science
  • Geology
  • Computers in Earth Sciences

