Independent verification data allow accuracy of the salinity estimates to be gauged.
Just because a model can be fitted closely to training data does not necessarily mean that it will provide accurate estimates of salinity to compliment XBT measurements of temperature. Increasing the number of regressors will always improve the fit but can lead to worse predictions, so it important to have independent data to verify the model's skill. Unfortunately, the available CTD and ARGO data are limited, and in some regions sampling is so sparse that all data are need for fitting. In other regions, there are sufficient data for both training and verification.
Verification data were chosen by examining how many profiles were in each of the 16 20-degrees-longitude by 5-degrees-latitude subregions of the South Atlantic between 25S and 45S. As 5 sub-regions had 51 or fewer profiles while the remaining 11 sub-had 58 or more, verification data were set aside only for those 11 sub-regions. They were chosen randomly, leaving a minimum of 51 profiles for fitting while setting aside no more than 100 profiles in any sub-region.
An ensemble of regression models encompassing all combinations of temperature, temperature squared, longitude, and latitude were fitted to the training data at 25-dbar-spaced levels for all 20x5 shingles with centers on a 4-degrees-longitude by 1-degrees-latitude grid covering the South Atlantic between 25S and 45S. Within each shingle the models were fitted using inverse-square-distance-weighted regression to target its prediction toward the central 4x1 grid cell.
Although the models for each shingle were intended for estimates within the 4x1 central area, there were insufficient verification data within these target areas to gauge their performance. The largest number of verification profiles within any shingle's target area was 31, the others having considerably fewer with most having none. Only estimates for this shingle were verified against the central target data. All others were scored for all verification data encompassed by their fitting shingle, allowing peripheral data to be treated the same as the central data and possibly biasing the scores toward more unfavorable values. Contour plots and histograms of different measures of estimation errors for the verification data give an overview of their performance. The models with all four regressors are among the best for most shingles at most levels.
So far, only the CTD and ARGO data flagged as being the most reliable have been used. By examining the less reliable data it should be possible to obtain more data that might be used to verify the models' preformance.