Isoline retrieval explained

Isoline retrieval is a remote sensing inverse method that retrieves one or more isolines of a trace atmospheric constituent or variable. When used to validate another contour, it is the most accurate method possible for the task. When used to retrieve a whole field, it is a general, nonlinear inverse method and a robust estimator.

For validating advected contours

Rationale

Suppose we have, as in contour advection, inferred knowledge of a single contour or isoline of an atmospheric constituent, $q$, and we wish to validate this against satellite remote-sensing data. Since satellite instruments cannot measure the constituent directly, we need to perform some sort of inversion. In order to validate the contour, it is not necessary to know the exact value of the constituent at any given point. We only need to know whether it falls inside or outside the contour, that is, whether it is greater than or less than the value of the contour, $q_0$.

This is a classification problem. Let:

$$j=\begin{cases}1; & q<q_0\\ 2; & q\geq q_0\end{cases}$$

be the discretized variable. This will be related to the satellite measurement vector, $\vec{y}$, by some conditional probability, $P(\vec{y}|j)$, which we approximate by collecting samples, called training data, of both the measurement vector and the state variable, $q$. By generating classification results over the region of interest and using any contouring algorithm to separate the two classes, the isoline will have been "retrieved."

The accuracy of a retrieval is given by integrating the conditional probability over the area of interest, $A$:

$$a=\frac{1}{A}\int_A P\left[c(\vec{r})|\vec{y}(\vec{r})\right] d\vec{r}$$

where $c$ is the retrieved class at position $\vec{r}$. We can maximize this quantity by maximizing the value of the integrand at each point:

$$\max(a)=\frac{1}{A}\int_A \left\lbrace\max_j P\left[j|\vec{y}(\vec{r})\right]\right\rbrace d\vec{r}$$

Since choosing the class with the largest conditional probability at each point is the definition of maximum likelihood classification, a classification algorithm based on maximum likelihood is the most accurate method possible of validating an advected contour. A good method for performing maximum likelihood classification from a set of training data is variable kernel density estimation.
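The classification step can be sketched in a few lines. This is a minimal illustration, not the method used in any published retrieval: it substitutes a fixed-width Gaussian kernel for variable kernel density estimation, the toy measurements are invented, and the names `class_posteriors` and `bandwidth` are hypothetical. The contouring step that turns the gridded classes into an isoline is omitted.

```python
import numpy as np

def class_posteriors(y_train, j_train, y_new, bandwidth=0.3):
    """Estimate P(j | y) for classes 1 and 2 from labelled training samples.

    Uses a fixed-width Gaussian kernel (an assumption; the text recommends
    variable kernel density estimation, which adapts the width per point).
    """
    y_train = np.asarray(y_train, dtype=float)
    j_train = np.asarray(j_train)
    y_new = np.asarray(y_new, dtype=float)
    # Squared distance between every new measurement and every training sample.
    d2 = ((y_new[:, None, :] - y_train[None, :, :]) ** 2).sum(axis=2)
    weights = np.exp(-0.5 * d2 / bandwidth**2)
    # Kernel sums per class are proportional to P(y | j) P(j).
    joint = np.stack(
        [weights[:, j_train == j].sum(axis=1) for j in (1, 2)], axis=1
    )
    joint += 1e-300  # far from all samples, fall back to equal probabilities
    return joint / joint.sum(axis=1, keepdims=True)

# Toy training data: two invented "channels" that respond to q, plus noise.
rng = np.random.default_rng(0)
q0 = 5.0
q = rng.uniform(0.0, 10.0, size=2000)
y = np.column_stack([q, 0.5 * q]) + rng.normal(scale=0.5, size=(2000, 2))
j = np.where(q < q0, 1, 2)

y_new = np.array([[2.0, 1.0], [8.0, 4.0]])
posteriors = class_posteriors(y, j, y_new)
retrieved_class = posteriors.argmax(axis=1) + 1  # class with the largest P(j | y)
print(retrieved_class)
```

Returning the full probabilities rather than only the winning class keeps the information needed later for error characterization.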

Training data

There are two methods of generating the training data. The most obvious is empirical: simply matching measurements of the variable, $q$, with collocated measurements from the satellite instrument. In this case, no knowledge of the actual physics that produce the measurement is required and the retrieval algorithm is purely statistical. The second is with a forward model:

$$\vec{y}=\vec{f}(\vec{x})$$

where $\vec{x}$ is the state vector and $q=x_k$ is a single component. An advantage of this method is that state vectors need not reflect actual atmospheric configurations; they need only take on a state that could reasonably occur in the real atmosphere. It also avoids the errors inherent in most collocation procedures, e.g. offset errors in the locations of the paired samples and differences in the footprint sizes of the two instruments. Since retrievals will be biased towards more common states, however, the statistics ought to reflect those in the real world.
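The forward-model route can be illustrated as follows. The sketch is only a stand-in: `forward_model` is a made-up two-channel function, not a radiative transfer code, and the Gaussian state statistics and the added measurement noise are assumptions introduced for the example.

```python
import numpy as np

def forward_model(x):
    """Hypothetical stand-in for a radiative transfer model, y = f(x)."""
    weights = np.array([[1.0, 0.2, 0.0],
                        [0.3, 1.0, 0.5]])
    return np.tanh(x @ weights.T)  # mildly nonlinear, two channels

def make_training_data(n, k, q0, noise=0.05, seed=0):
    """Sample state vectors, simulate measurements and label the classes."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, 3))            # assumed state statistics
    y = forward_model(x)
    y = y + rng.normal(scale=noise, size=y.shape)  # assumed instrument noise
    q = x[:, k]                            # q is component k of the state
    j = np.where(q < q0, 1, 2)             # class 1: q < q0, class 2: q >= q0
    return y, j

y_train, j_train = make_training_data(n=1000, k=1, q0=0.0)
print(y_train.shape, np.bincount(j_train)[1:])
```

Because the class labels come straight from the sampled states, no collocation is needed; the price is that the sampled states must be statistically representative.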

Error characterization

The conditional probabilities, $P(j|\vec{y})$, provide excellent error characterization, so the classification algorithm ought to return them. We define the confidence rating by rescaling the conditional probability of the retrieved class:

$$C=\frac{n_c P(c|\vec{y})-1}{n_c-1}$$

where $n_c$ is the number of classes (in this case, two). If $C$ is zero, then the classification is no better than chance, while if it is one, then it should be perfect. To transform the confidence rating to a statistical tolerance, the following line integral can be applied to an isoline retrieval for which the true isoline is known:
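As a small worked example, the rescaling can be written as a one-line function. The name `confidence_rating` is hypothetical; the formula is the one given above, applied to the conditional probability of the winning class.

```python
def confidence_rating(p_winner, n_classes=2):
    """Rescale the winning class probability P(c | y) to a rating C.

    C is 0 when P(c | y) equals the chance level 1 / n_classes
    and 1 when P(c | y) equals 1.
    """
    return (n_classes * p_winner - 1.0) / (n_classes - 1.0)

for p in (0.5, 0.75, 1.0):
    print(p, confidence_rating(p))
```

For two classes the rating reduces to $2P(c|\vec{y})-1$, which is the absolute difference between the two conditional probabilities.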

$$\delta(C)=\frac{1}{l}\int_0^l h\left(C-C^\prime(\vec{r})\right) ds$$

where $s$ is the path, $l$ is the length of the isoline, $C^\prime$ is the retrieved confidence as a function of position and $h$ is taken here to be the Heaviside step function. While it appears that the integral must be evaluated separately for each value of the confidence rating, $C$, in fact it may be done for all values of $C$ at once by sorting the confidence ratings of the results, $C^\prime$. The function $\delta$ gives the tolerance that applies at a given threshold value of the confidence rating. That is, the threshold defines a region that contains a fraction of the true isoline equal to the tolerance.
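The sorting shortcut can be sketched as below, assuming that $h$ is the Heaviside step function (counting points where $C^\prime \le C$) and that the retrieved confidence has been sampled at points along the true isoline. The function name `tolerance_curve` and the sample values are invented for illustration.

```python
import numpy as np

def tolerance_curve(c_retrieved, ds=None):
    """Evaluate delta(C) for every sampled confidence value at once.

    c_retrieved: retrieved confidence C' sampled along the true isoline.
    ds: arc length represented by each sample (equal lengths if omitted).
    Returns the sorted thresholds C and the fraction of the isoline whose
    retrieved confidence is at or below each threshold.
    """
    c = np.asarray(c_retrieved, dtype=float)
    ds = np.ones_like(c) if ds is None else np.asarray(ds, dtype=float)
    order = np.argsort(c)
    thresholds = c[order]
    delta = np.cumsum(ds[order]) / ds.sum()
    return thresholds, delta

# Invented confidence values at five equally spaced points on the isoline.
thresholds, delta = tolerance_curve([0.1, 0.9, 0.4, 0.7, 0.2])
# Smallest sampled threshold whose region holds at least 80% of the isoline.
c_80 = thresholds[np.searchsorted(delta, 0.8)]
print(thresholds, delta, c_80)
```

A single sort replaces one integral per threshold, and the resulting curve can then be read off in either direction: tolerance from threshold, or threshold from a desired tolerance.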

Example: water vapour from AMSU

The Advanced Microwave Sounding Unit (AMSU) series of satellite instruments is designed to detect temperature and water vapour. The instruments have a high horizontal resolution (as little as 15 km) and, because they are mounted on more than one satellite, full global coverage can be obtained in less than one day. Training data was generated using the second method, from European Centre for Medium-Range Weather Forecasts (ECMWF) ERA-40 data fed to a fast radiative transfer model called RTTOV. The function $\delta(C)$ has been generated from simulated retrievals and is shown in the figure to the right. This is then used to set the 90 percent tolerance in the figure below by shading all the confidence ratings less than 0.8. Thus we expect the true isoline to fall within the shading 90 percent of the time.

For continuum retrievals

Isoline retrieval is also useful for retrieving a continuum variable and constitutes a general, nonlinear inverse method. Its advantage over both neural networks and iterative methods, such as optimal estimation, that invert the forward model directly is that there is no possibility of getting stuck in a local minimum.

There are a number of methods of reconstituting the continuum variable from the discretized one. Once a sufficient number of contours have been retrieved, it is straightforward to interpolate between them. Conditional probabilities also make a good proxy for the continuum value.

Consider the transformation from a continuum to a discrete variable:

$$P(1|\vec{y})=\int_{-\infty}^{q_0} P(q|\vec{y})\,dq$$

$$P(2|\vec{y})=\int_{q_0}^{\infty} P(q|\vec{y})\,dq$$

Suppose that $P(q|\vec{y})$ is given by a Gaussian:

$$P(q|\vec{y})=\frac{1}{\sigma_q\sqrt{2\pi}}\exp\left\lbrace-\frac{\left[q-\bar{q}(\vec{y})\right]^2}{2\sigma_q^2}\right\rbrace$$

where $\bar{q}$ is the expectation value and $\sigma_q$ is the standard deviation. Then the difference in the conditional probabilities is related to the expectation value of the continuum variable, $\bar{q}$, by the error function:

$$R=P(2|\vec{y})-P(1|\vec{y})=\mathrm{erf}\left[\frac{\bar{q}(\vec{y})-q_0}{\sqrt{2}\sigma_q}\right]$$

The figure shows conditional probability versus specific humidity for the example retrieval discussed above.
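The Gaussian case can be checked numerically. With class 2 defined as $q \geq q_0$, the difference in conditional probabilities is $R=\mathrm{erf}\left[(\bar{q}-q_0)/(\sqrt{2}\sigma_q)\right]$; the sketch below compares this with a direct sum over the Gaussian and then inverts it to recover $\bar{q}$. The function names are hypothetical, the numbers are arbitrary, and the inversion assumes that $\sigma_q$ is known.

```python
from math import erf, pi, sqrt
from statistics import NormalDist

import numpy as np

def class_difference(q_mean, sigma, q0):
    """R = P(2|y) - P(1|y) for a Gaussian P(q|y), where class 2 means q >= q0."""
    return erf((q_mean - q0) / (sqrt(2.0) * sigma))

def class_difference_numeric(q_mean, sigma, q0, n=200001):
    """The same difference from a direct Riemann sum over the Gaussian."""
    q = np.linspace(q_mean - 10.0 * sigma, q_mean + 10.0 * sigma, n)
    pdf = np.exp(-0.5 * ((q - q_mean) / sigma) ** 2) / (sqrt(2.0 * pi) * sigma)
    sign = np.where(q >= q0, 1.0, -1.0)  # +1 inside class 2, -1 inside class 1
    return float(np.sum(sign * pdf) * (q[1] - q[0]))

def mean_from_difference(r, sigma, q0):
    """Invert the relation: recover the expectation value from R."""
    return q0 + sigma * NormalDist().inv_cdf(0.5 * (1.0 + r))

r = class_difference(q_mean=6.0, sigma=2.0, q0=5.0)
r_check = class_difference_numeric(q_mean=6.0, sigma=2.0, q0=5.0)
q_back = mean_from_difference(r, sigma=2.0, q0=5.0)
print(r, r_check, q_back)
```

The inversion uses the standard normal quantile function in place of an inverse error function, since $(1+R)/2$ is the probability that $q$ lies at or above $q_0$.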

As a robust estimator

The location of $q_0$ is found by setting the conditional probabilities of the two classes to be equal:

$$\int_{-\infty}^{q_0} P(q|\vec{y})\,dq=\int_{q_0}^{\infty} P(q|\vec{y})\,dq$$

In other words, equal amounts of the "zeroth-order moment" lie on either side of $q_0$, which makes $q_0$ the median of $P(q|\vec{y})$. This type of formulation is characteristic of a robust estimator.
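A toy comparison shows why an equal-mass condition is robust. The point that splits a set of samples into two halves is their median, and a single wild value barely moves it, whereas the mean (a first-order moment) is dragged along. The sample values below are invented and `equal_mass_point` is a hypothetical helper.

```python
import numpy as np

def equal_mass_point(q_samples):
    """Value with half of the samples on either side, i.e. the median."""
    return float(np.median(q_samples))

clean = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
spoiled = np.array([1.0, 2.0, 3.0, 4.0, 500.0])  # one gross outlier

print(equal_mass_point(clean), clean.mean())
print(equal_mass_point(spoiled), spoiled.mean())
```

Only the side of $q_0$ on which each sample falls matters, not how far away it lies, which is the property the equal-probability condition relies on.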
