Reading Notes | Optimal Estimation of Dynamic Systems (Crassidis, 2011)

John L. Crassidis and John L. Junkins, *Optimal Estimation of Dynamic Systems*, 2nd ed., CRC Press, 2011.

Corrections to the book can be found at: http://ancs.eng.buffalo.edu/index.php/Optimal_Estimation_of_Dynamic_Systems_2nd_Edition

2.5. Maximum Likelihood Estimation

2.6. Properties of Maximum Likelihood Estimation

Invariance Principle

Consistent Estimator

Asymptotically Gaussian Property

Asymptotically Efficient Property

2.7. Bayesian Estimation

2.7.1. MAP Estimation

Maximum a posteriori estimation (MAP)

  • if the a priori distribution $p(\hat{\bm{x}})$ is uniform, then MAP estimation is equivalent to maximum likelihood (ML) estimation (see the relation written out after this list)
  • asymptotic consistency and efficiency
  • MAP estimator converges to the ML estimator for large samples
  • obeys the invariance principle
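
Written out (standard relation, in the notes' notation):

$$\hat{\bm{x}}_{\rm MAP} = \arg\max_{\hat{\bm{x}}}\; p(\hat{\bm{x}}|\tilde{\bm{y}}) = \arg\max_{\hat{\bm{x}}}\; p(\tilde{\bm{y}}|\hat{\bm{x}})\, p(\hat{\bm{x}})$$

so a uniform (constant) prior $p(\hat{\bm{x}})$ drops out of the maximization and the MAP estimate coincides with the ML estimate.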

2.7.2. Minimum Risk Estimation

assume $\bm{x}$ is distributed according to the a posteriori density $p(\bm{x}|\tilde{\bm{y}})$

assign a cost to any loss suffered due to errors in the estimate

The risk function is

$$J_{\rm MR}(\bm{x}^*) = \int_{-\infty}^{\infty} c(\bm{x}^*|\bm{x})\, p(\bm{x}|\tilde{\bm{y}})\, d\bm{x} \tag{2.171}$$
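
Standard special case (my note, not copied from the book): with quadratic cost $c(\bm{x}^*|\bm{x}) = (\bm{x}^*-\bm{x})^T S\,(\bm{x}^*-\bm{x})$ for any positive definite $S$, setting $\partial J_{\rm MR}/\partial\bm{x}^* = 0$ gives the conditional mean, i.e. the minimum-risk estimate is the minimum mean-square error estimate:

$$\bm{x}^* = \int_{-\infty}^{\infty} \bm{x}\, p(\bm{x}|\tilde{\bm{y}})\, d\bm{x} = E\{\bm{x}|\tilde{\bm{y}}\}$$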

[WHAT’S THE RELATION TO GP ?? LOOKS LIKE A PREDICTIVE DISTRIBUTION.]

For these reasons (see p97), minimum risk approaches are often avoided in practical estimation problems, although the relationship between decision theory and optimal estimation is very interesting.

2.8. Advanced Topics (not read yet)

2.8.1. Nonuniqueness of the Weight Matrix

2.8.2. Analysis of Covariance Errors

2.8.3. Ridge Estimation

2.8.4. Total Least Squares

3. Sequential State Estimation

3.2. Full-Order Estimators

Ackermann’s formula

3.2.1. Discrete-Time Estimators

Discrete state-space representation

$$\begin{aligned} \bm{x}_{k+1} &= \Phi \bm{x}_k + \Gamma \bm{u}_k\\ \bm{y}_k &= H \bm{x}_k + D \bm{u}_k \end{aligned} \tag{A.122}$$

$$\begin{aligned} \hat{\bm{x}}_{k+1}^- &= \Phi \hat{\bm{x}}_k^+ + \Gamma\bm{u}_k\\ \hat{\bm{x}}_k^+ &= \hat{\bm{x}}_k^- + K [\tilde{\bm{y}}_k - H\hat{\bm{x}}_k^-] \end{aligned} \tag{3.20}$$

3.3. The Discrete-Time Kalman Filter

The Kalman filter provides a rigorous theoretical approach to “place” the poles of the estimator, based upon stochastic processes for the measurement error and model error.

  • both discrete-time dynamic models and measurements
  • both continuous-time dynamic models and measurements
  • continuous-time dynamic models with discrete-time measurements

3.3.1. Kalman Filter Derivation

[INSERT TABLE 3.1 HERE]
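
A minimal numerical sketch of the propagation/update cycle that Table 3.1 summarizes (my own code, not the book's; assumes a linear model with constant $\Phi$, $\Gamma$, $H$, discrete process-noise covariance $Q$, and measurement-noise covariance $R$):

```python
import numpy as np

def kf_propagate(x, P, Phi, Gam, u, Q):
    """Propagate state estimate and covariance one step ahead (a priori)."""
    x_minus = Phi @ x + Gam @ u
    P_minus = Phi @ P @ Phi.T + Q          # Q: discrete process-noise covariance
    return x_minus, P_minus

def kf_update(x_minus, P_minus, y, H, R):
    """Measurement update (a posteriori) with Kalman gain K."""
    S = H @ P_minus @ H.T + R              # innovation covariance
    K = P_minus @ H.T @ np.linalg.inv(S)   # Kalman gain
    x_plus = x_minus + K @ (y - H @ x_minus)
    P_plus = (np.eye(len(x_minus)) - K @ H) @ P_minus
    return x_plus, P_plus
```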

3.3.2 Stability and Joseph’s Form

Another form for $P_k^+$ that guarantees the updated covariance remains positive definite (Joseph's form):

Using

$$P_k^+ = [I - K_k H_k]\, P_k^-\, [I - K_k H_k]^T + K_k R_k K_k^T \tag{3.39}$$

instead of

$$P_k^+ = [I - K_k H_k]\, P_k^- \tag{3.44}$$

Proof: see p149–151.
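
A small numerical illustration (my own, with made-up matrices) of why (3.39) is preferred: Joseph's form is quadratic in $[I-K_kH_k]$, so it stays symmetric and positive semidefinite even for a perturbed or suboptimal gain, whereas (3.44) is only valid for the optimal gain and can lose symmetry under rounding:

```python
import numpy as np

def update_standard(P_minus, K, H):
    return (np.eye(P_minus.shape[0]) - K @ H) @ P_minus            # eq. (3.44)

def update_joseph(P_minus, K, H, R):
    A = np.eye(P_minus.shape[0]) - K @ H
    return A @ P_minus @ A.T + K @ R @ K.T                         # eq. (3.39)

P_minus = np.diag([1.0, 4.0])
H = np.array([[1.0, 0.0]])
R = np.array([[0.5]])
K_opt = P_minus @ H.T @ np.linalg.inv(H @ P_minus @ H.T + R)
K_bad = K_opt + 0.05                   # deliberately perturbed (suboptimal) gain

# Standard form loses symmetry with the perturbed gain;
# Joseph's form remains symmetric positive semidefinite.
print(update_standard(P_minus, K_bad, H))
print(update_joseph(P_minus, K_bad, H, R))
```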

3.3.3. Information Filter and Sequential Processing

handle large measurement vectors

The information filter may be more computationally efficient than the traditional Kalman filter when the size of the measurement vector is much larger than the size of the state vector.

Another, more commonly used approach is sequential processing: process one measurement at a time, updating the gain and covariance after each, until all measurements at the current sampling instant have been processed (sketched below).
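
A sketch of sequential processing (my own illustration), assuming the measurement errors at a given time are uncorrelated so $R_k$ is diagonal and each scalar measurement can be processed in turn without any $m\times m$ inversion:

```python
import numpy as np

def sequential_update(x_minus, P_minus, y, H, R_diag):
    """Process the m scalar measurements one at a time."""
    x, P = x_minus.copy(), P_minus.copy()
    for i in range(len(y)):
        h = H[i:i+1, :]                            # 1 x n row of the measurement matrix
        s = (h @ P @ h.T).item() + R_diag[i]       # scalar innovation variance
        k = (P @ h.T) / s                          # n x 1 gain for this measurement
        x = x + (k * (y[i] - (h @ x).item())).ravel()
        P = (np.eye(len(x)) - k @ h) @ P
    return x, P
```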

3.3.4. Steady-State Kalman Filter

For time-invariant systems the error covariance $P$ reaches a steady-state value very quickly.
Therefore, a constant gain $K$ can be pre-computed using the steady-state covariance.
The solution is suboptimal in the strictest sense but can save a lot of computation.
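
A hedged sketch of pre-computing the constant gain by solving the discrete algebraic Riccati equation with SciPy (assumes constant $\Phi$, $H$, discrete process-noise covariance $Q_d$, and $R$):

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def steady_state_gain(Phi, H, Qd, R):
    """Steady-state a priori covariance and constant Kalman gain."""
    P_ss = solve_discrete_are(Phi.T, H.T, Qd, R)   # dual DARE gives steady-state P^-
    K_ss = P_ss @ H.T @ np.linalg.inv(H @ P_ss @ H.T + R)
    return P_ss, K_ss
```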

3.3.5. Relationship to Least Squares Estimation

The Kalman filter can be derived using a least squares type loss function

$$J = \frac{1}{2}(\hat{\bm{x}}_0-\bm{x}_0)^T P_0^{-1}(\hat{\bm{x}}_0-\bm{x}_0) + \frac{1}{2} \sum_{i=1}^k (\tilde{\bm{y}}_i-H_i\hat{\bm{x}}_i)^T R_i^{-1} (\tilde{\bm{y}}_i-H_i\hat{\bm{x}}_i) \tag{3.109}$$

subject to the constraint

$$\hat{\bm{x}}_{i+1} = \Phi(i+1,i)\,\hat{\bm{x}}_i,\quad i=1,2,\dots,k-1 \tag{3.110}$$

3.3.6. Correlated Measurement and Process Noise

3.3.7. Cramer-Rao Lower Bound

see Wikipedia:
In its simplest form, the bound states that the variance of any unbiased estimator is at least as high as the inverse of the Fisher information.
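
In symbols, for a scalar parameter $x$ and unbiased estimator $\hat{x}$ (standard form):

$$\operatorname{var}(\hat{x}) \ge \frac{1}{I(x)},\qquad I(x) = E\left[\left(\frac{\partial \ln p(\tilde{\bm{y}};x)}{\partial x}\right)^2\right]$$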

3.3.8. Orthogonality Principle

the orthogonality of the estimate and its error

$$E\{\hat{\bm{x}}_k^+\,\tilde{\bm{x}}_k^{+T}\} = 0 \tag{3.152}$$

where

$$\tilde{\bm{x}}_k^+ \equiv \hat{\bm{x}}_k^+ - \bm{x}_k \tag{3.32}$$

This orthogonality principle is extremely important in the derivation of the linear quadratic-Gaussian controller of Sec. 8.6.

3.4. The Continuous-Time Kalman Filter

3.5. The Continuous-Discrete Kalman Filter

$$\begin{aligned} \dot{\bm{x}}(t) &= F(t)\bm{x}(t) + B(t)\bm{u}(t) + G(t)\bm{w}(t)\\ \tilde{\bm{y}}_k &= H_k\bm{x}_k + \bm{v}_k \end{aligned}$$

[INSERT TABLE 3.7]

The continuous-time propagation model equation does not involve the measurement directly. The covariance propagation follows a continuous-time Lyapunov differential equation.
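
Written out (standard continuous-discrete form; this is where the $G(t)Q(t)G(t)^T$ term enters during propagation):

$$\dot{P}(t) = F(t)P(t) + P(t)F(t)^T + G(t)Q(t)G(t)^T$$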

The sample times of the measurements need not occur at regular intervals. In fact, different measurement sets can be spread out over various time intervals.

QUESTIONS:

  • During the propagation, how is G(t)Q(t)G(t)TG(t)Q(t)G(t)^T handled?
  • How is it handled in Orekit?

3.6. Extended Kalman Filter

The problem with this nonlinear model is that a Gaussian input does not necessarily produce a Gaussian output (unlike the linear case).

The fundamental concept of this filter involves the notion that the true state is sufficiently close to the estimated state. Therefore, the error dynamics can be represented fairly accurately by a linearized first-order Taylor series expansion.
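
That is, for dynamics $\dot{\bm{x}} = \bm{f}(\bm{x},\bm{u},t)$ and measurements $\tilde{\bm{y}}_k = \bm{h}(\bm{x}_k) + \bm{v}_k$, the Jacobians are re-evaluated along the current estimate (standard EKF linearization):

$$F(t) = \left.\frac{\partial \bm{f}}{\partial \bm{x}}\right|_{\hat{\bm{x}}(t)},\qquad H_k = \left.\frac{\partial \bm{h}}{\partial \bm{x}}\right|_{\hat{\bm{x}}_k^-}$$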

[INSERT TABLE 3.8]

A steady-state gain cannot be found, because $F(t)$ and $H(t)$ in Table 3.8 will not be constant in general.

Another approach involves linearizing about the nominal (a priori) state vector $\bar{\bm{x}}(t)$ instead of the current estimate $\hat{\bm{x}}(t)$.

This gives the linearized Kalman filter. In general, the linearized Kalman filter is less accurate than the extended Kalman filter, since $\bar{\bm{x}}(t)$ is usually not as close to the truth as $\hat{\bm{x}}(t)$. However, since the nominal state is known a priori, the gain $K(t)$ can be pre-computed and stored, which reduces the on-line computational burden.

[INSERT TABLE 3.9]

Proving convergence of the extended Kalman filter is difficult (if not impossible!) even for simple systems when the initial condition is not well known.
In practice, however, it is often robust to initial condition errors, which can be verified through simulation.

iterated extended Kalman filter:
local iteration at step $k$, repeated until the estimate no longer improves (see the iteration written out below).
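
One common form of the local iteration (standard iterated-EKF update written from memory, not copied from the book), with $\bm{x}_0 = \hat{\bm{x}}_k^-$ and $H_i = \partial\bm{h}/\partial\bm{x}|_{\bm{x}_i}$:

$$\begin{aligned} K_i &= P_k^- H_i^T\,[H_i P_k^- H_i^T + R_k]^{-1}\\ \bm{x}_{i+1} &= \hat{\bm{x}}_k^- + K_i\,[\tilde{\bm{y}}_k - \bm{h}(\bm{x}_i) - H_i(\hat{\bm{x}}_k^- - \bm{x}_i)] \end{aligned}$$

The iteration stops when $\bm{x}_{i+1} \approx \bm{x}_i$; the final iterate becomes $\hat{\bm{x}}_k^+$.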

nonlinear smoother back to time $t_{k-1}$ (see Sec. 5.1.3):
The reference trajectory over $[t_{k-1}, t_k)$ can also be improved once the measurement $\tilde{\bm{y}}_k$ is available.

3.7. Unscented Filtering (not needed for now)

3.8. Constrained Filtering (not needed for now)

3.9. Summary

All important formulas are summarized here.

4. Advanced Topics in Sequential State Estimation

5. Batch State Estimation (to read later)

Smoothing ??

5.1.3. (utilized in Sec. 7.4., to read)

6. Parameter Estimation: Applications (to read now)

6.2. Global Positioning System Navigation

7. Estimation of Dynamic Systems: Applications

7.3. Orbit Estimation

6.4. Orbit Determination

Gaussian Least Squares Differential Correction (GLSDC)

[INSERT EQ. (6.53)]

[INSERT EQ. (6.54)]

$$F \equiv \frac{\partial\bm{f}}{\partial\bm{x}} = \begin{bmatrix}0_{3\times3} & I_{3\times3}\\ F_{21} & 0_{3\times3}\end{bmatrix} \tag{6.57}$$

$$\frac{\partial \bm{h}}{\partial\bm{x}} = \begin{bmatrix}H_{11} & 0_{3\times3}\end{bmatrix}$$
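
For reference (my addition; assumes pure two-body dynamics $\ddot{\bm{r}} = -\mu\,\bm{r}/\|\bm{r}\|^3$, whereas the book's force model may carry extra terms), the lower-left block is then the gravity-gradient matrix:

$$F_{21} = \frac{\partial}{\partial\bm{r}}\left(-\frac{\mu\,\bm{r}}{\|\bm{r}\|^3}\right) = -\frac{\mu}{\|\bm{r}\|^3}\,I_{3\times3} + \frac{3\mu\,\bm{r}\bm{r}^T}{\|\bm{r}\|^5}$$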

7.4. Target Tracking of Aircraft

The algorithm for orbit determination is essentially equivalent to the nonlinear fixed-point smoother in Sec. 5.1.3 with a covariance reset. [CHECK THIS!!!]

only need to analytically compute Eqs. 6.53, 6.54, 6.57, 6.67.

Appendix B. Matrix Properties

Appendix E. Computer Software

All the MATLAB codes are available at http://ancs.eng.buffalo.edu/index.php/Optimal_Estimation_of_Dynamic_Systems_2nd_Edition