I have read the description of the Kalman filter, but I am not clear on how it comes together in practice. It appears to be targeted primarily at mechanical or electrical systems, since it requires linear state transitions, and for the same reason it does not seem useful for anomaly detection or for locating state transitions. Is that correct? In practice, how does one typically find the components that are expected to be known in advance in order to use a Kalman filter? I have listed the components below; please correct me if my understanding of what needs to be known in advance is incorrect.
I believe these do not need to be known "in advance":
- Process noise $\mathbf w$
- Observation noise $\mathbf v$
- Actual state $\mathbf x$ (this is what the Kalman filter tries to estimate)
I believe these need to be known "in advance" to use a Kalman filter:
- The linear state transition model which we apply to $\mathbf x$ (we need to know this in advance, so our states must be governed by known laws, i.e. the Kalman filter is useful for correcting measurements when the transition from one state to another is well understood and deterministic up to a bit of noise - it is not an anomaly finder or a tool to find random state changes)
- Control vector $\mathbf u$
- Control input model which is applied to control vector $\mathbf u$ (we need to know this in advance, so to use a Kalman filter we also need to know in advance how our control values affect the model, up to at most some Gaussian noise, and the effect needs to be linear)
- Covariance $\mathbf Q$ of the process noise (which appears to be time dependent in the Wikipedia article, i.e. it depends on the time $k$) - it appears we need to know this in advance and over time; I assume in practice it is taken as being constant?
- A (linear) observation model $\mathbf H$
- Covariance $\mathbf R$ of the observation noise (which also appears to be time dependent in the Wikipedia article) - similar issues to $\mathbf Q$
P.S. And yes I know many of these depend on time, I just dropped all the subscript clutter. Feel free to imagine small letter $k$ to the right and down from each variable name if you would like to.
Answer
For some context, let's go back to the Kalman Filter equations:
$$\begin{aligned}
\mathbf{x}(k+1) &= \mathbf{F}(k) \mathbf{x}(k) + \mathbf{G}(k) \mathbf{u}(k) + \mathbf{w}(k) \\
\mathbf{z}(k) &= \mathbf{H}(k) \mathbf{x}(k) + \mathbf{v}(k)
\end{aligned}$$
In short, for a plain vanilla KF:
$\mathbf{F}(k)$ must be fully defined. This comes straight from the differential equations of the system. If it is not known, you have a dual estimation problem (i.e. you must estimate both the state and the system model). If you don't have differential equations of the system, then a KF isn't for you!
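As an illustration of what "straight from the differential equations" can mean, here is a minimal sketch for a hypothetical constant-velocity model (state = position and velocity, with position changing at the rate given by the velocity and the velocity itself constant), sampled every `dt` seconds. The model, the function names and the numbers are assumptions chosen for the example, not something taken from the question.

```python
def constant_velocity_F(dt):
    """Discrete transition matrix for the state [position, velocity].

    Continuous model: d(position)/dt = velocity, d(velocity)/dt = 0.
    Integrating exactly over one sample period dt gives
        position(k+1) = position(k) + dt * velocity(k)
        velocity(k+1) = velocity(k)
    """
    return [[1.0, dt],
            [0.0, 1.0]]


def propagate(F, x):
    """Multiply matrix F (a list of rows) by the state vector x."""
    return [sum(f * xi for f, xi in zip(row, x)) for row in F]


F = constant_velocity_F(dt=0.1)
x_next = propagate(F, [2.0, 5.0])   # position 2, velocity 5, one step later
```

Here $\mathbf{F}$ happens to be constant because `dt` is fixed; with irregular sampling it would be recomputed at every step, which is where the $k$ dependence comes from.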
$\mathbf{x}(k)$ is, by definition, unknowable. After all, if you knew it, it wouldn't be an estimation problem!
The control vector $\mathbf{u}(k)$ must be fully defined. Without additional system modelling, the only uncertainty permitted on the control vector is additive white Gaussian noise (AWGN), which may be incorporated into the process noise. The known matrix $\mathbf{G}(k)$ relates the control input to the states - for example, how aileron movement affects the roll of an aircraft. This is mathematically modelled as part of the KF development.
The system process noise $\mathbf{w}(k)$ is also, by definition, unknowable (since it is random noise!). However, the statistics of the noise must be known: for a plain vanilla KF, it must be zero-mean AWGN with known covariance $\mathbf{Q}(k)$. Sometimes the covariance of the noise changes between samples, but in many cases it is fixed and therefore $\mathbf{Q}$ is a constant. In some instances it will be known, but in many instances it will be "tuned" during system development.
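To give a feel for what "tuning" $\mathbf{Q}$ does, the sketch below takes a hypothetical scalar system with $F = H = 1$ and iterates the standard covariance recursion until the Kalman gain settles. The function name and the numbers are illustrative assumptions; the only point is that a larger $Q$ relative to $R$ makes the filter lean more heavily on the measurements.

```python
def steady_state_gain(q, r, iterations=500):
    """Iterate the scalar covariance recursion (F = H = 1) and return the gain."""
    p = 1.0          # initial estimate variance (arbitrary starting value)
    k = 0.0
    for _ in range(iterations):
        p_pred = p + q                  # predict: uncertainty grows by Q
        k = p_pred / (p_pred + r)       # gain: weight given to the measurement
        p = (1.0 - k) * p_pred          # update: uncertainty shrinks
    return k


low_q_gain = steady_state_gain(q=0.01, r=1.0)   # model trusted: small gain
high_q_gain = steady_state_gain(q=1.0, r=1.0)   # model distrusted: larger gain
```

In practice the tuning loop is usually the reverse: pick a $Q$, look at how sluggish or how noisy the estimate is on recorded data, and adjust.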
Observations are a similar story. The matrix that relates your measurements to the states $\mathbf{H}(k)$ must be fully defined. Your measurements $\mathbf{z}(k)$ are also known because that's the reading from your sensors!
The sensor measurements, however, are corrupted by AWGN $\mathbf{v}(k)$, which, being random noise, is by definition unknown. The statistics of the noise must be known: zero mean with covariance $\mathbf{R}(k)$. Once again, the covariance may change with time, but for many applications it is a fixed value. Often, your sensors will have known noise characteristics from the datasheet. Otherwise, it is not too hard to measure the mean and variance of the sensor noise yourself. Yes, this can also be "tuned" empirically.
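If the datasheet does not give a noise figure, one common approach (an assumption here, not something prescribed above) is to log readings while the measured quantity is held constant and use the sample variance as $R$. The readings below are made-up numbers for illustration.

```python
import statistics

# Hypothetical readings logged while the true quantity was held constant.
readings = [10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7, 10.0]

sample_mean = statistics.mean(readings)   # compare with the known true value
R = statistics.variance(readings)         # unbiased sample variance, used as R
```

If `sample_mean` differs from the known true value, that offset is a sensor bias and should be subtracted from the readings, since the filter assumes zero-mean noise.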
There are a huge number of "tricks" that can be done to work around the restrictions in a plain vanilla KF, but these are far beyond the scope of this question.
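For reference, a plain vanilla KF with every "known in advance" item supplied can be sketched for a scalar state as follows. All names and numbers are illustrative assumptions; `f`, `g`, `h`, `q` and `r` play the roles of $\mathbf{F}$, $\mathbf{G}$, $\mathbf{H}$, $\mathbf{Q}$ and $\mathbf{R}$ and are held constant. The simulated system exists only so that there is something to filter.

```python
import random


def kalman_step(x_est, p_est, u, z, f, g, h, q, r):
    """One predict/update cycle of a scalar Kalman filter."""
    # Predict using the known model and the known control input.
    x_pred = f * x_est + g * u
    p_pred = f * p_est * f + q
    # Update using the measurement z.
    innovation = z - h * x_pred
    s = h * p_pred * h + r              # innovation variance
    k = p_pred * h / s                  # Kalman gain
    x_new = x_pred + k * innovation
    p_new = (1.0 - k * h) * p_pred
    return x_new, p_new


random.seed(0)
f, g, h = 1.0, 0.1, 1.0                 # model terms, known in advance
q, r = 0.01, 0.25                       # noise variances, known or tuned
x_true, x_est, p_est = 0.0, 0.0, 1.0    # true state, initial guess and its variance
raw_sq_err = est_sq_err = 0.0
for _ in range(200):
    u = 1.0                             # known control input
    # The true state and the noise samples are never seen by the filter.
    x_true = f * x_true + g * u + random.gauss(0.0, q ** 0.5)
    z = h * x_true + random.gauss(0.0, r ** 0.5)
    x_est, p_est = kalman_step(x_est, p_est, u, z, f, g, h, q, r)
    raw_sq_err += (z - x_true) ** 2     # error of the raw measurement
    est_sq_err += (x_est - x_true) ** 2 # error of the filtered estimate
```

Note that the filter function only ever receives `u`, `z` and the model terms; `x_true` and the individual noise samples appear only in the simulation, which mirrors the split between "known" and "unknowable" described above.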
Afterthought:
Whilst googling for "Kalman Filter" results in a million hits, there are a couple of things that I think are worth looking at. The Wikipedia page is too cluttered to learn from effectively :(
On AVR Freaks, there is an "equation free" intro to the Kalman Filter that I wrote some time ago to try to introduce where it is used for real.
If you're not afraid of maths, there are several books worth reading that are at the senior undergraduate/early postgraduate level. Try Brown and Hwang, which includes all the theory and plenty of example systems. The other, which comes highly recommended but which I have not read, is Gelb, which has the distinct advantage of being cheap!