I always hear that the wavelet transform is not shift invariant, and that there are other types of wavelet transform, like the stationary wavelet transform and the double-density dual-tree wavelet transform, that are shift invariant.
Can anyone explain to me what "shift invariant" means?
Answer
Let's say you have a signal of length N=32 which is all zeros except for a spike at one point, where x(8)=1. If you perform a multi-level DWT on this signal (say, three or more levels with a 4-tap Daubechies wavelet) and then calculate the energy in one of the coarser detail subbands (by summing the squares of the coefficients at that scale), you will get a value - call it "E1".
Now, let's take another signal which is still all zeros except at one point, n=12, where x(12)=1 - JUST LIKE THE FIRST SIGNAL. It is the same signal except it is shifted by 4 points. If you take the same DWT of this signal and calculate the energy in the same subband (again by summing the squares of the coefficients), you get another value - call it "E2".
You might expect these two energies to be the same - but in general they're not! E1 will NOT equal E2, so the DWT is NOT "shift-invariant": the way the energy is spread across the scales "varies" whenever you "shift" the incoming signal - even though it's basically the same signal. (For an orthogonal wavelet the TOTAL energy over all coefficients does stay the same, by Parseval's relation. What changes is the split between subbands, and the coefficient values themselves. The exception is a shift by a multiple of 2^J in a J-level transform, which only shifts the coefficients.)
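Here is a minimal sketch of that experiment in plain Python. The wavelet (4-tap Daubechies, "db2"), the periodic boundary handling, the three decomposition levels and the 0-based indexing are all assumptions made for illustration; none of them are fixed by the description above.

```python
import math

# Daubechies 4-tap (db2) low-pass filter H and its quadrature-mirror
# high-pass filter G. The wavelet choice is an assumption for illustration.
S3, S2 = math.sqrt(3.0), math.sqrt(2.0)
H = [(1 + S3) / (4 * S2), (3 + S3) / (4 * S2),
     (3 - S3) / (4 * S2), (1 - S3) / (4 * S2)]
G = [H[3], -H[2], H[1], -H[0]]


def dwt_step(x):
    """One level of a periodic, critically sampled DWT (filter, keep every 2nd output)."""
    n = len(x)
    approx = [sum(H[k] * x[(2 * i + k) % n] for k in range(4)) for i in range(n // 2)]
    detail = [sum(G[k] * x[(2 * i + k) % n] for k in range(4)) for i in range(n // 2)]
    return approx, detail


def subband_energies(x, levels=3):
    """Energies of detail levels 1..levels, followed by the final approximation."""
    energies, approx = [], list(x)
    for _ in range(levels):
        approx, detail = dwt_step(approx)
        energies.append(sum(v * v for v in detail))
    energies.append(sum(v * v for v in approx))
    return energies


def spike(n, pos):
    """All-zero signal of length n with a single 1.0 at index pos (0-based)."""
    x = [0.0] * n
    x[pos] = 1.0
    return x


e_a = subband_energies(spike(32, 8))
e_b = subband_energies(spike(32, 12))

print("total energy:", sum(e_a), sum(e_b))
for name, a, b in zip(["detail 1", "detail 2", "detail 3", "approx 3"], e_a, e_b):
    print(name, round(a, 4), round(b, 4))
```

Under these assumptions the two totals agree, and so do the level-1 and level-2 detail energies (a shift of 4 samples is a multiple of both 2 and 4). The level-3 energies differ, because at that depth the shift is no longer a multiple of the downsampling factor 8.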
Transforms like the undecimated (stationary) wavelet transform are "shift-invariant", and the dual-tree complex discrete wavelet transform is very nearly so. This means that you can shift the signal around before calculating the transform and the energy at each scale will still be the same (or almost the same). This is important for all sorts of reasons, but if you think about it, shift-invariance is a really good thing to have because it means that it doesn't matter WHERE in the signal you start calculating - the result will be the same apart from the shift.
This page has a pretty good description and demonstrations of non-shift-invariant transforms in 1D, 2D, and 3D.
Here's a really simplistic example that (I hope) illustrates the main idea of shift-invariance:
Suppose you are trying to find the wavelet transform of a piece of music you have, but the WAV file you used has 5 seconds of silence before the music starts. You get the wavelet transform coefficients and graph them nicely on a scalogram (a time/scale graph - which you can think of as a time/frequency graph). You notice that it indicates 5 seconds of silence (0 values) before the music starts.
Now you edit the WAV file to trim the 5 seconds of silence down to 1 second (you are shifting the signal in time by doing this - hence "shift" in shift-invariance). You take the discrete wavelet transform again and plot it in a scalogram and you expect to see the same thing except with less silence - you expect it to just be shifted over to the left, but...
It won't be the same! The coefficients are not just a shifted copy of the first set, and in general even the energy at each scale won't be the same, because the DWT by itself is NOT shift-invariant.
But if you use the undecimated DWT (UDWT, also called the "stationary" wavelet transform), then the two images will be the same except the second one will not have all that silence in the beginning. You could take a screenshot of the first one and crop it in Photoshop to get the same image as in the second one.
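The "crop the screenshot" idea can be checked on a toy signal. This is only a sketch under some assumptions of mine: a single decomposition level, a 4-tap Daubechies (db2) high-pass filter, a made-up 16-sample "music" burst, and a circular shift standing in for trimming the silence.

```python
import math

# db2 low-pass filter H and its quadrature-mirror high-pass filter G
# (the wavelet choice is an assumption for illustration).
S3, S2 = math.sqrt(3.0), math.sqrt(2.0)
H = [(1 + S3) / (4 * S2), (3 + S3) / (4 * S2),
     (3 - S3) / (4 * S2), (1 - S3) / (4 * S2)]
G = [H[3], -H[2], H[1], -H[0]]


def shift(x, s):
    """Circularly delay x by s samples (stand-in for adding/removing silence)."""
    s %= len(x)
    return x[-s:] + x[:-s] if s else list(x)


def dwt_detail(x):
    """Level-1 detail of the decimated DWT: filter, then keep every 2nd output."""
    n = len(x)
    return [sum(G[k] * x[(2 * i + k) % n] for k in range(4)) for i in range(n // 2)]


def udwt_detail(x):
    """Level-1 detail of the undecimated DWT: same filter, no downsampling."""
    n = len(x)
    return [sum(G[k] * x[(i + k) % n] for k in range(4)) for i in range(n)]


def same(a, b, tol=1e-12):
    return len(a) == len(b) and all(abs(p - q) <= tol for p, q in zip(a, b))


# Hypothetical toy "music": silence, a short burst, then silence again.
signal = [0.0] * 5 + [1.0, -2.0, 3.0, 0.5] + [0.0] * 7
moved = shift(signal, 1)

# UDWT: transforming the shifted signal equals shifting the transform.
udwt_equivariant = same(udwt_detail(moved), shift(udwt_detail(signal), 1))

# DWT: is the new detail band any circular shift of the old one?
d0, d1 = dwt_detail(signal), dwt_detail(moved)
dwt_is_shifted_copy = any(same(d1, shift(d0, s)) for s in range(len(d0)))

print("UDWT output just shifts:", udwt_equivariant)
print("DWT output just shifts: ", dwt_is_shifted_copy)
```

Skipping the downsampling step is what buys the shift-invariance here, at the cost of producing as many coefficients per level as there are input samples.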
The dual-tree complex DWT (DTCDWT) is near-shift-invariant, meaning it's very close to being shift-invariant.
The term literally means "does not vary when shifted".
Hope that helps!