Sunday, May 21, 2017

speech - Is cross-correlation between a stationary random signal and noise zero for all lags?


If $x(n)$ is a stationary random process (say, a 20 ms window of speech, which can be assumed stationary) and $v(n)$ is noise uncorrelated with $x(n)$, can we say that the cross-correlation of $x(n)$ and $v(n)$ is always zero, for all lags? E.g.


$$r_{xv}(\tau) = E\{x(n) v(n+\tau)\} = 0, \forall \tau$$



Or,


$$\sum_{n=0}^{N}x(n) v(n+\tau) = 0, \forall \tau$$


where $N$ is the signal length?



Answer



As Marcus alludes to in the comments on the question, you are using two different definitions of cross-correlation.


The first definition of cross-correlation, the statistical one, is: $$r_{xv}(\tau) = E\{x(n) v(n+\tau)\} = 0, \forall \tau\tag{1}$$ where the expectation operator means: $$ E\{ z \} = \int_{-\infty}^{+\infty} z\, p_z(z)\, dz \tag{2} $$ and $p_z(z)$ is the probability density function of the random variable argument of $E\{ \cdot \}$, $z$ in equation (2). Equation (2) is sometimes called the ensemble average of $z$.


It is true that if $x$ and $v$ are uncorrelated (and at least one of them is zero-mean), then (1) is true.
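As a minimal numerical sketch (not from the original answer), we can approximate the ensemble average in equation (2) by averaging the product $x(n)\,v(n+\tau)$ over many independent realizations; for independently drawn (hence uncorrelated, zero-mean) signals the estimate tends to zero as the number of realizations grows:

```python
import numpy as np

# Monte Carlo approximation of the ensemble average E{x(n) v(n+tau)}
# for a fixed n and tau, averaging over many independent realizations.
rng = np.random.default_rng(0)

n_realizations = 100_000
# x and v are drawn independently, so they are uncorrelated by construction
x = rng.standard_normal(n_realizations)  # realizations of x(n)
v = rng.standard_normal(n_realizations)  # realizations of v(n + tau)

# Ensemble average: mean of the product across realizations
r_xv = np.mean(x * v)
print(r_xv)  # close to 0; the error shrinks like 1/sqrt(n_realizations)
```

The key point is that the average runs across *realizations*, not across time within one realization.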


The second definition in equation (3) (or the normalized version of it) is what we usually use because we don't know what $p_z(z)$ is. In order to use the second definition, we have to make additional assumptions about $x$ and $v$, chief among them ergodicity: that time averages over a single realization of each process can be substituted for the ensemble averages.


$$\sum_{n=0}^{N}x(n) v(n+\tau) = 0, \forall \tau \tag{3}$$


The normalized version is: $$\frac{1}{N}\sum_{n=0}^{N}x(n) v(n+\tau) = 0, \forall \tau \tag{4}$$



Now we are using time averages to find the cross-correlation. For any given realization of $x$ and $v$, we are not guaranteed that equation (3) (or (4)) is true.


However, what generally happens is that the mathematics surrounding any theory using cross-correlation uses (1) while the implementation uses (3).


That is why implementations of cross-correlation, such as xcorr in MATLAB, do not output a vector of zeros for uncorrelated input time series.
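A short numpy sketch (an illustration I am adding, not part of the original answer) of the time-average estimator in equation (4): for one finite realization of two uncorrelated noise signals, the estimates are small but not exactly zero at any lag, which is exactly the behavior seen from routines like xcorr:

```python
import numpy as np

rng = np.random.default_rng(1)

N = 4096
x = rng.standard_normal(N)  # one realization of "signal" noise
v = rng.standard_normal(N)  # one realization of uncorrelated noise

def sample_xcorr(x, v, tau):
    """Biased time-average estimate (1/N) * sum_n x(n) v(n + tau)."""
    N = len(x)
    if tau >= 0:
        return np.sum(x[:N - tau] * v[tau:]) / N
    return sample_xcorr(v, x, -tau)  # negative lags by symmetry

estimates = np.array([sample_xcorr(x, v, tau) for tau in range(-5, 6)])
print(estimates)  # on the order of 1/sqrt(N): small, but not zero
```

Increasing $N$ shrinks the estimates (roughly as $1/\sqrt{N}$), consistent with the ergodic argument: only in the limit of infinite averaging does the time average recover the ensemble value of zero.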




The confusion about cross-correlation seems to be widespread. As @msm notes in the comments, the Wikipedia page for cross-correlation makes no mention of its statistical origins.


The whole point of correlation (either auto or cross) is to see how similar one signal is to another, at various time-lags. For the auto-correlation case, the aim is to see how "predictable" the signal is --- knowing the values up until sample $n$, how much can I say about the sample values after time $n$ before they occur?


Let's start with a Wikipedia page that hasn't forgotten these origins.


This page starts with the original, normalized version of correlation (the Pearson correlation coefficient): $$ \rho_{XY} = \frac{\textrm{cov}(X,Y)}{\sigma_X\sigma_Y} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X\sigma_Y} $$


This is another point of confusion for many: statisticians generally require that the correlation is normalized to lie between $-1$ and $+1$. Signal processing engineers tend to do away with this requirement.


The trouble with $\rho_{XY}$ as defined above is that, as in the earlier part of my answer, taking the expectation requires knowledge of the statistical properties of the signals, which we often don't have or have to guess.



That is why the sample cross-correlation is usually substituted for the actual (ensemble) cross-correlation: $$ r_{XY} = \frac{\displaystyle\sum_{n=1}^{N} (x_n - \bar{x}) (y_n - \bar{y})}{N s_X s_Y} $$ where $s_X$ and $s_Y$ are the sample standard deviations and $\bar{x}$ and $\bar{y}$ are the sample means.
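A small check of this sample formula (my own illustration, under the assumption of one artificially correlated pair of signals), compared against numpy's corrcoef, which computes the same normalized quantity:

```python
import numpy as np

rng = np.random.default_rng(2)

N = 1000
x = rng.standard_normal(N)
y = 0.8 * x + 0.6 * rng.standard_normal(N)  # true correlation is 0.8

# Sample correlation coefficient: sum of products of deviations,
# normalized by N and the sample standard deviations.
r_xy = np.sum((x - x.mean()) * (y - y.mean())) / (N * x.std() * y.std())

print(r_xy)                      # close to the true value 0.8
print(np.corrcoef(x, y)[0, 1])   # agrees with numpy's estimate
```

Note that, unlike the unnormalized estimator in equation (3), this quantity is guaranteed to lie in $[-1, 1]$, matching the statisticians' convention.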

