window - Moving average vs. Moving median

Thursday, July 26, 2018

I have read in many places that a moving median is somewhat better than a moving average for some applications, because it is less sensitive to outliers.
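A minimal synthetic sketch of that claim, not taken from any of the sources I read: a constant signal with a single outlier sample, smoothed with a centered window. The helper name `moving`, the signal length and the window width of 5 are arbitrary choices made for illustration.

```python
import numpy as np

def moving(x, width, func):
    """Apply func over a centered window of `width` samples (truncated at the edges)."""
    half = width // 2
    return np.array([func(x[max(i - half, 0):i + half + 1]) for i in range(len(x))])

# Constant signal with a single outlier sample in the middle
signal = np.ones(21)
signal[10] = 100.0

med = moving(signal, 5, np.median)  # moving median
avg = moving(signal, 5, np.mean)    # moving average

print(med.max())  # the outlier never reaches the median output
print(avg.max())  # the mean spreads the outlier over every window that contains it
```

In this toy case every window holds at most one outlier among at least three samples, so the median stays at the baseline, while the average is pulled up wherever the window overlaps the outlier.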

I wanted to test this claim on real data, but I cannot see the effect in the resulting plot (green: median, red: average).

The sample audio data is test.wav, and this is the Python code:

import numpy as np
from scipy.io.wavfile import read
import matplotlib.pyplot as plt

WINDOW = 1000  # window width in samples

(fs, x) = read('test.wav')  # assumes a mono file

# Convert to float before rectifying: abs() on int16 overflows at -32768,
# and integer output arrays would truncate the mean.
x = np.abs(x.astype(np.float64))
env = np.zeros_like(x)   # moving median
env2 = np.zeros_like(x)  # moving average

# Trailing window: the current sample plus up to WINDOW previous samples
for i in range(len(x)):
    w = x[max(i - WINDOW, 0):i + 1]
    env[i] = np.median(w)
    env2[i] = np.mean(w)

plt.plot(env, color='green')
plt.plot(env2, color='red')
plt.show()

I have tried various values for the window width (1000 in the code above), and the result was always the same: the moving median is no less sensitive to outliers than the moving average.

The same happens with a window width of 10000, which is far larger than the spike width (10000 >> spike width).
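The reason I compare the window width with the spike width: as far as I understand, a moving median can only reject a spike that fills less than half of the window, because otherwise the spike samples are themselves the majority. A small synthetic check of that rule, with made-up sizes (a window of 21 samples, spikes of 5 and 15 samples) and hypothetical helpers `moving_median` and `spike_signal`:

```python
import numpy as np

def moving_median(x, width):
    """Centered moving median, with the window truncated at the edges."""
    half = width // 2
    return np.array([np.median(x[max(i - half, 0):i + half + 1]) for i in range(len(x))])

def spike_signal(n, start, spike_width, height=100.0):
    """Baseline of ones with a rectangular spike of the given width."""
    x = np.ones(n)
    x[start:start + spike_width] = height
    return x

# Spike narrower than half the window: removed by the median
narrow = moving_median(spike_signal(101, 40, 5), 21)
# Spike wider than half the window: survives at full height
wide = moving_median(spike_signal(101, 40, 15), 21)

print(narrow.max())
print(wide.max())
```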

Can you provide an example showing that a moving median is less sensitive to outliers than a moving average? If possible, please use the sample test.wav file as the data set.

In other words, is it possible to apply a moving median to this data so that the result looks like this yellow curve (i.e. with no spike)?
