Wednesday, May 17, 2017

fft - Spectrogram with square or non-square magnitude of STFT: power vs. magnitude


As seen in this question and answer, to do a spectrogram, it's common to plot either:



  • the squared magnitude $|\text{STFT}(\text{frame}, \text{bin})|^2$ ("power spectrum")

  • the magnitude $|\text{STFT}(\text{frame}, \text{bin})|$


In an audio context, when should one be chosen over the other, and what makes one more useful than the other?


In the context of noise reduction, there are both Magnitude Spectrum Subtraction and Power Spectrum Subtraction, as seen around pages 5-8 in http://dsp-book.narod.ru/304.pdf. When should one be used rather than the other?


Also, what do most audio editors display: the magnitude spectrum or the power spectrum?





Answer



In the image you gave, the right-hand scale is in decibels (dB). In the logarithmic domain, squaring becomes a multiplication by 2, since $\log |X|^2 = 2 \log |X|$. This is a simple rescaling that yields the same image, or at least the same relative dynamic range. So, in that case, it does not matter.
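To make the decibel argument concrete, here is a minimal NumPy sketch. The helper `stft_magnitude`, the test signal, and the frame and hop sizes are illustrative assumptions, not taken from the question. It checks that the usual dB conventions ($20\log_{10}$ for magnitude, $10\log_{10}$ for power) give the same values, and that even without those factors the two log-images coincide once each is normalized to its own color range.

```python
import numpy as np

def stft_magnitude(x, frame_len=256, hop=128):
    """Magnitude of a simple Hann-windowed STFT, shape (frames, bins).

    Hypothetical helper for illustration; frame_len and hop are arbitrary.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def normalize(img):
    """Map an image linearly onto [0, 1], as a colormap would."""
    return (img - img.min()) / (img.max() - img.min())

# Assumed test signal: a strong 440 Hz tone plus a weak 2 kHz tone.
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t) + 0.01 * np.sin(2 * np.pi * 2000 * t)

# Floor the magnitude to avoid log(0).
mag = np.maximum(stft_magnitude(x), 1e-12)
power = mag ** 2

# Usual dB conventions: identical values.
db_from_magnitude = 20 * np.log10(mag)
db_from_power = 10 * np.log10(power)

# Plain logarithms differ by a factor of 2, but the normalized images match.
img_magnitude = normalize(np.log10(mag))
img_power = normalize(np.log10(power))

print(np.allclose(db_from_magnitude, db_from_power))
print(np.allclose(img_magnitude, img_power))
```

On a linear (non-dB) color scale the two images would differ, which is where the choice of exponent starts to matter.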


As an aside, $x \to x^\alpha$ transformations with $x\ge 0$ (a non-negative signal, a magnitude spectrum, a spectrogram) are customary when signal components differ by orders of magnitude that human eyes or ears cannot take in at once: $\alpha > 1$ puts emphasis on the largest components, while $\alpha < 1$ relatively enhances smaller values, which is useful when you want to detect low-amplitude signals. Depending on whether you want to capture audio fingerprints or perform subtle audio forensics, one or the other should be chosen. Such $\alpha$-power transformations are also useful in your other spectral subtraction question, What is the name of this very simple spectral subtraction technique?
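The effect of the exponent can be seen on two components that differ by a factor of 100. The sketch below is only an illustration: the function name `power_law` and the sample values are assumptions, not part of the original answer.

```python
import numpy as np

def power_law(x, alpha):
    """Apply x -> x**alpha elementwise to non-negative data (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    if np.any(x < 0):
        raise ValueError("power_law expects non-negative input")
    return x ** alpha

# Assumed example: a strong component (1.0) and a weak one (0.01).
magnitudes = np.array([1.0, 0.01])

# Ratio weak/strong after the transformation, for several exponents.
ratios = {}
for alpha in (2.0, 1.0, 0.5):
    y = power_law(magnitudes, alpha)
    ratios[alpha] = y[1] / y[0]
    print(f"alpha = {alpha}: weak/strong ratio = {ratios[alpha]:.4g}")
```

With $\alpha = 2$ (power) the weak component drops to about $10^{-4}$ of the strong one, with $\alpha = 1$ (magnitude) it stays at $10^{-2}$, and with $\alpha = 0.5$ it rises to about $10^{-1}$, which is why smaller exponents help when looking for low-amplitude content.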


