I have a set of songs for which I extracted the STFT (Short-Time Fourier Transform) and used the magnitude spectrum $|S|$ to calculate the mel spectrogram with a mel filterbank matrix $M$, so $X=\log(M\times |S|)$. Is there any method to reverse this process, i.e. to convert from the mel spectrogram back to the spectrogram? I performed some dimensionality reduction on the mel spectrogram and then reconstructed the mel spectrogram from the lower-dimensional representation. Now I want to regenerate the audio signal from the reconstructed mel spectrogram, so I guess I first have to reconstruct the spectrogram and then the audio signal.
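To make the shapes concrete, here is a minimal NumPy sketch of the forward mapping $X=\log(M\times |S|)$ described above. The array sizes, the random stand-ins for $|S|$ and $M$, the helper name `log_mel`, and the small `eps` added before the logarithm are all assumptions for illustration, not part of the original setup.

```python
import numpy as np

def log_mel(S_mag, M, eps=1e-10):
    """Return X = log(M @ |S|). eps guards against log(0) and is an added assumption."""
    return np.log(M @ S_mag + eps)

rng = np.random.default_rng(0)
n_freq, n_mels, n_frames = 513, 40, 100  # hypothetical sizes

# Random non-negative stand-ins; a real pipeline would use an STFT magnitude
# and a proper mel filterbank here.
S_mag = np.abs(rng.standard_normal((n_freq, n_frames)))
M = rng.random((n_mels, n_freq))

X = log_mel(S_mag, M)
print(M.shape, X.shape)
```

With these sizes $M$ maps 513 frequency bins down to 40 mel bands, which is why it has no ordinary inverse.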
The problem is that the mel filterbank matrix is not square, since it reduces the number of frequency bins, so the inverse of $M$ cannot be used directly as in $\hat{S}=M^{-1}\exp(X)$. Is there any way to generate the inverse mapping, like some inverse transfer function that can convert from $X$ to $S$?
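One possible starting point, offered as a sketch rather than a definitive answer, is to replace $M^{-1}$ with the Moore-Penrose pseudo-inverse $M^{+}$, which gives the minimum-norm least-squares estimate $\hat{S}=M^{+}\exp(X)$. The helper name `approx_invert_mel`, the toy sizes, and the random stand-in for $M$ are assumptions; the clipping step is added because the pseudo-inverse can produce negative values, which are not valid magnitudes.

```python
import numpy as np

def approx_invert_mel(X, M):
    """Approximate |S| from X = log(M @ |S|) via the pseudo-inverse of M."""
    mel_energy = np.exp(X)                   # undo the logarithm
    S_hat = np.linalg.pinv(M) @ mel_energy   # minimum-norm least-squares estimate
    return np.maximum(S_hat, 0.0)            # clip: magnitudes cannot be negative

rng = np.random.default_rng(0)
n_freq, n_mels, n_frames = 64, 16, 10        # hypothetical toy sizes

M = rng.random((n_mels, n_freq))             # stand-in for a real mel filterbank
S_mag = np.abs(rng.standard_normal((n_freq, n_frames)))
X = np.log(M @ S_mag)

S_hat = approx_invert_mel(X, M)
print(S_hat.shape)
```

Because $M$ discards information, $\hat{S}$ is only one of many spectrograms consistent with $X$, not the original $|S|$. It is also magnitude-only, so getting back to audio would still require estimating the phase, for example with an iterative method such as Griffin-Lim.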