I am trying to write an algorithm that automatically segments audio recordings of bird calls. My input data are one-minute wave files, and as output I would like to get the separate calls for further analysis. The problem is that the signal-to-noise ratio is quite poor because of environmental conditions and a low-quality microphone (mono, 8 kHz sampling).
I would be most grateful for any advice on how to proceed further with noise reduction.
Here is an example of my input, a one-minute audio recording in wave format: http://goo.gl/16fG8P
This is what the signal looks like:
Band-pass filtering, in which I keep only the 1500-2500 Hz band, does improve the situation, but the result is still far from what I expected. A lot of noise is still present in this band.
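To make the band-pass step concrete, here is a minimal sketch. It assumes a crude brick-wall mask in the FFT domain rather than whatever filter design is actually used in the project; `bandpass_fft` and its parameters are hypothetical names, and a Butterworth design from `scipy.signal` would be the more usual choice.

```python
import numpy as np

def bandpass_fft(signal, sample_rate, low_hz=1500.0, high_hz=2500.0):
    """Zero every FFT bin outside [low_hz, high_hz] and transform back.

    This is a brick-wall filter: simple, but it can ring around sharp
    transients. It is only meant to illustrate the idea.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    mask = (freqs >= low_hz) & (freqs <= high_hz)
    return np.fft.irfft(spectrum * mask, n=len(signal))

# Synthetic check: a 2 kHz "call" plus 300 Hz "wind" at 8 kHz sampling.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
call = np.sin(2 * np.pi * 2000 * t)
wind = np.sin(2 * np.pi * 300 * t)
filtered = bandpass_fft(call + wind, sample_rate)
```

This removes everything outside the band, but any noise that falls between 1500 and 2500 Hz passes through untouched, which matches what I see in the filtered signal.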
I have also plotted the frame-wise average energy (over 32-sample intervals, i.e. 4 ms at 8 kHz) and removed some clicks from it. Here is the result:
With all the remaining noise, I have to set a very low threshold in the onset detection algorithm to pick up the last 10 seconds of bird calls. The problem is that if I tune it this way, I can get a load of false positives on the next recording.
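The energy-plus-threshold step can be sketched as follows. This is only an illustration under assumptions: the threshold is set relative to the median frame energy of each recording (one possible way to make it less dependent on the recording, not something I have validated), and `frame_energy`, `detect_segments` and `factor` are hypothetical names.

```python
import numpy as np

def frame_energy(signal, frame_len=32):
    """Mean squared amplitude of each non-overlapping frame."""
    n_frames = len(signal) // frame_len
    frames = np.asarray(signal[:n_frames * frame_len], dtype=float)
    return np.mean(frames.reshape(n_frames, frame_len) ** 2, axis=1)

def detect_segments(energy, factor=4.0):
    """Return (start, end) frame pairs where energy > factor * median.

    The end index is exclusive. The median serves as a rough noise-floor
    estimate, assuming calls occupy less than half of the recording.
    """
    energy = np.asarray(energy, dtype=float)
    active = energy > factor * np.median(energy)
    # Pad with False so every run has both a rising and a falling edge.
    padded = np.concatenate(([False], active, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return list(zip(starts.tolist(), ends.tolist()))

# Synthetic check: weak noise with a 2 kHz burst in samples 3200..4799.
rng = np.random.default_rng(0)
signal = 0.05 * rng.standard_normal(8000)
signal[3200:4800] += np.sin(2 * np.pi * 2000 * np.arange(1600) / 8000)
energy = frame_energy(signal)
segments = detect_segments(energy)
```

On this synthetic input the burst comes back as a single segment spanning frames 100 to 150, i.e. samples 3200 to 4800. Real recordings with wind gusts are of course much less forgiving.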
A moving average filter helps a bit with wind noise. Any other ideas? I was thinking of spectral subtraction, but it seems to me I have a chicken-and-egg problem here: to find a noise-only region I have to segment the audio, and to segment the audio I need to remove the noise. Do you know of any libraries that implement this algorithm, or of any implementations in pseudo-code? I believe Audacity uses such a method to remove noise. It is very effective, but it leaves it to the user to mark a noise-only region.
I am writing in Python, and this is a free, open-source project.
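To show what I mean by spectral subtraction, here is a minimal sketch of magnitude subtraction with overlap-add. To sidestep the chicken-and-egg problem it assumes that the quietest 10% of frames contain only noise, which is my guess and not how Audacity does it (there the user marks the noise region). The names `spectral_subtract`, `noise_fraction` and `floor` are hypothetical.

```python
import numpy as np

def spectral_subtract(signal, frame_len=256, noise_fraction=0.1, floor=0.05):
    """Subtract an estimated noise magnitude spectrum from every frame.

    Assumes len(signal) >= frame_len. Samples after the last full frame
    are left at zero, and the first and last half frames are attenuated
    by the window.
    """
    signal = np.asarray(signal, dtype=float)
    hop = frame_len // 2
    # Periodic Hann window: copies shifted by half a frame sum to 1.
    window = np.hanning(frame_len + 1)[:-1]
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectra = np.fft.rfft(frames, axis=1)
    magnitude = np.abs(spectra)
    phase = np.angle(spectra)

    # Noise estimate: mean magnitude spectrum of the lowest-power frames.
    frame_power = np.sum(magnitude ** 2, axis=1)
    n_noise = max(1, int(noise_fraction * n_frames))
    quietest = np.argsort(frame_power)[:n_noise]
    noise_mag = magnitude[quietest].mean(axis=0)

    # Subtract, but keep a small spectral floor instead of going negative.
    cleaned_mag = np.maximum(magnitude - noise_mag, floor * magnitude)
    cleaned = np.fft.irfft(cleaned_mag * np.exp(1j * phase),
                           n=frame_len, axis=1)

    # Overlap-add the processed frames back into one signal.
    out = np.zeros(len(signal))
    for i in range(n_frames):
        out[i * hop:i * hop + frame_len] += cleaned[i]
    return out

# Synthetic check: white noise, with a 2 kHz tone in the second half.
rng = np.random.default_rng(0)
noisy = 0.1 * rng.standard_normal(8000)
noisy[4000:] += np.sin(2 * np.pi * 2000 * np.arange(4000) / 8000)
denoised = spectral_subtract(noisy)
```

The obvious weakness is the assumption itself: if wind gusts make the noise non-stationary, the quietest frames underestimate it, and the subtraction leaves "musical noise" behind.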
Thanks for reading!