This is my snoring detector again.
I've gotten pretty good at detecting a signal when there's anything there -- can track from a wall-peeling snore down to breathing you can't even hear in the recording. The problem is, I can't tell when the signal has dropped below detectable level and the app is just "hearing things". And, unfortunately, snoring/breathing is often irregular enough that a simple autocorrelation or similar interval timing scheme is unlikely to help much. (And it's actually likely that in some cases the noise is more regular than the breathing.)
So, are there any tricks I'm missing for figuring out when there is no signal? It kind of seems that I'm up against a hard place here, given the "signal" is so noise-like to begin with.
(And maybe this is related to another problem I'm having: Strangely, I can't accurately (or even approximately) measure the signal level even when it's fairly loud. Since I need to use rolling averages and ratios to detect the signal anyway, the level information kind of gets lost. I'm looking for some tricks to reconstitute it.)
Basic technique
(For Yoda)
The audio signal is sampled (generally at 8000 Hz, for various reasons), then FFTed in blocks of 1024 samples. (In my experiments Hamming windows and overlapping blocks seem to have little effect, though those may be revisited later.)
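A minimal sketch of this framing step, assuming non-overlapping, unwindowed blocks of 1024 samples. At 8000 Hz each block spans 128 ms and the FFT bins are about 7.8 Hz apart. The names `fft` and `block_spectra` are made up for illustration, and the small pure-Python FFT is only there to keep the example self-contained; a real app would use a library routine.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def block_spectra(samples, block_size=1024):
    """Split samples into non-overlapping blocks; return one magnitude
    spectrum (positive-frequency bins only) per complete block."""
    spectra = []
    for start in range(0, len(samples) - block_size + 1, block_size):
        bins = fft(samples[start:start + block_size])
        spectra.append([abs(c) for c in bins[:block_size // 2]])
    return spectra
```

With these assumptions, a 250 Hz tone sampled at 8000 Hz lands exactly on bin 32 (250 * 1024 / 8000).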
The FFT is divided into "bands" (currently 5, slightly skewed in size to place more detail on the low end) and the "spectral difference" and level of each band are summed. Long-term averages of the peak-limited values are used as "thresholds", and further bias adjustments are used to maintain a roughly 20% "over threshold" rate.
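One way the banding and thresholding described above could look, as a rough sketch only. Several things here are assumptions, not details from the description: "spectral difference" is taken to mean the summed absolute bin-to-bin change between consecutive blocks, the long-term average is an exponential moving average, and the bias adjustment is a simple stochastic quantile tracker that nudges the bias up on every "over" and down on every "under" so that the over rate settles near the target. The band edges passed to `band_features` and all constants are hypothetical.

```python
def band_features(prev_mag, mag, edges):
    """Per-band (level, spectral difference) from two consecutive
    magnitude spectra. edges holds the bin indices of the band boundaries."""
    features = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        level = sum(mag[lo:hi])
        diff = sum(abs(m - p) for m, p in zip(mag[lo:hi], prev_mag[lo:hi]))
        features.append((level, diff))
    return features

class AdaptiveThreshold:
    """Threshold = long-term average of peak-limited values times a bias
    that is steered toward a target over-threshold rate (assumed 20%)."""

    def __init__(self, target_rate=0.2, avg_alpha=0.01,
                 bias_step=0.01, peak_limit=4.0):
        self.target_rate = target_rate
        self.avg_alpha = avg_alpha      # smoothing of the long-term average
        self.bias_step = bias_step      # how fast the bias adapts
        self.peak_limit = peak_limit    # clip values above this many averages
        self.average = None
        self.bias = 1.0

    def update(self, value):
        if self.average is None:
            self.average = value
        # Peak-limit so a single loud block cannot drag the average up.
        limited = min(value, self.peak_limit * self.average)
        self.average += self.avg_alpha * (limited - self.average)
        over = value > self.average * self.bias
        # Raise the bias after an "over", lower it after an "under";
        # the two balance when the over rate equals target_rate.
        self.bias += self.bias_step * ((1.0 if over else 0.0) - self.target_rate)
        self.bias = max(0.0, self.bias)
        return over
```

In this sketch there would be one `AdaptiveThreshold` per band, fed with whichever per-band value (level, difference, or their sum) is being thresholded.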
Each "over threshold" value is given a weight of 1 (under threshold is given a weight of 0), but then that weight is adjusted by the apparent "variability" (at roughly 2Hz) in the band, to give more weight to bands that carry more apparent signal.
The weights of the bands are summed, and then the summed weights of successive blocks are summed over about a second to produce a running "score". This is again compared to a running average threshold (plus several heuristics) to detect snore onset/offset.
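The weighting and scoring steps can be sketched as follows. This assumes the per-band "variability" has already been measured elsewhere and is supplied as a plain multiplier (how it is derived at roughly 2 Hz is not shown), and that "about a second" means 8 blocks of 128 ms. The class name `RunningScore` and its parameters are hypothetical.

```python
from collections import deque

class RunningScore:
    """Sum of per-block band weights over a sliding window of blocks."""

    def __init__(self, window_blocks=8):
        # 8 blocks of 1024 samples at 8000 Hz is roughly one second.
        self.window = deque(maxlen=window_blocks)

    def update(self, over_flags, variability):
        # Each over-threshold band starts with weight 1 (others 0),
        # then is scaled by that band's variability factor.
        block_weight = sum(v for over, v in zip(over_flags, variability) if over)
        self.window.append(block_weight)
        return sum(self.window)
```

The returned score would then be compared against its own running-average threshold to mark snore onset and offset.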
Update
It suddenly occurred to me that if my algorithm effectively maintains a constant-level signal (per my signal level problem), the way to effectively gauge SNR is by measuring the noise when there's no signal.
Conveniently, snores are intermittent, with lots of "dead air" in-between. And I'm already detecting the snore envelopes. So anything outside of the envelope (between the end of one snore and the start of the next) is presumably noise! This I can (with some modest degree of accuracy/repeatability) measure. (It took three tries to come up with a halfway decent algorithm, of course -- reality never matches the theory.)
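A minimal sketch of the idea, not the algorithm that actually took three tries. It assumes one amplitude-like level per block plus a per-block flag saying whether the block falls inside a detected snore envelope. The median of the blocks outside the envelopes serves as the noise estimate, and a hypothetical `guard` margin skips blocks right next to an envelope edge, where the onset/offset detection is least trustworthy.

```python
import math
import statistics

def noise_floor(levels, in_snore, guard=1):
    """Median level of blocks that are at least `guard` blocks away
    from any detected snore; None if no such block exists."""
    n = len(levels)
    quiet = []
    for i in range(n):
        lo = max(0, i - guard)
        hi = min(n, i + guard + 1)
        if not any(in_snore[lo:hi]):
            quiet.append(levels[i])
    return statistics.median(quiet) if quiet else None

def snr_db(levels, in_snore, guard=1):
    """SNR proxy in dB: mean in-snore level over the gap noise floor.
    Uses 20*log10 because levels are assumed to be amplitudes, not powers."""
    noise = noise_floor(levels, in_snore, guard)
    loud = [lv for lv, s in zip(levels, in_snore) if s]
    if noise is None or noise <= 0 or not loud:
        return None
    return 20.0 * math.log10(statistics.mean(loud) / noise)
```

If the levels are powers rather than amplitudes, the factor would be 10 instead of 20. The median is used here only because it is less sensitive than a mean to an undetected breath hiding in the "dead air".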
So I don't have the full answer yet, but I've made progress.
(While the above technique gives me a fairly good proxy for SNR, I'm still having trouble estimating actual signal level. My "relative level" indications can be off the scale for a barely audible breath and so-so for a window rattler. I need some sort of proxy for absolute level.)