I have a floating-point digital signal processing system that operates at a fixed sample rate of $f_s = 32768$ samples per second, implemented on an x86-64 processor. Assuming that the DSP system is synchronously locked to whatever matters, what is the best way to implement a digital oscillator at some frequency $f$?
Specifically, I want to generate the signal: $$y(t) = \sin(2\pi f t)$$ where $t=n/f_s$ for sample number $n$.
One idea is to keep track of a vector $(x,y)$ which we rotate by an angle $\Delta\phi = 2\pi f/f_s$ on each clock cycle.
As a Matlab pseudocode implementation (the real implementation is in C):
%% Initialization code
f_s = 32768;             % sample rate [Hz]
f = 19.875;              % some constant frequency [Hz]
v = [1; 0];              % initial condition (column vector: [cos; sin])
d_phi = 2*pi * f / f_s;  % change in angle per clock cycle
% initialize the rotation matrix (only once):
R = [cos(d_phi), -sin(d_phi); ...
     sin(d_phi),  cos(d_phi)];
Then, on each clock cycle, we rotate the vector around a little bit:
%% in-loop code
while true
    v = R*v;       % rotate the vector by d_phi
    y = v(2);      % the sine component: this is the sine wave we're generating
    output(y);     % (pseudocode) write the sample to the output
end
This allows the oscillator to be computed with only 4 multiplications per cycle. However, I'd worry about phase error and amplitude stability. (In simple tests I was surprised that the amplitude did not die or explode immediately -- maybe the sincos instruction guarantees $\sin^2+\cos^2=1$?).
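A minimal sketch of the kind of test I mean, reusing v and R from the initialization above and just tracking how far the vector's magnitude drifts from 1 over many iterations:

n_iter = 1e6;                            % number of rotations to apply
amp_err = zeros(n_iter, 1);
for n = 1:n_iter
    v = R*v;                             % same recursion as in the loop above
    amp_err(n) = hypot(v(1), v(2)) - 1;  % deviation from unit amplitude
end
plot(amp_err);                           % any growth or decay of the amplitude shows up here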
What is the right way to do this?
Answer
You're right that the strictly recursive approach is vulnerable to error accumulation as the number of iterations increases. A more robust and typical way to do this is to use a numerically-controlled oscillator (NCO). Basically, you keep an accumulator that tracks the instantaneous phase of the oscillator, updated as follows:
$$ \delta = \frac{2\pi f}{f_s} $$
$$ \phi[n] = (\phi[n-1] + \delta) \mod 2\pi $$
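As a minimal sketch (reusing f and f_s from the question), the accumulator update is just:

delta = 2*pi * f / f_s;          % phase increment per sample
phi = 0;                         % phase accumulator
% ... then, once per output sample:
phi = mod(phi + delta, 2*pi);    % accumulate and wrap the phase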
At each time instant, then, you're left with converting the accumulated phase in the NCO to the desired sinusoidal outputs. How you do this depends on your requirements for computational complexity, accuracy, etc. One obvious way is to just calculate the outputs as
$$ x_c[n] = \cos(\phi[n]) $$
$$ x_s[n] = \sin(\phi[n]) $$
using whatever implementation of sine/cosine you have available. In high-throughput and/or embedded systems, the mapping from phase to sine/cosine values is often done via a lookup table. The size of the lookup table (i.e. the amount of quantization you do on the phase argument to sine and cosine) can be used as a tradeoff between memory consumption and approximation error. The nice thing is that the amount of computation required is typically independent of the table size. In addition, you can limit your LUT size if needed by taking advantage of the symmetry inherent in the cosine and sine functions; you only really need to store one-fourth of a period of the sampled sinusoid.
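A sketch of the LUT-based conversion, continuing the accumulator above (N = 1024 is an arbitrary table size chosen here for illustration):

N = 1024;                        % table size: trades memory against phase-quantization error
lut = sin(2*pi*(0:N-1)/N);       % one full period; symmetry lets you store only a quarter of this
% ... once per output sample, after updating phi:
idx = floor(phi/(2*pi) * N);     % quantize the phase to a table index, 0..N-1
y = lut(idx + 1);                % table lookup (MATLAB indexing starts at 1)
output(y);                       % pseudocode output, as in the question
% (the direct alternative is simply y = sin(phi);)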
If you need higher accuracy than a reasonably-sized LUT can give you, then you can always look at interpolation between table samples (linear or cubic interpolation, for example).
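For example, the lookup step above could be replaced with linear interpolation between neighboring table entries (a sketch, using the same lut and N):

pos  = phi/(2*pi) * N;           % fractional table position
i0   = floor(pos);               % index of the lower neighboring entry
frac = pos - i0;                 % fractional distance to the next entry
y0 = lut(mod(i0,     N) + 1);
y1 = lut(mod(i0 + 1, N) + 1);    % wrap around the end of the table
y  = y0 + frac*(y1 - y0);        % linearly interpolated output sample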
Another benefit of this structure is that it is trivial to incorporate frequency or phase modulation. The output frequency can be modulated by varying $\delta$ accordingly, and phase modulation can be implemented by simply adding an offset to $\phi[n]$ directly.
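A sketch of both, where f_offset (a frequency offset in Hz) and phi_offset (a phase offset in radians) are hypothetical per-sample modulation inputs, not defined above:

% once per output sample:
delta = 2*pi * (f + f_offset) / f_s;   % frequency modulation: vary the phase increment
phi   = mod(phi + delta, 2*pi);
y     = sin(phi + phi_offset);         % phase modulation: add directly to the accumulated phase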