Monday, March 6, 2017

Can someone show the details of how to apply AIC for sinusoidal models to specific data?




NOTE: This is a question that another user has been trying (unsuccessfully) to ask. Because the multiple questions asking, essentially, the same thing have either been deleted by me (because they were duplicates of each other) or by the user, I have tried to ask the question in such a way that will help the user and is a good fit for the site.




Suppose I have a vector of data measurements, $x(n)$, and I want to fit a model to that data: $$ x(n) = \sum_{p=1}^{P} a_p \cos(2\pi f_p + \phi_p) + \epsilon(n) $$ where each of the $P$ sinusoids is parameterized by $(a_p, f_p, \phi_p)$ and $\epsilon$ is an i.i.d. Gaussian, white noise process with zero mean and variance $\sigma^2$.



My question is: How do we use AIC (or BIC or MDL) to estimate the model order, $P$ ?


The Wikipedia entry just says: $$ AIC = 2P - 2\ln(L) $$ where $L$ is the likelihood function.


How do I calculate $L$ given $x(n)$?



Answer



The PDF (likelihood function) of the multi-tone estimation problem is: $$ {\cal L}({\bf x}; {\bf a}, {\bf f},{\bf \phi}, {\sigma}^2) = \frac{1}{(2\pi \sigma^2)^{\frac{N}{2}}} \exp\left( - \frac{1}{2\sigma^2} \sum_{n=0}^{N-1} ( x(n) - \sum_{p=1}^{P} {a}_p \cos(2\pi {f}_p + {\phi}_p) )^2 \right) $$


The paper by Djuric quotes the AIC and MDL values as:


enter image description here


The code at the end attempts to:



  • Generate a 5-tone noisy signal.


  • Do the ML estimates of the amplitudes, frequencies and phases for a given value of $p$.

  • Find the (log) likelihood values for each value of $p$.

  • Calculate the AIC and MDL values from Djuric.


The plot for $p = 1 \ldots 60$ is:


enter image description here


Usually, the AIC over-estimates the model order. In this particular case, AIC gets 7 and MDL gets 5 (the correct value). The minima of each curve are plotted (as a o on the AIC curve and as a + on the MDL curve).


Djuric, P.M., "A model selection rule for sinusoids in white Gaussian noise," Signal Processing, IEEE Transactions on (Volume:44 , Issue: 7 ), pp. 1744 - 1751.




scilab CODE ONLY BELOW



P = 5;
f = [ 0.1 0.12 0.231 0.333 0.478634];
phi = [0.329847 5.90324 3.09834 4.907983 1.984];
a = [ 1 1 2 2 3];
sigma = 1;

N = 100;

x = cos(2*%pi*[0:N-1]'*f + ones(N,1)* phi )*a';


xn = x + sigma*rand(N,1,'normal');


function [L,LL] = likelihood(data,ahat,fhat,phihat,sigmahat)
N = length(data);
datahat = cos(2*%pi*[0:N-1]'*fhat + ones(N,1)* phihat )*ahat';
L = 1/(2*%pi*sigmahat*sigmahat)^(N/2) * exp(-1/(2*sigmahat*sigmahat)*sum((data - datahat).^2));
LL = - log(2*%pi*sigmahat*sigmahat)*N/2 - -1/(2*sigmahat*sigmahat)*sum((data - datahat).^2);
endfunction


function [a,m] = aic(p,L,N)
a = 3*p - log(L);
m = 3*p/2*log(N) - log(L);
endfunction

function [ahat,fhat,phihat,regr] = sin_estimate(data)
PAD = 10;
N = length(data);
DATA = fft([data; zeros((PAD-1)*N,1)],-1);
[mx,ix] = max(abs(DATA(1:N*PAD/2,1)));

ahat = mx/(N/2);
fhat = (ix-1)/N/PAD;
phihat = atan(imag(DATA(ix)),real(DATA(ix)));

regr = data - cos(2*%pi*[0:N-1]'*fhat + ones(N,1)* phihat )*ahat
endfunction

pValues = [1:60];
likeValues = [];
loglikeValues = [];

aicValues = [];
mdlValues = [];
for p=pValues,
signal = xn;
a_est = zeros(1,p);
f_est = zeros(1,p);
phi_est = zeros(1,p);
for pp=1:p
[a_est(pp),f_est(pp),phi_est(pp),signal] = sin_estimate(signal);
end


[l,ll] = likelihood(xn,a_est,f_est,phi_est,sigma);

likeValues = [likeValues l];
loglikeValues = [loglikeValues ll];

[aicv,mdlv] = aic(p,likeValues(p));
aicValues = [aicValues; aicv]
mdlValues = [mdlValues; mdlv]
end


clf;
plot(aicValues);
plot(mdlValues,'r');
title('AIC and MDL plots');

No comments:

Post a Comment

periodic trends - Comparing radii in lithium, beryllium, magnesium, aluminium and sodium ions

Apparently the of last four, $\ce{Mg^2+}$ is closest in radius to $\ce{Li+}$. Is this true, and if so, why would a whole larger shell ($\ce{...