Hi there,
I want to apply a specific kernel approach to estimating the CDF. I am interested in implementing the so-called Altman approach described in Section 2.3 of the following paper:
Xiao-Hua Zhou and Jaroslaw Harezlak, "Comparison of bandwidth selection methods for kernel smoothing of ROC curves", Statistics in Medicine 2002; 21:2045–2055 (DOI: 10.1002/sim.1156)
This seemed like an easy one. I built it carefully and checked it over and over again, but I am probably doing something wrong. The code that implements the approach is given below. I first generate some normally distributed data and then plot my estimate along with the true CDF to see how well it does. I use a large sample (n = 2000) for the check.
I noticed that if I replace the bandwidth used in the code below, which is the one given in the paper:
hx=(45/(-7*n*ps))^(0.3)
with this one:
hx=(45/(-7*n^2*ps))^(0.3)
everything works fine! However, this is not the bandwidth given in the paper. Can anyone see what is wrong?
I wish I could upload the paper too, since I guess special access is required, but I don't think I am allowed to because of copyright. I also don't know whether I can upload anything here at all. Can I?
close all
clear all
clc
n=2000;
L2=@(x) (x.^2./exp(x.^2/2) - 1./exp(x.^2./2)) .* 1/(sqrt(2*pi))
%The above is the second derivative of a standard normal density
sigma=0.9;mu=1.5;
g=(n)^(-0.3);
x=normrnd(mu,sigma,1,n);
[b a]=meshgrid(1:length(x),1:length(x));
SS=sum(sum(L2((x(a)-x(b))./(g))));
ps=(1/(n^2))*SS;
hx=(45/(-7*n*ps))^(0.3) %This is what the paper mentions
%hx=(45/(-7*n^2*ps))^(0.3) %This seems to work fine
W=@(t) 0*(t<-1)+(3/4.*t-1/4.*t.^3+1/2).*(abs(t)<=1)+1.*(t>1)
%The above is the integrated Epanechnikov kernel, i.e. the kernel CDF used to smooth the empirical CDF
Fhat=@(t) 1/(n)*sum(W((t-x)./hx)) % estimate of the cdf
%--Now draw the estimated CDF
kk=1;
gr=-4:0.1:4;
for i=gr
FF(kk)=Fhat(i);
kk=kk+1;
end
plot(gr,FF,'r')
hold on
%---And draw also the true CDF:
plot(gr,normcdf(gr,mu,sigma))
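
To put a number on "works fine", here is a rough sketch that computes both candidate bandwidths from the same sample and reports the largest absolute gap between the estimate and the true CDF on the plotting grid. It is only a diagnostic under my own assumptions, not something taken from the paper: the names h_paper, h_alt, err_paper and err_alt are made up for this check, n is reduced to 500 to keep the n-by-n matrix small, and randn and erfc stand in for normrnd and normcdf so that no toolbox is needed.

```matlab
% Sketch: compare the two candidate bandwidths by their sup-norm error on a grid.
rng(1);                                   % fixed seed, for repeatability only
n = 500; mu = 1.5; sigma = 0.9;           % smaller n than above (assumption, to save memory)
x = mu + sigma*randn(1, n);               % same model as normrnd(mu,sigma,1,n)

L2 = @(u) (u.^2 - 1) .* exp(-u.^2/2) / sqrt(2*pi);  % second derivative of the N(0,1) density
g  = n^(-0.3);                            % pilot bandwidth, as in the code above
D  = bsxfun(@minus, x.', x);              % all pairwise differences x_i - x_j
ps = sum(sum(L2(D/g))) / n^2;             % same ps as in the code above

h_paper = (45/(-7*n*ps))^0.3;             % bandwidth as I read it from the paper
h_alt   = (45/(-7*n^2*ps))^0.3;           % the variant that seems to work

W     = @(t) (3/4*t - 1/4*t.^3 + 1/2) .* (abs(t) <= 1) + (t > 1);  % integrated Epanechnikov kernel
Fhat  = @(t, h) mean(W(bsxfun(@minus, t(:), x) / h), 2).';        % kernel CDF estimate on a grid
Ftrue = @(t) 0.5*erfc(-(t - mu) / (sigma*sqrt(2)));               % exact normal CDF

gr = -4:0.1:4;
err_paper = max(abs(Fhat(gr, h_paper) - Ftrue(gr)));
err_alt   = max(abs(Fhat(gr, h_alt)   - Ftrue(gr)));
fprintf('h_paper = %.4f, sup error = %.4f\n', h_paper, err_paper);
fprintf('h_alt   = %.4f, sup error = %.4f\n', h_alt, err_alt);
fprintf('h_alt / h_paper = %.4f, g = %.4f\n', h_alt/h_paper, g);
```

One thing the last line makes visible: by plain arithmetic the two bandwidths differ by exactly a factor of n^(-0.3), which is the same value as the pilot bandwidth g. That may hint at a scale factor involving g that is missing somewhere in ps, but that is a guess on my part and I have not confirmed it against the paper.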