Discussion:
Can IDL calculate the confidence level about correlation
Lin Wang
2007-11-23 01:24:50 UTC
Dear All,
I already know how to calculate the critical Pearson correlation
coefficient (Rc) for a given confidence level (for example 0.01).
In this approach, I set a confidence level and check whether the
correlation coefficient R reaches it.
Now I wonder whether I can calculate the confidence level directly from
the correlation coefficient R, so that I can read off the confidence
level for a known R. This way seems a little lazy, but may be more
convenient.
Could anyone help?
Thank you!

Lin
Lin Wang
2007-11-23 01:51:28 UTC
I found some code that calculates the confidence level using the p-value
method (see below). It works well, but I usually use the Student's
t-test method. So can anyone help?

Thanks!

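r=correlate(x,y)
n=n_elements(x)   ; sample size
var=1/(n-3.0)     ; variance of Fisher's z
zvalue = 0.5*alog((1+r)/(1-r))/sqrt(var)   ; standardized Fisher z
; two-sided p-value; gauss_pdf is the cumulative standard normal
if (zvalue lt 0) then pvalue = 2*(gauss_pdf(zvalue)) $
else pvalue = 2*(1-gauss_pdf(zvalue))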
Yaswant Pradhan
2007-11-23 12:06:09 UTC
Not exactly what you want... but this might give you some idea.
;+
; PRO: sig_test
; Significance Test of correlation for a sample size = N
;
; N: Number of observations
; r: Correlation coefficient
; Pr: Probability that random noise could produce the result
;     (correlation) with N samples
;     Pr=ERFC(r*sqrt(N/2))
; ERFC: Complementary Error Function
; rsig: Value of r at which we have a 100*(1-limit)% chance that random
;     data would produce this result (r)
;     rsig=INVERF(limit)*sqrt(2/N)
;
; Interpretation:
; Any "r" value greater than "rsig" is significant at the "limit*100"%
; level
; Modification: Yaswant Pradhan 10/6/05
;-



PRO sig_test
r=0D
N=0D
Pr=0D
rsig=0D
i=0.99

input: read, r, N, prompt='Input correlation coefficient and number of samples [r, N]: '
if (r gt 1.0 or r lt -1.0) then begin
err=widget_message('r value should be between -1.0 and 1.0. Input r, N again!',/Error)
goto, input
endif

r=abs(r)
Pr= ERFC(r*sqrt(N/2.))

while (inverf(i)*sqrt(2./N) gt r) do begin
i=i-0.01
rsig=inverf(i)*sqrt(2./N)
endwhile

print,'Correlation Significance Test Result:'
print,'====================================='
print,'Correlation coefficient: ',r
print,'Number of samples: ',N
print,'Confidence Limit: '+string(9b)+string(round(i*100),format='(i2.2)')+'%'
print,'Probability (Pr): ',Pr
print,'====================================='
END
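
A note on the search loop above: if INVERF is the inverse error function (an assumption; it looks like a user-library routine rather than an IDL built-in), then setting rsig equal to r and solving rsig=INVERF(limit)*sqrt(2/N) for limit gives limit=ERF(r*sqrt(N/2)), which is simply 1-Pr. A minimal sketch of that closed form, with made-up example values for r and N:

```idl
; Sketch only: closed-form counterpart of the while loop in sig_test.
; r and N are example values (assumptions), not data from this thread.
r = 0.45D                        ; correlation coefficient
N = 30D                          ; number of samples
Pr   = erfc(abs(r)*sqrt(N/2.))   ; chance that random noise gives |r| this large
conf = erf(abs(r)*sqrt(N/2.))    ; confidence level, equal to 1 - Pr
print, 'Pr = ', Pr, '   Confidence (%) = ', 100*conf
```

Like the original routine, this relies on the large-sample normal approximation for r, so it is likely to be optimistic for small N.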
Vince Hradil
2007-11-23 21:02:01 UTC
Post by Lin Wang
I found a code which can calculate the confidence level using pvalue
method (see below). It works well, but usually I use the Student's t-
test method. So can anyone help?
Thanks!
r=correlate(x,y)
var=1/(n-3.0)
zvalue = 0.5*alog((1+r)/(1-r))/sqrt(var)
if (zvalue lt 0) then pvalue = 2*(gauss_pdf(zvalue)) $
else pvalue = 2*(1-gauss_pdf(zvalue))
I think THIS is the way to do it. The t-test is irrelevant.
Lin Wang
2007-11-24 01:20:38 UTC
Vince,

Yes, the p-value test can test the significance of correlations, but the
t-test is also widely used in meteorological studies, perhaps even more
widely than the p-value test, I think.

The probability density function for the correlation coefficient r
(when the true correlation is zero) is:
f(r)=gamma((n-1)/2)*(1-r*r)**(n/2-2)/sqrt(pi)/gamma((n-2)/2)

Set r=t/sqrt(n-2)/sqrt(1+t*t/(n-2)), and v=n-2, then

f(r)dr=(after some transformation)=[1/sqrt(v*pi)]*[gamma((v+1)/2)/gamma(v/2)]*[(1+t*t/v)**(-(v+1)/2)]*dt

This is the probability density function for the t distribution with v
degrees of freedom, so the significance of r can be evaluated by the
t-test.
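
The substitution above can be inverted to t=r*sqrt((n-2)/(1-r*r)), which gives the confidence level directly from r. Here is a minimal sketch of that idea; the function name corr_ttest_pvalue is made up for illustration, and x and y are assumed to be equal-length numeric vectors with abs(r) less than 1:

```idl
; Sketch only: two-sided significance of r from Student's t with n-2
; degrees of freedom. The function name is hypothetical.
function corr_ttest_pvalue, x, y
  n  = n_elements(x)
  r  = correlate(x, y)
  df = n - 2.0
  t  = r*sqrt(df/(1.0 - r*r))         ; inverse of r = t/sqrt(df + t^2)
  p  = 2.0*(1.0 - t_pdf(abs(t), df))  ; t_pdf is the cumulative t distribution
  return, p                           ; confidence level is 1 - p
end
```

For example, print, 100*(1.0 - corr_ttest_pvalue(x, y)) would show the confidence level in percent.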
Mike
2007-11-23 23:25:00 UTC
Be careful of correlation coefficients! Are you familiar with
"Anscombe correlation examples"? Interesting data set - should be
able to google it up, or find it in Amer Stat 27 (1973) 17.
Lin Wang
2007-11-24 01:28:16 UTC
Post by Mike
Be careful of correlation coefficients! Are you familiar with
"Anscombe correlation examples"? Interesting data set - should be
able to google it up, or find it in Amer Stat 27 (1973) 17.
Mike,

Thanks for your advice. I've found the example you mentioned on the
web via http://www.tufts.edu/~gdallal/anscombe.htm. It's really
interesting! I will be very careful with the results. Thanks again.

Lin
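
For anyone who wants to try this in IDL, here is a small sketch using two of Anscombe's four data sets as published in the 1973 paper. The plotting choices are only an illustration:

```idl
; Sketch only: two of Anscombe's four data sets (Amer. Stat. 27, 1973).
x  = [10., 8., 13., 9., 11., 14., 6., 4., 12., 7., 5.]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]
print, correlate(x, y1), correlate(x, y2)  ; nearly identical r
plot,  x, y1, psym=4                       ; roughly linear scatter
oplot, x, y2, psym=5                       ; clearly curved relationship
```

Both pairs give almost the same correlation coefficient even though only the first looks like a noisy straight line, which is why a significance level alone is not enough.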
Brian Larsen
2007-11-24 14:25:04 UTC
If you want to see this graphically I have some code here:
http://people.bu.edu/balarsen/Home/IDL/Entries/2007/11/5_Regression_with_confidence_bands.html
to display the regression fit and the confidence bands around the
fit. I just noticed there is no reference on the page; I will add one
when I am in the office next week.



Cheers,

Brian

--------------------------------------------------------------------------
Brian Larsen
Boston University
Center for Space Physics
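
For readers without access to the linked page, the following is a rough sketch of one common way to draw a 95% confidence band around a straight-line fit. It is not Brian's code; it assumes x and y are equal-length numeric vectors and uses the textbook formula for the confidence interval of the fitted mean:

```idl
; Sketch only (not the code from the linked page): 95% confidence band
; for a straight-line fit. x and y are assumed to be numeric vectors.
n     = n_elements(x)
coef  = linfit(x, y)                        ; coef[0] = intercept, coef[1] = slope
yfit  = coef[0] + coef[1]*x
s     = sqrt(total((y - yfit)^2)/(n - 2.))  ; residual standard error
xbar  = mean(x)
sxx   = total((x - xbar)^2)
tcrit = t_cvf(0.025, n - 2)                 ; upper 2.5% point of Student's t
half  = tcrit*s*sqrt(1./n + (x - xbar)^2/sxx)
idx   = sort(x)                             ; draw the curves left to right
plot,  x, y, psym=4
oplot, x[idx], yfit[idx]
oplot, x[idx], yfit[idx] + half[idx], linestyle=2
oplot, x[idx], yfit[idx] - half[idx], linestyle=2
```

The band is narrowest at the mean of x and widens toward the ends of the data range.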
