On 07/08/2013 10:33 AM, Daniel J Sebald wrote:
On 07/08/2013 12:29 PM, Daniel J Sebald wrote:
> On 07/08/2013 11:38 AM, Rik wrote:
>> 7/8/13
>>
>> Dr. Klein,
>>
>> I think this is actually a much easier problem to solve that it first
>> appeared. In the the file binocdf.m the formula used to calculate the
>> CDF is
>>
>> cdf(k) = 1 - betainc (p, tmp + 1, n - tmp);
>>
>> According to Wikipedia
>> (http://en.wikipedia.org/wiki/Binomial_distribution) the CDF for the
>> binomial distribution is
>>
>> \textstyle I_{1-p}(n - k, 1 + k)
>>
>> or
>>
>> I(1-p, n-k, 1+k)
>>
>> So it appears that we simply have the arguments wrong to the betainc
>> function.
>
> But it is also true that
>
> \textstyle I_{x}(a, b) = I_{1-x}(b, a)
Oops, make that
I_{x}(a,b) = 1 - I_{1-x}(b,a)
Dan
>
http://en.wikipedia.org/wiki/Regularized_incomplete_beta_function#Incomplete_beta_function
>
>
> In other words, the two expressions you are comparing are mathematically
> equivalent. Where is the discrepancy then? Numerical issues?
Yes, there is a mixing of large and small values such that the small
value is swamped by the limited precision of the double type.
The example binocdf (1,50, 0.999) translates to
1 - betainc (.999, 2, 49) => 1 - 1 => 0
If we don't use the mathematical identity to swap A and B in the inputs
to betainc then the expression is
betainc (.001, 49, 2) => 4.99510000000012e-146
which is correct.
It turns out that the Fortran code in xdbetai.f is making use of the
same identity. Check out this code at the start of the function which
swaps A and B and reverses x to 1-x.
Y = X
P = PIN
Q = QIN
IF (Q.LE.P .AND. X.LT.0.8D0) GO TO 20
IF (X.LT.0.2D0) GO TO 20
Y = 1.0D0 - Y
P = QIN
Q = PIN
At the end of the function it sorts things back out again
70 IF (Y.NE.X .OR. P.NE.PIN) DBETAI = 1.0D0 - DBETAI
So, when betainc is called as betainc (.999, 2, 49) it reverses the
arguments and eventually reaches line 70 where it takes 1.0 - 4.995e-146
which equals 1.