# FDistribution.java: mathematical explanation of the p-value calculation

Hello,

I am currently writing a thesis at my university about statistical methods and data mining, focusing on Student's t-test. I am new to statistics and do not know much about distributions or the calculation of p-values, so I need help understanding how the t-test in RapidMiner works.

I am using RapidMiner 5 to perform a t-test. Looking at the source code of the t-test operator, I found that the p-value is calculated in the file "FDistribution.java".

This is where I need help. I understand that for a two-sample Student's t-test with sample sizes m and n, the squared t-value equals the F-value of an F-distribution F(1, m+n-2), but beyond that I cannot follow the mathematical steps performed in FDistribution.java.
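To make clear what I mean by that relationship, here it is written out as I understand it. The last step, expressing the p-value through the regularized incomplete beta function $I_x$, is the standard textbook identity; whether FDistribution.java uses exactly this parametrization is my assumption.

$$
T \sim t_{\nu} \;\Longrightarrow\; T^2 \sim F(1, \nu), \qquad \nu = m + n - 2
$$

$$
p = P(|T| \ge |t|) = P(F \ge t^2) = I_{x}\!\left(\tfrac{\nu}{2}, \tfrac{1}{2}\right), \qquad x = \frac{\nu}{\nu + t^2}
$$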

Could someone take me through the code step by step and explain why the inverse beta function is used within the method "betaInverse"?

What exactly is the variable "beta" in there, and why is it calculated using the logarithm?
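My current guess, which may be wrong, is that "beta" holds the logarithm of the complete beta function, log B(p, q) = log Γ(p) + log Γ(q) - log Γ(p + q), and that the logarithm is used because Γ itself overflows a double very quickly. The sketch below is my own illustration, not RapidMiner code: the class and method names are made up, and it only handles arguments that are multiples of 0.5 (which is what degrees of freedom divided by two would give).

```java
// Hypothetical helper class, not part of RapidMiner.
class LogBetaSketch {

    // log Gamma(x) for x a positive multiple of 0.5,
    // built from Gamma(1) = 1, Gamma(1/2) = sqrt(pi) and Gamma(x + 1) = x * Gamma(x).
    static double logGammaHalfInteger(double x) {
        double twice = 2 * x;
        if (x <= 0 || twice != Math.rint(twice)) {
            throw new IllegalArgumentException("x must be a positive multiple of 0.5");
        }
        boolean isInteger = (twice % 2 == 0);
        double k = isInteger ? 1.0 : 0.5;
        double result = isInteger ? 0.0 : 0.5 * Math.log(Math.PI);
        // Summing logarithms instead of multiplying factors avoids overflow.
        for (; k < x; k += 1.0) {
            result += Math.log(k);
        }
        return result;
    }

    // log B(p, q) = log Gamma(p) + log Gamma(q) - log Gamma(p + q)
    static double logBeta(double p, double q) {
        return logGammaHalfInteger(p) + logGammaHalfInteger(q) - logGammaHalfInteger(p + q);
    }

    public static void main(String[] args) {
        // B(2, 3) = 1! * 2! / 4! = 1/12
        System.out.println(Math.exp(logBeta(2, 3)));
        // Gamma(200) = 199! does not fit into a double, but its logarithm does.
        System.out.println(logGammaHalfInteger(200));
    }
}
```

If that reading is right, the logarithm is just a numerical safeguard and the value is turned back with Math.exp at the very end.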

And after that, why exactly is this part needed?

    double psq = p + q;
    double cx = 1 - x1;
    double x2 = Double.NaN;
    double pp = Double.NaN;
    double qq = Double.NaN;
    boolean index;
    if (p < psq * x1) {
        x2 = cx;
        cx = x1;
        pp = q;
        qq = p;
        index = true;
    } else {
        x2 = x1;
        pp = p;
        qq = q;
        index = false;
    }
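My tentative reading of this branch is that it relies on the symmetry I_x(p, q) = 1 - I_{1-x}(q, p) of the regularized incomplete beta function: when x lies above p / (p + q), the arguments are swapped and, I assume, the `index` flag is used later to return 1 minus the result. The sketch below only checks that symmetry numerically. It is my own illustration, not RapidMiner code: the names are made up, it uses Simpson's rule instead of the series in FDistribution.java, and it only handles p, q >= 1.

```java
// Hypothetical demo class, not part of RapidMiner.
class BetaSymmetrySketch {

    // Integrand of the incomplete beta function.
    static double f(double t, double p, double q) {
        return Math.pow(t, p - 1) * Math.pow(1 - t, q - 1);
    }

    // Composite Simpson's rule for the integral of f over [0, upper].
    static double betaIntegral(double upper, double p, double q) {
        int n = 2000; // number of subintervals, must be even
        double h = upper / n;
        double sum = f(0, p, q) + f(upper, p, q);
        for (int i = 1; i < n; i++) {
            sum += (i % 2 == 0 ? 2 : 4) * f(i * h, p, q);
        }
        return sum * h / 3;
    }

    // Regularized incomplete beta function I_x(p, q), computed directly.
    static double regularized(double x, double p, double q) {
        return betaIntegral(x, p, q) / betaIntegral(1, p, q);
    }

    // Same quantity, but mirroring the branch from the snippet:
    // swap the arguments when p < (p + q) * x and undo the swap with 1 - result.
    static double viaSwap(double x, double p, double q) {
        if (p < (p + q) * x) {
            return 1 - regularized(1 - x, q, p);
        }
        return regularized(x, p, q);
    }

    public static void main(String[] args) {
        // Both lines should print (almost) the same number.
        System.out.println(regularized(0.7, 2, 3));
        System.out.println(viaSwap(0.7, 2, 3));
    }
}
```

If this is the right idea, I would still like to know why the swap condition is exactly p < (p + q) * x.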

And what exactly is this part of the code doing?

    while (temp > acu && temp > acu * betain) {
        term = term * temp * rx / (pp + ai);
        betain = betain + term;
        temp = Math.abs(term);
        if (temp > acu && temp > acu * betain) {
            ai++;
            ns--;
            if (ns >= 0) {
                temp = qq - ai;
                if (ns == 0)
                    rx = x2;
            } else {
                temp = psq;
                psq += 1;
            }
        }
    }

And at the end, where does this formula come from?

    betain *= Math.exp(pp * Math.log(x2) + (qq - 1) * Math.log(cx) - beta) / pp;
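The only part of this line I can work out myself is the algebra. Assuming `beta` holds log B(pp, qq) and `cx` equals 1 - x2 at this point (both are my assumptions), the factor that multiplies the accumulated sum is:

$$
\frac{\exp\bigl(pp \,\ln x_2 + (qq - 1)\,\ln(cx) - \ln B(pp, qq)\bigr)}{pp}
= \frac{x_2^{\,pp}\,(1 - x_2)^{\,qq - 1}}{pp \; B(pp, qq)}
$$

What I do not understand is why the sum built up in the loop has to be multiplied by exactly this factor.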

I just have absolutely no idea what is happening here, so I would really appreciate some help or an explanation. Alternatively, if someone could point me to books or other references that cover the mathematical relationships used here, I could read them myself and begin to understand how all of this works.

