# [SOLVED] Using R in RapidMiner: how to use data from a dataset in a test in R

Hello everybody,

I am new on the forum. The posts about installing R for RapidMiner have already been very helpful, so thank you for that.

I am trying a simple analysis with R in RM to get myself started. Now I have run into some problems. I hope someone here can point me in the right direction.

I made the following R-script to do a paired t-test in RM:

x1 <- c(1,2,5,7,9,0)

x2 <- c(2,3,4,3,6,4)

mytest.t <- t.test(x1, x2, paired=T, alternative="less")

pvalue <- mytest.t$statistic

result <- as.data.frame(pvalue)

That works fine.

When I try to run the test on data from one of the databases, I get an error. In this script, inputdata[2] and inputdata[6] are numeric columns from the dataset, which are otherwise accessible in the script (e.g. result <- as.data.frame(x1) returns the column as the result).

The script giving the error:

x1 <- inputdata[2]

x2 <- inputdata[6]

mytest.t <- t.test(x1, x2, paired=T, alternative="less")

pvalue <- mytest.t$statistic

result <- as.data.frame(pvalue)

gives the following error:

Nov 14, 2012 2:38:13 PM INFO: Execute Script (R): Error in `[.data.frame`(y, yok) : undefined columns selected

Nov 14, 2012 2:38:13 PM INFO: Execute Script (R): In addition: Warning message:

Nov 14, 2012 2:38:13 PM INFO: Execute Script (R): package 'mlr' is not available (for R version 2.15.2)

Nov 14, 2012 2:38:13 PM INFO: Execute Script (R): Error: object 'result' not found

What am I doing wrong?

Thank you for reading, and thanks in advance for any help.

best regards,

Arjan



## Answers

Here's an example that might help.

http://rapidminernotes.blogspot.co.uk/2011/06/counting-clusters-part-r.html

regards

Andrew

In the end this was my R-script:

var1 <- inputdata$positionInfoSpeed

var2 <- inputdata$positionInfoSpeed2

mytest.t <- t.test(var1, var2, paired=T, alternative="less")

result_table <- sapply(mytest.t,unlist)

result <- as.data.frame(result_table)

thank you!
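For anyone landing here with the same error: the likely cause is that single-bracket indexing such as inputdata[2] returns a one-column data frame, not a numeric vector, and t.test then fails with "undefined columns selected". Using $ (as in the final script) or double brackets extracts the vector itself. Below is a minimal sketch of that difference. The inputdata frame, its column names and its values are invented stand-ins for whatever RapidMiner passes to the script, so columns 2 and 3 are used here where the question used 2 and 6. Note also that mytest.t$statistic holds the t statistic; the p-value is in mytest.t$p.value.

```r
# Stand-in for the data RapidMiner hands to the script.
# Column names and values are made up for illustration only.
inputdata <- data.frame(
  id     = 1:6,
  speed1 = c(1, 2, 5, 7, 9, 0),
  speed2 = c(2, 3, 4, 3, 6, 4)
)

# Single brackets keep the data.frame wrapper ...
class(inputdata[2])
# ... while double brackets (or $) return the underlying numeric vector.
class(inputdata[[2]])

x1 <- inputdata[[2]]   # equivalent to inputdata$speed1
x2 <- inputdata[[3]]   # equivalent to inputdata$speed2

mytest.t <- t.test(x1, x2, paired = TRUE, alternative = "less")

# Collect the scalar pieces of the test into a one-row data frame,
# which is the kind of object the script returns as 'result'.
result <- data.frame(
  statistic = unname(mytest.t$statistic),
  df        = unname(mytest.t$parameter),
  p.value   = mytest.t$p.value
)
print(result)
```

Selecting columns by name with $ is usually more robust than selecting by position, since the script keeps working if the column order in the dataset changes.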