"R Extension VS RM Memory Sharing Problems"
I have spent a few weeks learning R in order to develop a simple IQR normalization script for use within RM. I can successfully apply this script to example sets that contain 60k examples with 765 attributes (350MB), but only ONCE. If I try to run the process again, R runs out of memory and complains:
"Oct 27, 2010 2:21:55 PM SEVERE: IQR Norm: Error: cannot allocate vector of size 360.7 Mb"
It seems that RM is allocating too much unused memory from the system (Windows 7 x64 Ultimate). During the first run there is enough free memory available to run the R script. On subsequent runs R cannot allocate memory because RM has used around 6.95GB of the 8GB installed, which leaves around 12MB free once the OS and other apps are taken into account.
So what we need is a way to easily control how much memory we allow RM to take up.
Or we need to somehow ENABLE active garbage collection, so that as soon as a process has finished executing, the closed results are freed from memory.
Here is the simple script:
memory.limit(8*4000)
y <- as.data.frame(apply(x,2,function(col) (col-median(col))*1.349/IQR(col)))
I increase memory requirements to begin, then run the script on each column of the example set, then print any warnings.
This works the first time, and then not again until I restart RM. The worst part is that even when I set RM to use a maximum of 4GB, it still commits more than 7GB!
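For reference, here is a minimal sketch of a variant that might lower the peak memory needed on the R side. It does not address how much memory RM itself holds. The one-liner above goes through apply(), which first coerces the whole data frame to a matrix and then builds a second full-size result before as.data.frame() converts it back. Looping over the columns should only need one extra column at a time. The function name iqr_normalize and the tiny data frame are made up for illustration; in the real process x would be the example set passed in by RM. The factor 1.349 is the IQR of a standard normal distribution, so the result is a robust z-score.

```r
# Hypothetical helper (not part of the original script): centers each column
# on its median and scales by IQR / 1.349, one column at a time.
iqr_normalize <- function(df) {
  for (j in seq_along(df)) {
    col <- df[[j]]
    df[[j]] <- (col - median(col)) * 1.349 / IQR(col)
  }
  df
}

# Small synthetic data frame standing in for the real example set.
x <- data.frame(a = c(1, 2, 3, 4, 5), b = c(10, 20, 30, 40, 50))
y <- iqr_normalize(x)

# Drop the input and ask R to collect garbage before returning to RM.
# This only frees memory inside the R session; it is an assumption that
# it helps with the allocation error, not a confirmed fix.
rm(x)
invisible(gc())
```

Like the original script, this assumes every column is numeric and has a non-zero IQR; a constant column would produce NaN or Inf values.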