Out of memory while running SVM
Hello,
I am running SVM on data that has more than 28,000 rows and about 300 attributes. I have 1 GB of RAM and around 4 GB of swap on my Fedora 17 Linux box. I also tried the same process on another Fedora system that has 1 GB of RAM and 20 GB of swap, with the same result.
I am getting the following out-of-memory error:
"This process would need more than the maximum amount of available memory. You can either leave the process as it is and use a computer with more memory, reduce the amount of data by one of the sampling operators, optimize the process by using other learning or preprocessing schemes, or directly work on database systems, eg. by using the cached database example source operators."
I am thinking of using Radoop and have written to them requesting help. In the meantime, is there any way to solve the problem on the current hardware without reducing the data? The data is highly irregular, so k-NN models play an important role in out-of-sample prediction.
I am particularly interested in the last phrase of the message above, about the cached database example source operators. I have no idea what that means. Could anyone please point out which element of RapidMiner this refers to, and give an example of how to use it?
Before running SVM, I ran Random Forest on the data and encountered the same out-of-memory issue.
Any suggestions, hints, lessons learned, tutorials, or documentation on how to solve this problem and how to deal with large amounts of data would be very helpful, as I have so far worked only with relatively small amounts of data, using RapidMiner on my PC.
Thanks,
Ajay
Answers
Hi,
I have the same problem:
"This process would need more than the maximum amount of available memory. You can either leave the process as it is and use a computer with more memory, reduce the amount of data by one of the sampling operators, optimize the process by using other learning or preprocessing schemes."
Does anyone have any ideas?
I really appreciate any help you can provide.
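For a sense of scale, here is a rough back-of-envelope estimate for the dataset described in the original question (about 28,000 rows and 300 attributes). It is only a sketch under stated assumptions: values are taken to be stored as dense 8-byte doubles, and the second figure is what a full, uncached kernel matrix would need. Whether RapidMiner's SVM operator actually builds the full kernel matrix is not confirmed anywhere in this thread, so treat that number as an upper bound for illustration. The function names are made up for this example.

```python
# Back-of-envelope memory estimate (illustrative only).
# Assumptions: dense storage, 8 bytes per value (double precision).

def dense_matrix_bytes(rows: int, cols: int, bytes_per_value: int = 8) -> int:
    """Bytes needed to hold a dense rows x cols matrix."""
    return rows * cols * bytes_per_value


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to gibibytes (2**30 bytes)."""
    return n_bytes / 2**30


rows, attrs = 28_000, 300  # figures quoted in the question

# The example set itself: one value per row per attribute.
data_bytes = dense_matrix_bytes(rows, attrs)

# A full kernel matrix has one entry per pair of rows (hypothetical worst case).
kernel_bytes = dense_matrix_bytes(rows, rows)

print(f"data matrix:   {to_gib(data_bytes):.3f} GiB")
print(f"kernel matrix: {to_gib(kernel_bytes):.3f} GiB")
```

Under these assumptions the raw data is small (roughly 64 MiB), while a full kernel matrix would be several times larger than 1 GB of RAM (roughly 5.8 GiB). That would be consistent with the data loading fine but the learner running out of memory.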