What is the limit on the amount of data that RapidMiner can handle?
Hi,
My dataset has 10530 examples and about 334 attributes. I am running RapidMiner on my desktop, but it keeps freezing. I would like to know whether this is due to my desktop's memory limit, or whether this amount of data is beyond RapidMiner's ability to handle.
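For context, a back-of-envelope estimate of the raw size of a dataset like this (a rough sketch only, assuming double-precision numeric values; RapidMiner's in-memory ExampleSet representation adds overhead on top of this):

```python
# Rough estimate of the raw memory footprint of the dataset
# described above (numbers taken from the question).
examples = 10530
attributes = 334
bytes_per_value = 8  # assuming every value is a double-precision number

raw_bytes = examples * attributes * bytes_per_value
raw_mb = raw_bytes / (1024 ** 2)
print(f"Raw data size: about {raw_mb:.1f} MB")  # about 26.8 MB
```

Even with representation overhead, a dataset this size should fit comfortably in the memory of a typical desktop, which suggests the freezing is more likely tied to the specific operator being run than to the data itself.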
Also, if RapidMiner crashes, is there any way to retrieve the partial results it has already processed?
Thanks a lot!
Xia
Answers
Without knowing how much memory is available and what your process is doing, it's hard to say.
As an example, I run a data table with 30+ million records, each row a timestamped, categorised set of four reals, in an SQL database on a dual quad-core Dell with 16 GB of memory, running Ubuntu 11.1. I do association rule mining on that database by using RapidMiner to convert my reals into binominals, and by passing those binominals to an extension that is purpose-built for association matching.
So I think the answer is that RM can handle the data, if you give it the resources; but you may not have the time to wait for the answers!
I am running my data through an SVM. It crashed after several hours without writing anything to the log file. I will try to avoid the large table output.
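The SVM may be the real bottleneck here rather than a hard data-size limit: kernel SVM training scales with the square of the number of examples. A rough sketch of the cost (assuming a dense, double-precision kernel matrix; not every SVM implementation materialises the full matrix, so this is an upper-bound illustration, not RapidMiner's actual behaviour):

```python
# A full kernel matrix stores one value for every PAIR of examples,
# so its size grows quadratically with the number of examples.
examples = 10530
bytes_per_value = 8  # assuming double precision

kernel_bytes = examples ** 2 * bytes_per_value
kernel_mb = kernel_bytes / (1024 ** 2)
print(f"Full kernel matrix: about {kernel_mb:.0f} MB")  # about 846 MB
```

So while the raw data is small, the intermediate structures an SVM can build are hundreds of times larger, which would explain hours of runtime and an eventual crash on a memory-constrained desktop.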
Are your operators producing output from selected nodes, given the huge amount of data processing involved, rather than having an issue with a single table?