What is the limit on the amount of data that RapidMiner can handle?

xiazhang3 Member Posts: 11 Contributor II
edited November 2018 in Help
Hi,

My dataset has 10530 examples and about 334 attributes. I am running RapidMiner on my desktop, but it keeps freezing. I would like to know whether this is due to my desktop's limited memory or whether this amount of data is beyond what RapidMiner can handle.

Also, if RapidMiner crashes, is there any way to retrieve the partial results it had already produced?

Thanks a lot!

Xia

Answers

  • xixirhwfy Member Posts: 9 Contributor II
    Hi, I have the same problem. If you find a solution, could you please let me know? Thanks a lot.
  • haddock Member Posts: 849 Maven
    Hi,

    If we don't know how much memory is available, and what the processes are, then it's hard to say.

    As an example, I run a data table with 30+ million records, each row timestamped, categorised, and holding four reals, in an SQL database, on a double quad Dell with 16 GB of memory, running Ubuntu 11.1. I do association rule mining on that database by using RapidMiner to convert my reals into binominals, and by passing those binominals to an extension that is purpose-built for association matching.

    So I think the answer is that RM can handle the data, if you give it the resources; but you may not have the time to wait for the answers!
  • xixirhwfy Member Posts: 9 Contributor II
    I solved my problem just now. The cause was not a memory limit but the size of the result table (my result has 6500*6500 tuples with 50 attributes). I disabled the operator that displays the large table and displayed only the other results, and the process then completed successfully. You may also be trying to display results in a very large table; if so, try disabling that output.
  • xiazhang3 Member Posts: 11 Contributor II
    Thanks a lot for your input!

    I am running my data through an SVM. It crashed after several hours without writing anything to the log file. I will try to avoid the large table output.
  • abehera1992 Member Posts: 3 Contributor I

    Given the large amount of data processing involved, are your operators producing output from the selected nodes, rather than the problem being confined to a single table?
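
To put the two table sizes mentioned in this thread side by side, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that every cell is held as an 8-byte floating-point value in a dense table; RapidMiner's actual in-memory layout may differ, and the function name `dense_table_bytes` is hypothetical, not part of any RapidMiner API.

```python
# Rough memory estimate for a dense numeric table.
# Assumption (not from RapidMiner's documentation): each cell costs 8 bytes.
BYTES_PER_CELL = 8


def dense_table_bytes(rows: int, columns: int, bytes_per_cell: int = BYTES_PER_CELL) -> int:
    """Return the bytes needed to hold rows x columns cells with no overhead."""
    return rows * columns * bytes_per_cell


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to gibibytes."""
    return num_bytes / 2**30


# Input data from the original question: 10530 examples, about 334 attributes.
input_bytes = dense_table_bytes(10_530, 334)

# Result table from the reply above: 6500*6500 tuples with 50 attributes.
result_bytes = dense_table_bytes(6_500 * 6_500, 50)

print(f"input table:  {to_gib(input_bytes):.3f} GiB")
print(f"result table: {to_gib(result_bytes):.3f} GiB")
```

Under that assumption the input data comes to well under 1 GiB, while the 6500*6500 result table would need more than 15 GiB, which is consistent with the observation above that the displayed result table, not the input data, was the bottleneck.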
