Runtime process stopped
Hi,
I have a question:
My computer has been running a process for 40 hours, and no out-of-memory warning has popped up. However, the runtime counter does not advance continuously: it stands still for about 3 hours and then updates, which is how I know the process has already been running for 40 hours.
Is the RapidMiner process still running?
Is it worth leaving it running longer?
Thank you!
Answers
On which kind of machine are you running your process? How much memory is allocated to RapidMiner? Which kind of data are you processing, and which operators are you using?
Best regards,
Marius
Thanks for the reply.
I have been watching the system monitor. The memory allocation line stays constantly at the maximum. When the time updates (it usually stands still, then advances for a few minutes), the allocation line drops slightly. However, the numbers shown,
Max = 3.9 GB
Total = 3.9 GB
never change. They have been the same since the process started.
My machine is 64-bit with 4 GB of RAM. I am running FP-Growth followed by Create Association Rules on 115,000 records with 26 attributes (which become 120 after the transformation to binomial).
At first, when I started the process, it behaved normally (the time advanced steadily and the allocation line did not reach the maximum). For many hours now, however, the time has only advanced in jumps (the process has been running for 4 days).
Thanks for the help! ;D
You may have too many attributes for the association rule operator, which takes the frequent itemsets found by FPGrowth and generates rules. If you look at the source code you'll see that it tries to generate powersets over which it will iterate. Naturally this only works for very small numbers of attributes. Here's a previous exchange on this very subject....
http://rapid-i.com/rapidforum/index.php/topic,3619.0.html
You could try putting a breakpoint after FPGrowth. If the process gets that far successfully, then your delay is probably caused by this powerset issue.
Best wishes
Thanks for the reply.
I read the thread you linked, and this does look like my problem.
Up to and including FP-Growth everything runs normally; it takes at most 8 minutes to generate the frequent itemsets. I set a breakpoint, and FP-Growth generated:
No. of Sets: 11603654
Maximum Size: 16
But when I add the Create Association Rules operator, the process cannot complete. I have tried other machines (with a bit more memory) and the same problem occurs.
I saw in your link that the solution would be to increase min_sup or restrict Max_Items. The problem is that I cannot increase min_sup, because I am interested in one attribute value as the conclusion, and that value appears in only 20% of my data. I already asked here in the forum and was told that there is a way to make an attribute the conclusion of the rule, namely through must_contain. But that only works on FP-Growth and cannot be used in the Create Association Rules operator that follows, and in my case I need the rules.
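To illustrate what "one attribute value as the conclusion" buys in terms of work, here is a minimal Python sketch, outside RapidMiner and not a description of any RapidMiner operator. It assumes the frequent itemsets and their supports are already available as a dictionary; the function name, the item names, and the support values are all made up for illustration. With the conclusion fixed to a single item, each itemset containing that item yields exactly one rule, so no powerset needs to be enumerated.

```python
def rules_with_conclusion(itemsets, target, min_confidence):
    """Build only rules of the form premise -> {target}.

    itemsets: dict mapping frozenset of items to support (fraction 0..1).
    Hypothetical helper for illustration, not a RapidMiner API.
    """
    rules = []
    for itemset, support in itemsets.items():
        if target not in itemset or len(itemset) < 2:
            continue
        premise = itemset - {target}
        premise_support = itemsets.get(premise)
        if premise_support is None:
            continue  # premise was not mined, so confidence is unknown
        confidence = support / premise_support
        if confidence >= min_confidence:
            rules.append((premise, target, support, confidence))
    return rules


# Made-up supports; "t" plays the role of the value seen in 20% of the data.
ITEMSETS = {
    frozenset({"x"}): 0.5,
    frozenset({"y"}): 0.4,
    frozenset({"t"}): 0.2,
    frozenset({"x", "t"}): 0.15,
    frozenset({"y", "t"}): 0.05,
    frozenset({"x", "y"}): 0.3,
    frozenset({"x", "y", "t"}): 0.05,
}

for premise, target, support, confidence in rules_with_conclusion(ITEMSETS, "t", 0.25):
    print(sorted(premise), "->", target, f"support={support} confidence={confidence:.2f}")
```

The confidence of each rule is the support of the whole itemset divided by the support of the premise, which is why the premise itself must be among the mined itemsets.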
As for the second alternative, I do not know how to restrict Max_Items (I do not know what the ideal value would be).
Given this discussion, I believe it is pointless to let the process run for all these days; I will not get a result. So the process can be aborted.
Do you have another alternative for solving the problem?
Thank you for your help!
If you increase the support floor, the number of itemsets will go down, but you could still end up with long itemsets, and as we can see, the maths of powersets means that you may need to wait for a long time with itemsets of size sixteen.
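To put rough numbers on that, here is a small Python sketch. It assumes that every non-empty proper subset of an itemset is tried as a rule premise, with the remaining items as the conclusion; that is an illustrative assumption about the enumeration, not a reading of the RapidMiner source, and the function name is made up. Under that assumption an itemset of k items gives 2^k - 2 candidate rules.

```python
def candidate_rules(k: int) -> int:
    """Number of premise -> conclusion splits of one itemset with k items.

    Assumes every non-empty proper subset is tried as a premise, with the
    remaining items as the conclusion (an illustrative assumption).
    """
    if k < 2:
        return 0  # a rule needs at least one item on each side
    return 2 ** k - 2


for k in (3, 4, 6, 7, 16):
    print(f"itemset size {k:2d}: {candidate_rules(k):6d} candidate rules")
```

That count is for a single itemset, so it has to be multiplied across however many long itemsets FP-Growth returns.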
If on the other hand you limit the itemset size you can avoid this danger, but only at the expense of missing some possibly important pattern strings of longer length.
In your position I'd set the itemset size low (3-4), and gradually build it up. Given the exponential growth you may get interesting stuff at 6, but never finish on 7. Let the data talk!
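One way to picture "gradually build it up" is the following Python sketch on a toy data set. It is not RapidMiner code: the brute-force miner, the function names, the transactions, and the budget on the number of itemsets (used here as a stand-in for "does it still finish in reasonable time") are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Made-up toy transactions, purely for illustration.
TRANSACTIONS = [
    {"a", "b", "c", "d"},
    {"a", "b", "c"},
    {"a", "b", "d"},
    {"a", "c", "d"},
    {"b", "c", "d"},
    {"a", "b", "c", "d"},
]


def frequent_itemsets(transactions, min_support, max_items):
    """Brute-force miner for toy data: count every itemset up to max_items."""
    counts = Counter()
    for t in transactions:
        items = sorted(t)
        for k in range(1, min(max_items, len(items)) + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    n = len(transactions)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def grow_until(transactions, min_support, start, stop, budget):
    """Raise max_items step by step; stop once the itemset count exceeds budget."""
    results = {}
    for max_items in range(start, stop + 1):
        found = frequent_itemsets(transactions, min_support, max_items)
        results[max_items] = len(found)
        if len(found) > budget:
            break  # the next step would only be more expensive
    return results


for size, count in grow_until(TRANSACTIONS, 0.3, 1, 4, budget=12).items():
    print(f"max_items={size}: {count} frequent itemsets")
```

On real data the stopping criterion would more likely be wall-clock time per run, but the loop has the same shape: start small, look at what comes out, and only then raise the limit.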
Best wishes,
H