Runtime process stopped

D_vidas Member Posts: 12 Contributor II
edited November 2018 in Help
Hi,

I have a question:
My computer has been running a process for 40 hours, and no out-of-memory warning has appeared. However, the runtime counter does not advance continuously: it stands still for about 3 hours and then updates, which is how I know the process has already been running for 40 hours.

Is the RapidMiner process still running?
Is it worth letting it run longer?


Thank you!  :)

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Can you see RapidMiner's System Monitor? What does the memory allocation line look like? If it is constantly at the maximum, then your process has probably run out of memory. If it is more ragged or has not even reached the maximum, then you probably simply need more time.

    On which kind of machine are you running your process? How much memory is allocated to RapidMiner? Which kind of data are you processing, and which operators are you using?

    Best regards,
    Marius
  • D_vidas Member Posts: 12 Contributor II
    Hi Marius,

    Thanks for the reply.  :)

    Yes, I can see the System Monitor. The memory allocation line is constantly at the maximum. When the time updates (it generally stands still, then advances for a few minutes), the allocation line decreases slightly. However, the numbers shown
    Max = 3.9 GB
    Total = 3.9 GB
    never change. They have been the same since the process started.

    My machine has 4 GB of memory and is 64-bit. I am running FP-Growth followed by Create Association Rules on 115,000 records with 26 attributes (which become 120 after conversion to binominal).

    At first, when I started the process, it worked normally (the time advanced steadily and the allocation line did not reach the maximum). However, for many hours now the time has only advanced in jumps (the process has been running for 4 days).


    Thanks for the help!  ;D
  • haddock Member Posts: 849 Maven
    Hi there,

    You may have too many attributes for the association rule operator, which takes the frequent itemsets found by FPGrowth and generates rules. If you look at the source code you'll see that it tries to generate powersets over which it will iterate. Naturally this only works for very small numbers of attributes. Here's a previous exchange on this very subject....

    http://rapid-i.com/rapidforum/index.php/topic,3619.0.html

    You could try putting a break after FPGrowth, if it gets to there successfully then your delay is probably caused by this powerset issue.
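
    To give a rough feel for why this only works for small itemsets, here is a minimal Python sketch. It is not RapidMiner code: the function name is hypothetical, and it simply assumes that every split of an itemset into a non-empty premise and a non-empty conclusion counts as one candidate rule.

```python
def candidate_rule_count(itemset_size: int) -> int:
    """Count the splits of one itemset into a non-empty premise and a
    non-empty conclusion (hypothetical helper, not a RapidMiner API)."""
    if itemset_size < 2:
        return 0  # a rule needs at least one item on each side
    # Every non-empty proper subset can serve as the premise;
    # the remaining items form the conclusion.
    return 2 ** itemset_size - 2


if __name__ == "__main__":
    for size in (2, 4, 8, 12, 16):
        print(f"itemset size {size:2d}: {candidate_rule_count(size)} candidate rules")
```

    The count roughly doubles with every extra item in an itemset, and that work is repeated for each frequent itemset of that size.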

    Best wishes
  • D_vidas Member Posts: 12 Contributor II
    Hello Haddock,

    Thanks for the reply ...  :)

    I read the link you sent, and this really does seem to be my problem.
    Everything up to and including FP-Growth runs normally and takes at most 8 minutes to generate the frequent itemsets. I put a breakpoint there, and FP-Growth generated:
    No. of Sets: 11603654
    Maximum Size: 16
    But when I add the Create Association Rules operator, the process cannot complete. I have tried other machines (with a bit more memory) and the same problem occurs.

    I saw from your link that the solution would be to increase min_sup or restrict Max_Items. The problem is that I cannot increase min_sup, because I am interested in one value of an attribute as the conclusion, and this value appears in only 20% of my data. I have already asked here in the forum, and I was told that the way to make an attribute the conclusion of the rule is the must_contain parameter, but it only works in FP-Growth and cannot be used in the Create Association Rules operator that follows. In my case I need to get the rules.
    As for the second alternative, I do not know how to restrict Max_Items (I do not know what value would be ideal).

    Given this discussion, I believe it is useless to let the process keep running for all these days, since I will not get a result. So the process can be aborted.

    Do you have another alternative for solving the problem?

    Thank you for your help!  ;)
  • haddock Member Posts: 849 Maven
    Hi there,

    If you increase the support floor the number of item sets will go down, but you could still end up with long item sets, and as we can see, the maths of powersets means that you may need to wait for a long time with itemsets of size sixteen.

    If on the other hand you limit the itemset size you can avoid this danger, but only at the expense of missing some possibly important pattern strings of longer length.

    In your position I'd set the itemset size low (3-4), and gradually build it up. Given the exponential growth you may get interesting stuff at 6, but never finish at 7. Let the data talk!
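
    As a back-of-the-envelope illustration of how fast the search space grows with the size limit, the sketch below counts every possible itemset of up to a given size drawn from 120 items (the number of binominal attributes mentioned earlier in this thread). This is only an upper bound under that assumption: the number of itemsets that are actually frequent depends on the data and on the support threshold. The function name is hypothetical.

```python
from math import comb


def max_itemsets(num_items: int, max_size: int) -> int:
    """Upper bound on the number of distinct itemsets containing between
    1 and max_size items (hypothetical helper for illustration only)."""
    return sum(comb(num_items, size) for size in range(1, max_size + 1))


if __name__ == "__main__":
    # 120 items is taken from the thread; the size limits are arbitrary.
    for limit in range(3, 8):
        print(f"size limit {limit}: at most {max_itemsets(120, limit)} itemsets")
```

    Raising the limit one step at a time and checking how long each run takes is a cheap way to find where the process stops being practical.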

    Best wishes,

    H