"Association Rules, taking very long time"

zainab2013 Member Posts: 7 Contributor II
edited June 2019 in Help
Hi all,
I'm trying to run the association rules tutorial, but it takes a very long time and I still don't get any results.
I waited for several hours but got nothing.
My laptop is a Toshiba with a Core 2 Duo (Centrino) and 2.5 GB of RAM.
Do I need a laptop with much higher specifications, or is there another problem?

Answers

  • zainab2013 Member Posts: 7 Contributor II
    No answer?
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi,

    how much data are you processing? How many rows and columns do you pass into the Association Rules and/or FP-Growth operator?
    Due to the algorithm those operators need some time and memory on bigger datasets, but they can probably be tweaked a little to perform faster.

    As always, it would also be a good idea to post your process setup, as described in the link in my signature.

    Best regards,
    Marius
  • zainab2013 Member Posts: 7 Contributor II
    Actually, I have tried data sets of different sizes, ranging from 150 to 4000 instances, but I didn't get any results. The last time, I ran the tutorial process and waited more than 15 hours for it, to no avail.
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Can you please post your process setup?

    Best regards,
    Marius
  • zainab2013 Member Posts: 7 Contributor II
    Retrieve Iris dataset  ->  Discretize by Frequency    ->    Nominal to Binominal    ->    FP-Growth    ->    Create Association Rules
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Thank you. Which operator is actually consuming most of the time? Maybe it can be tweaked; I can have a look if you post your process setup as described below:

    Open your process in RapidMiner and open the XML view, which is usually located above the process view. If you can't find it, make sure that it is checked in the menu under View -> Show View.
    Copy the XML code from there and paste it into your forum post. Surround your pasted XML code with code tags as shown in the example below (omit the space before the slash in the second code-tag).

    Code:
    [code]your XML code here[ /code]
    You can also simply use the "#"-button above the input box in the forum.
  • zainab2013 Member Posts: 7 Contributor II
    Thanks a lot for your effort. Here's the XML code for the process :
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.0.002">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" breakpoints="after" class="retrieve" compatibility="6.0.002" expanded="true" height="60" name="Iris" width="90" x="45" y="120">
            <parameter key="repository_entry" value="//Samples/data/Iris"/>
          </operator>
          <operator activated="true" class="discretize_by_frequency" compatibility="6.0.002" expanded="true" height="94" name="Discretize by Frequency" width="90" x="179" y="120">
            <parameter key="number_of_bins" value="5"/>
            <parameter key="range_name_type" value="short"/>
          </operator>
          <operator activated="true" breakpoints="after" class="nominal_to_binominal" compatibility="6.0.002" expanded="true" height="94" name="Nominal to Binominal" width="90" x="313" y="120">
            <parameter key="transform_binominal" value="true"/>
            <parameter key="use_underscore_in_name" value="true"/>
          </operator>
          <operator activated="true" class="fp_growth" compatibility="6.0.002" expanded="true" height="76" name="FPGrowth" width="90" x="447" y="120">
            <parameter key="find_min_number_of_itemsets" value="false"/>
            <parameter key="min_number_of_itemsets" value="1"/>
            <parameter key="min_support" value="0.1"/>
          </operator>
          <operator activated="true" class="create_association_rules" compatibility="6.0.002" expanded="true" height="76" name="Create Association Rules" width="90" x="581" y="120"/>
          <connect from_op="Iris" from_port="output" to_op="Discretize by Frequency" to_port="example set input"/>
          <connect from_op="Discretize by Frequency" from_port="example set output" to_op="Nominal to Binominal" to_port="example set input"/>
          <connect from_op="Nominal to Binominal" from_port="example set output" to_op="FPGrowth" to_port="example set"/>
          <connect from_op="FPGrowth" from_port="frequent sets" to_op="Create Association Rules" to_port="item sets"/>
          <connect from_op="Create Association Rules" from_port="rules" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="90"/>
          <portSpacing port="sink_result 2" spacing="18"/>
        </process>
      </operator>
    </process>
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi,

    from your code I see that the minimum support in FPGrowth is set quite low (min_support is 0.1). That can produce a very large list of frequent item sets, which can exceed your memory limit, and creating the association rules is also more expensive on an excessive list of item sets. If you also use a low support in your production process, try increasing the minimum support.
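
    As a rough sketch of that change, this is the FPGrowth operator from your posted process with only the min_support parameter raised. The value 0.2 is purely an illustrative assumption, not a recommended setting; if it is set too high, no frequent item sets may be found at all, so it is probably best to raise it in small steps and watch the runtime and the number of item sets.

    ```xml
    <!-- Sketch only: same FPGrowth operator as in the posted process. -->
    <!-- Only min_support is changed; 0.2 is an assumed example value to be tuned for your data. -->
    <operator activated="true" class="fp_growth" compatibility="6.0.002" expanded="true" height="76" name="FPGrowth" width="90" x="447" y="120">
      <parameter key="find_min_number_of_itemsets" value="false"/>
      <parameter key="min_number_of_itemsets" value="1"/>
      <parameter key="min_support" value="0.2"/>
    </operator>
    ```

    You can paste this over the existing FPGrowth operator in the XML view, or simply change the min support parameter of the operator in the process view.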

    Best regards,
    Marius