No way to free memory?

jmichiels123 Member Posts: 3 Contributor I
edited November 2018 in Help
Hello,

Currently my setup is

Read arff -> Nominal to Binominal -> FPGrowth

After one run my memory is full (6 GB RAM), but I get a clean result. If I run the process again, I soon get an OutOfMemory exception.
Now my question is: how can I clear the memory after a process execution? I don't want to restart RM every time.


Second question:

In FPGrowth I only want itemsets that contain attributes with the prefix "proc". When I type proc[A-Za-z0-9]+ in the "only include" bar, I don't get what I want: I still get a lot of itemsets that don't include those items. In fact, I only want association rules with those items in the consequent. How can I do that?

Thank you in advance

Answers

  • haddock Member Posts: 849 Maven
    Hi there,

    I repetitively grind frequent item sets in a loop, and keep the memory under control by 'materialising' the data (Materialize Data), doing the jobbies and then explicitly scrubbing memory (Free Memory), like this...
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.003">
      <context>
        <input>
          <location>FX Av(C) Weekly</location>
          <location>Pre-Processed FX</location>
        </input>
        <output>
          <location>PnP</location>
        </output>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.2.003" expanded="true" name="Process">
        <process expanded="true" height="362" width="768">
          <operator activated="true" class="set_macro" compatibility="5.2.003" expanded="true" height="76" name="Set Macro" width="90" x="179" y="165">
            <description>This is the root string that will be matched against each pair. Leave as "." for all pairs.("."==.)</description>
            <parameter key="macro" value="sym"/>
            <parameter key="value" value="."/>
          </operator>
          <operator activated="true" class="loop_attributes" compatibility="5.2.003" expanded="true" height="60" name="Forall Symbols" width="90" x="313" y="165">
            <parameter key="attribute_filter_type" value="regular_expression"/>
            <parameter key="regular_expression" value=".*%{sym}.*"/>
            <parameter key="use_except_expression" value="true"/>
            <parameter key="except_regular_expression" value="DateTime|NEXT"/>
            <parameter key="value_type" value="polynominal"/>
            <process expanded="true" height="347" width="821">
              <operator activated="true" class="materialize_data" compatibility="5.2.003" expanded="true" height="76" name="Materialize Data" width="90" x="112" y="30"/>
              <operator activated="true" class="paracuda:Peek" compatibility="1.1.001" expanded="true" height="60" name="Peek" width="90" x="246" y="30">
                <parameter key="source_att" value="%{loop_attribute}"/>
              </operator>
              <operator activated="true" class="loop_values" compatibility="5.2.003" expanded="true" height="76" name="Foreach  Period" width="90" x="380" y="30">
                <parameter key="attribute" value="DateTime"/>
                <process expanded="true" height="237" width="500">
                  <operator activated="true" class="filter_examples" compatibility="5.2.003" expanded="true" height="76" name="Filter timeslots" width="90" x="112" y="30">
                    <parameter key="condition_class" value="attribute_value_filter"/>
                    <parameter key="parameter_string" value="DateTime=%{loop_value}"/>
                  </operator>
                  <operator activated="true" class="work_on_subset" compatibility="5.2.003" expanded="true" height="76" name="Using Cues" width="90" x="246" y="30">
                    <parameter key="attribute_filter_type" value="value_type"/>
                    <parameter key="value_type" value="binominal"/>
                    <process expanded="true" height="371" width="567">
                      <operator activated="true" class="paracuda:Associator" compatibility="1.1.001" expanded="true" height="76" name="Associator" width="90" x="112" y="30"/>
                      <operator activated="true" class="log" compatibility="5.2.003" expanded="true" height="76" name="Store Result" width="90" x="313" y="30">
                        <list key="log">
                          <parameter key="Period" value="operator.Foreach  Period.value.current_value"/>
                          <parameter key="Symbol" value="operator.Forall Symbols.value.feature_name"/>
                          <parameter key="Label_Sets" value="operator.Associator.value.bean_count"/>
                        </list>
                      </operator>
                      <connect from_port="exampleSet" to_op="Associator" to_port="example set"/>
                      <connect from_op="Associator" from_port="example set" to_op="Store Result" to_port="through 1"/>
                      <connect from_op="Store Result" from_port="through 1" to_port="example set"/>
                      <portSpacing port="source_exampleSet" spacing="0"/>
                      <portSpacing port="sink_example set" spacing="0"/>
                      <portSpacing port="sink_through 1" spacing="0"/>
                    </process>
                  </operator>
                  <connect from_port="example set" to_op="Filter timeslots" to_port="example set input"/>
                  <connect from_op="Filter timeslots" from_port="example set output" to_op="Using Cues" to_port="example set"/>
                  <connect from_op="Using Cues" from_port="example set" to_port="out 1"/>
                  <portSpacing port="source_example set" spacing="0"/>
                  <portSpacing port="sink_out 1" spacing="0"/>
                  <portSpacing port="sink_out 2" spacing="0"/>
                </process>
              </operator>
              <operator activated="true" class="free_memory" compatibility="5.2.003" expanded="true" height="76" name="Free Memory" width="90" x="581" y="30"/>
              <connect from_port="example set" to_op="Materialize Data" to_port="example set input"/>
              <connect from_op="Materialize Data" from_port="example set output" to_op="Peek" to_port="example set"/>
              <connect from_op="Peek" from_port="example set" to_op="Foreach  Period" to_port="example set"/>
              <connect from_op="Foreach  Period" from_port="out 1" to_op="Free Memory" to_port="through 1"/>
              <portSpacing port="source_example set" spacing="0"/>
              <portSpacing port="sink_example set" spacing="108"/>
            </process>
          </operator>
          <operator activated="true" class="subprocess" compatibility="5.2.003" expanded="true" height="76" name="Av.Wk. Cues" width="90" x="447" y="165">
            <process expanded="true" height="269" width="748">
              <operator activated="true" class="log_to_data" compatibility="5.2.003" expanded="true" height="94" name="Convert Log" width="90" x="112" y="30"/>
              <operator activated="true" class="pivot" compatibility="5.2.003" expanded="true" height="76" name="Table View" width="90" x="246" y="30">
                <parameter key="group_attribute" value="Period"/>
                <parameter key="index_attribute" value="Symbol"/>
                <parameter key="consider_weights" value="false"/>
                <parameter key="skip_constant_attributes" value="false"/>
              </operator>
              <operator activated="true" class="guess_types" compatibility="5.2.003" expanded="true" height="76" name="Guess Types" width="90" x="380" y="30">
                <parameter key="attribute_filter_type" value="regular_expression"/>
                <parameter key="regular_expression" value="Period|Label.*"/>
              </operator>
              <operator activated="true" class="rename_by_replacing" compatibility="5.2.003" expanded="true" height="76" name="Rename by Replacing" width="90" x="514" y="30">
                <parameter key="replace_what" value="\W+"/>
              </operator>
              <connect from_port="in 1" to_op="Convert Log" to_port="through 1"/>
              <connect from_op="Convert Log" from_port="exampleSet" to_op="Table View" to_port="example set input"/>
              <connect from_op="Table View" from_port="example set output" to_op="Guess Types" to_port="example set input"/>
              <connect from_op="Guess Types" from_port="example set output" to_op="Rename by Replacing" to_port="example set input"/>
              <connect from_op="Rename by Replacing" from_port="example set output" to_port="out 1"/>
              <portSpacing port="source_in 1" spacing="0"/>
              <portSpacing port="source_in 2" spacing="0"/>
              <portSpacing port="sink_out 1" spacing="0"/>
              <portSpacing port="sink_out 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="join" compatibility="5.2.003" expanded="true" height="76" name="Cues V Closes" width="90" x="514" y="30">
            <parameter key="remove_double_attributes" value="false"/>
            <parameter key="use_id_attribute_as_key" value="false"/>
            <list key="key_attributes">
              <parameter key="Period" value="Period"/>
            </list>
          </operator>
          <connect from_port="input 1" to_op="Cues V Closes" to_port="left"/>
          <connect from_port="input 2" to_op="Set Macro" to_port="through 1"/>
          <connect from_op="Set Macro" from_port="through 1" to_op="Forall Symbols" to_port="example set"/>
          <connect from_op="Forall Symbols" from_port="example set" to_op="Av.Wk. Cues" to_port="in 1"/>
          <connect from_op="Av.Wk. Cues" from_port="out 1" to_op="Cues V Closes" to_port="right"/>
          <connect from_op="Cues V Closes" from_port="join" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="source_input 3" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
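    Stripped down to the setup in the question, the same idea might look roughly like the sketch below. This is only an assumption-laden outline, not a tested process: the operator keys read_arff, nominal_to_binominal and fp_growth and their port names are what I would expect in 5.2, the file name data.arff is a placeholder, and the layout attributes are left out, so check each operator against your own installation.

    ```xml
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.003">
      <operator activated="true" class="process" compatibility="5.2.003" expanded="true" name="Process">
        <process expanded="true">
          <!-- Assumed operator key; "data.arff" is a placeholder path. -->
          <operator activated="true" class="read_arff" compatibility="5.2.003" expanded="true" name="Read ARFF">
            <parameter key="data_file" value="data.arff"/>
          </operator>
          <!-- Assumed operator key for the Nominal to Binominal step. -->
          <operator activated="true" class="nominal_to_binominal" compatibility="5.2.003" expanded="true" name="Nominal to Binominal"/>
          <!-- Same operator as in the process above: materialise the data before mining. -->
          <operator activated="true" class="materialize_data" compatibility="5.2.003" expanded="true" name="Materialize Data"/>
          <!-- Assumed operator key and port names for FP-Growth. -->
          <operator activated="true" class="fp_growth" compatibility="5.2.003" expanded="true" name="FP-Growth"/>
          <!-- Same operator as in the process above: scrub memory once the mining is done. -->
          <operator activated="true" class="free_memory" compatibility="5.2.003" expanded="true" name="Free Memory"/>
          <connect from_op="Read ARFF" from_port="output" to_op="Nominal to Binominal" to_port="example set input"/>
          <connect from_op="Nominal to Binominal" from_port="example set output" to_op="Materialize Data" to_port="example set input"/>
          <connect from_op="Materialize Data" from_port="example set output" to_op="FP-Growth" to_port="example set"/>
          <connect from_op="FP-Growth" from_port="frequent sets" to_op="Free Memory" to_port="through 1"/>
          <connect from_op="Free Memory" from_port="through 1" to_port="result 1"/>
        </process>
      </operator>
    </process>
    ```

    Free Memory sits last so that it only runs after FP-Growth has finished, and the frequent sets are passed through it to the result port. How much memory actually comes back will depend on what is still being held in the results view.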
    I think you are right that regex pattern matching does not apply in the FPGrowth 'must contain' parameter. You may well have to roll your own extension for that, as I did for FIS grinding on a CUDA card. What a treat!

    Have fun!
