Nesting ValueSubgroupIterators

keith Member Posts: 157 Maven
edited November 2018 in Help
Is it possible to nest ValueSubgroupIterators inside one another so that you can cycle over the values of more than one nominal attribute? The operator info indicates that you can't combine multiple attributes within the definition of a single ValueSubgroupIterator, but it seemed that if the attributes were separated, each in its own nested ValueSubgroupIterator, it should work.

Here's an example of what I'm trying to do. I want to normalize the data once, model each subgroup separately, and then apply both the preprocessing model and the regression model to a new example set. The problem is that the preprocessing model disappears once the process reaches the second (nested) ValueSubgroupIterator, so when I try to use ModelGrouper to create a combined model, it fails.

<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="target_function" value="sum"/>
    </operator>
    <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
        <parameter key="attribute_name_regex" value="att1|att2"/>
        <parameter key="condition_class" value="attribute_name_filter"/>
        <operator name="BinDiscretization" class="BinDiscretization">
            <parameter key="number_of_bins" value="3"/>
        </operator>
    </operator>
    <operator name="Normalization" class="Normalization">
        <parameter key="return_preprocessing_model" value="true"/>
    </operator>
    <operator name="ValueSubgroupIterator" class="ValueSubgroupIterator" expanded="yes">
        <list key="attributes">
          <parameter key="att1" value="all"/>
        </list>
        <operator name="ValueSubgroupIterator (2)" class="ValueSubgroupIterator" expanded="yes">
            <list key="attributes">
              <parameter key="att2" value="all"/>
            </list>
            <operator name="W-LinearRegression" class="W-LinearRegression">
                <parameter key="keep_example_set" value="true"/>
            </operator>
            <operator name="ModelGrouper" class="ModelGrouper">
            </operator>
            <operator name="ExampleSetGenerator (2)" class="ExampleSetGenerator">
                <parameter key="target_function" value="sum"/>
            </operator>
            <operator name="ModelApplier" class="ModelApplier">
                <list key="application_parameters">
                </list>
                <parameter key="keep_model" value="true"/>
            </operator>
        </operator>
    </operator>
</operator>
Thanks for any ideas,
Keith

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Keith,
    The problem is that models are not passed to the inner operators of a ValueSubgroupIterator. One solution is to write the model to a file and reload it inside the ValueSubgroupIterator.
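    A minimal sketch of the file-based approach, assuming the ModelWriter and ModelLoader operators with their model_file parameter. The file name preprocessing.mod is only a placeholder, and the fragment has not been run against your process:
    <operator name="Normalization" class="Normalization">
        <parameter key="return_preprocessing_model" value="true"/>
    </operator>
    <!-- Save the preprocessing model before entering the iterators -->
    <operator name="ModelWriter" class="ModelWriter">
        <parameter key="model_file" value="preprocessing.mod"/>
    </operator>
    <operator name="ValueSubgroupIterator" class="ValueSubgroupIterator" expanded="yes">
        <list key="attributes">
          <parameter key="att1" value="all"/>
        </list>
        <operator name="ValueSubgroupIterator (2)" class="ValueSubgroupIterator" expanded="yes">
            <list key="attributes">
              <parameter key="att2" value="all"/>
            </list>
            <!-- Reload the saved model so it is available at this nesting level -->
            <operator name="ModelLoader" class="ModelLoader">
                <parameter key="model_file" value="preprocessing.mod"/>
            </operator>
            <operator name="W-LinearRegression" class="W-LinearRegression">
                <parameter key="keep_example_set" value="true"/>
            </operator>
            <operator name="ModelGrouper" class="ModelGrouper">
            </operator>
        </operator>
    </operator>
    The model is read from disk once per subgroup, which should be cheap for a normalization model.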
    Another possibility could be to do it like this:
    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="target_function" value="sum"/>
        </operator>
        <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
            <parameter key="attribute_name_regex" value="att1|att2"/>
            <parameter key="condition_class" value="attribute_name_filter"/>
            <parameter key="deliver_inner_results" value="true"/>
            <operator name="BinDiscretization" class="BinDiscretization">
                <parameter key="number_of_bins" value="3"/>
                <parameter key="range_name_type" value="short"/>
                <parameter key="use_long_range_names" value="false"/>
            </operator>
        </operator>
        <operator name="Normalization" class="Normalization">
        </operator>
        <operator name="ValueSubgroupIterator" class="ValueSubgroupIterator" expanded="yes">
            <list key="attributes">
              <parameter key="att1" value="all"/>
            </list>
            <operator name="ValueSubgroupIterator (2)" class="ValueSubgroupIterator" expanded="yes">
                <list key="attributes">
                  <parameter key="att2" value="all"/>
                </list>
                <operator name="XValidation" class="XValidation" expanded="yes">
                    <parameter key="leave_one_out" value="true"/>
                    <operator name="W-LinearRegression" class="W-LinearRegression">
                        <parameter key="keep_example_set" value="true"/>
                    </operator>
                    <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                        <operator name="ModelApplier" class="ModelApplier">
                            <list key="application_parameters">
                            </list>
                            <parameter key="keep_model" value="true"/>
                        </operator>
                        <operator name="RegressionPerformance" class="RegressionPerformance">
                            <parameter key="root_mean_squared_error" value="true"/>
                        </operator>
                    </operator>
                </operator>
            </operator>
        </operator>
    </operator>
    Whether this is appropriate depends on the task at hand.


    Greetings,
      Sebastian
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    Just a small side note: in the upcoming release (4.3) you will also be able to use the new operators "IOStorer" and "IORetrieval" instead of writing things to files. These operators store (and retrieve) arbitrary objects at arbitrary points of the process under a specified name. This was actually an idea Steffen gave us some time ago, and it really extends what processes can do in RapidMiner. Version 4.3 will be released tomorrow or, at the latest, during the weekend.
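
    A rough sketch of how storing and retrieving the preprocessing model might look. Everything specific here is an assumption to check against the 4.3 operator reference: the class name IORetriever (the released name of the retrieval operator may differ), the parameter keys name and io_object, and the stored name prep_model, which is just a placeholder:
    <!-- Directly after Normalization: keep the preprocessing model under a name -->
    <operator name="IOStorer" class="IOStorer">
        <parameter key="name" value="prep_model"/>
        <parameter key="io_object" value="Model"/>
    </operator>
    <!-- Inside the inner ValueSubgroupIterator: fetch the model back by that name -->
    <operator name="IORetriever" class="IORetriever">
        <parameter key="name" value="prep_model"/>
        <parameter key="io_object" value="Model"/>
    </operator>
    The rest of the process would stay as in the original post, with the retrieval operator placed before ModelGrouper.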

    Cheers,
    Ingo