FeatureIterator won't work with AverageBuilder?

Legacy User Member Posts: 0 Newbie
edited November 2018 in Help

The MultilabelIterator example (in the Meta folder) has an AverageBuilder at the end. It collects the PerformanceVectors created during the cross-validation runs and averages them into a single, overall performance metric.

But if I change the loop operator to a FeatureIterator, the AverageBuilder complains:

AverageBuilder: Missing input: AverageVector

To reproduce, start from the MultilabelIterator sample, add a FeatureIterator, and move the inner operator of the MultilabelIterator into the FeatureIterator. Then add a ChangeAttributeRole inside the FeatureIterator so that the attribute for the current iteration gets the label role. Finally, set the parameters. Here's the final version. Click "Tools | Validate" to see the error. If you run it with breakpoints, you can see it doing the right thing with the labels, but no performance output is created.

<operator name="Root" class="Process" expanded="yes">
    <operator name="MultipleLabelGenerator" class="MultipleLabelGenerator">
    </operator>
    <operator name="NoiseGenerator" class="NoiseGenerator">
        <list key="noise">
        </list>
    </operator>
    <operator name="FeatureIterator" class="FeatureIterator" expanded="yes">
        <parameter key="filter" value="label.*"/>
        <parameter key="work_on_special" value="true"/>
        <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
            <parameter key="name" value="%{loop_feature}"/>
            <parameter key="target_role" value="label"/>
        </operator>
        <operator name="XValidation" class="XValidation" expanded="yes">
            <parameter key="sampling_type" value="shuffled sampling"/>
            <operator name="DecisionTree" class="DecisionTree">
                <parameter key="minimal_size_for_split" value="10"/>
                <parameter key="minimal_leaf_size" value="5"/>
            </operator>
            <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                <operator name="ModelApplier" class="ModelApplier">
                    <list key="application_parameters">
                    </list>
                </operator>
                <operator name="Performance" class="Performance">
                </operator>
            </operator>
        </operator>
    </operator>
    <operator name="AverageBuilder" class="AverageBuilder">
    </operator>
</operator>

Answers

  • haddock Member Posts: 849 Maven
    Hi there!

    Looks like the feature iterator is munching up the performance vectors, and that RM knows it does, hence the validation errors. If you explicitly store them as they are made you can then restore them and the average builder will do its stuff, like this...
    <operator name="Root" class="Process" expanded="yes">
        <operator name="MultipleLabelGenerator" class="MultipleLabelGenerator">
        </operator>
        <operator name="NoiseGenerator" class="NoiseGenerator">
            <list key="noise">
            </list>
        </operator>
        <operator name="FeatureIterator" class="FeatureIterator" expanded="yes">
            <parameter key="filter" value="label.*"/>
            <parameter key="work_on_special" value="true"/>
            <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
                <parameter key="name" value="%{loop_feature}"/>
                <parameter key="target_role" value="label"/>
            </operator>
            <operator name="XValidation" class="XValidation" expanded="yes">
                <parameter key="average_performances_only" value="false"/>
                <parameter key="sampling_type" value="shuffled sampling"/>
                <operator name="DecisionTree" class="DecisionTree">
                    <parameter key="minimal_size_for_split" value="10"/>
                    <parameter key="minimal_leaf_size" value="5"/>
                </operator>
                <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                    <operator name="ModelApplier" class="ModelApplier">
                        <list key="application_parameters">
                        </list>
                    </operator>
                    <operator name="Performance" class="Performance">
                    </operator>
                    <operator name="IOStorer" class="IOStorer">
                        <parameter key="name" value="perf_%{a}"/>
                        <parameter key="io_object" value="PerformanceVector"/>
                        <parameter key="remove_from_process" value="false"/>
                    </operator>
                </operator>
            </operator>
        </operator>
        <operator name="IORetriever" class="IORetriever">
            <parameter key="name" value="perf_1"/>
            <parameter key="io_object" value="PerformanceVector"/>
        </operator>
        <operator name="IORetriever (2)" class="IORetriever">
            <parameter key="name" value="perf_2"/>
            <parameter key="io_object" value="PerformanceVector"/>
        </operator>
        <operator name="IORetriever (3)" class="IORetriever">
            <parameter key="name" value="perf_3"/>
            <parameter key="io_object" value="PerformanceVector"/>
        </operator>
        <operator name="AverageBuilder" class="AverageBuilder">
        </operator>
    </operator>