
MultipleLabelIterator: how to specify positive/negative attribute values?

Legacy User Member Posts: 0 Newbie
edited November 2018 in Help
I'm using the MultipleLabelIterator, following the sample 07_Meta/05_MultipleLabelLearning.xml. However, I'm pulling my data from a database using DatabaseExampleSource. Then I apply ChangeAttributeRole operators to each attribute to make it of type 'label1', 'label2', and so on. The result looks like the sample dataset with 'positive' or 'negative' nominal features depending on whether each row exemplifies the given feature.

When run, RapidMiner fails on the AverageBuilder operator: "Cannot build average for different positive classes (positive/negative)."

Looking at the datasets I see that in the sample data, the Range for each label# feature is always "positive(##), negative(##)". In my dataset, I see that some features are listed as "negative(##), positive(##)".

It seems that RapidMiner is not matching the values 'positive' and 'negative' by name but is instead using their positions in the nominal mapping, and those positions are loaded in an inconsistent order.

Is there a way to tell RapidMiner which nominal value is the positive classname? Or another way to work around this error?

Thanks,
Gary

Answers

  • haddock Member Posts: 849 Maven
    The error is your own; check out the last entry here: http://rapid-i.com/rapidforum/index.php/topic,776.0.html

    Just to prove the point again: using the "classes" slot avoids the error.
    <operator name="Root" class="Process" expanded="yes">
        <operator name="MultipleLabelGenerator" class="MultipleLabelGenerator">
        </operator>
        <operator name="NoiseGenerator" class="NoiseGenerator">
            <list key="noise">
            </list>
        </operator>
        <operator name="DatabaseExampleSetWriter" class="DatabaseExampleSetWriter">
            <parameter key="database_system" value="Microsoft SQL Server (Microsoft)"/>
            <parameter key="database_url" value="jdbc:sqlserver://localhost:1433;databaseName=Tradestation"/>
            <parameter key="username" value="sa"/>
            <parameter key="password" value="wL8/6ZO7YrXKa8XgQd4v7g=="/>
            <parameter key="table_name" value="Table1"/>
            <parameter key="overwrite_mode" value="overwrite"/>
        </operator>
        <operator name="DatabaseExampleSource" class="DatabaseExampleSource">
            <parameter key="database_system" value="Microsoft SQL Server (Microsoft)"/>
            <parameter key="database_url" value="jdbc:sqlserver://localhost:1433;databaseName=Tradestation"/>
            <parameter key="username" value="sa"/>
            <parameter key="password" value="wL8/6ZO7YrXKa8XgQd4v7g=="/>
            <parameter key="table_name" value="Table1"/>
            <parameter key="label_attribute" value="label1"/>
            <!-- "classes" lists the nominal class values of the label attribute in a fixed order -->
            <parameter key="classes" value="positive negative"/>
        </operator>
        <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
            <parameter key="name" value="label2"/>
            <parameter key="target_role" value="label"/>
        </operator>
        <operator name="MultipleLabelIterator" class="MultipleLabelIterator" expanded="yes">
            <operator name="XValidation" class="XValidation" expanded="yes">
                <parameter key="sampling_type" value="shuffled sampling"/>
                <operator name="DecisionTree" class="DecisionTree">
                    <parameter key="minimal_size_for_split" value="10"/>
                    <parameter key="minimal_leaf_size" value="5"/>
                </operator>
                <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                    <operator name="ModelApplier" class="ModelApplier">
                        <list key="application_parameters">
                        </list>
                    </operator>
                    <operator name="Performance" class="Performance">
                    </operator>
                </operator>
            </operator>
        </operator>
        <operator name="AverageBuilder" class="AverageBuilder">
        </operator>
    </operator>
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hello Gary,

    In the latest developer version there is a new operator called "InternalBinominalRemapping", which can be used for exactly this, i.e. for defining the positive class. Until it is released (if you do not want to check it out and compile it yourself), you could replace the MultipleLabelIterator with a loop over all features: in each iteration, reload the data from the database, using the current feature (available as a macro) as the label column, and define the classes in the corresponding parameter of the DatabaseExampleSource.

    Cheers,
    Ingo
  • Legacy User Member Posts: 0 Newbie

    @haddock: Thanks, that's a good tip to know. It seems that it doesn't solve the problem in this case, however. Using the classes tag appears to affect only the attribute named as the 'label_attribute'. I can see the labels reverse in the Range column of the Data Table view as I pick different labels in the database source operator. But only one attribute at a time. I need to change several of them.

    @Ingo: I don't see the InternalBinominalRemapping operator in the community CVS repository I've downloaded and updated. Does it have a different name or is it in the Enterprise repository?

    Thanks!
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi again,

    As I said, this operator was introduced in the latest developer version, which is currently the branch "Wasat". Here is more information about CVS access:

    http://rapid-i.com/rapidforum/index.php/topic,294.0.html

    Cheers,
    Ingo
  • Legacy User Member Posts: 0 Newbie

    Thanks, Ingo. I have it working now.

    Notes for others:
    • The new operator didn't show up in RapidMiner after I switched to the Wasat branch and reran. I had to rerun 'ant copy-resources'.
    • The operator has a checkbox for 'apply to special attributes'. The original 'label' attribute and the new ones changed to 'label1', 'label2', etc. are special, so check this box.
    • The 'attributes' field in the operator takes a regular expression, so 'label\d+' (without quotes) works if your attributes are 'label1', 'label2', etc.
    Thanks,
    Gary
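
For anyone who wants to see the notes above as process XML, here is a rough sketch of how the operator might be configured. Only the operator name, the 'label\d+' regular expression, and the existence of an 'apply to special attributes' checkbox come from this thread. The parameter keys themselves (attributes, apply_to_special_features, negative_value, positive_value) are assumptions and may differ in your build, so check them against the operator's parameter panel before relying on this.

```xml
<!-- Hypothetical sketch: every parameter key below is an assumption, not confirmed in this thread -->
<operator name="InternalBinominalRemapping" class="InternalBinominalRemapping">
    <!-- regular expression selecting label1, label2, and so on -->
    <parameter key="attributes" value="label\d+"/>
    <!-- assumed key for the 'apply to special attributes' checkbox; label1, label2 are special -->
    <parameter key="apply_to_special_features" value="true"/>
    <!-- assumed keys that name the negative and the positive nominal value explicitly -->
    <parameter key="negative_value" value="negative"/>
    <parameter key="positive_value" value="positive"/>
</operator>
```

Presumably this would sit after the ChangeAttributeRole operators and before the MultipleLabelIterator, so that every label column has the same positive class by the time AverageBuilder combines the results.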