"Binary2MultiClassLearner Bug?"

Quarrel Member Posts: 6 Contributor II
edited May 23 in Help
Hi,

After getting great service when I reported my KernelPCA bug (fixed in 4.5 it seems - thanks!), I thought I'd throw another out there that tripped me up before, and again tonight.

When using the meta learner Binary2MultiClassLearner, the inner learners can sometimes still complain about getting polynomial labels. Usually the first run of my XV will work, then the 2nd will fail - mostly this is with some other inner meta-learner (a booster / bagger etc).

So, taking the sample 24_Binary2MultiClassLearner.xml in RM 4.5, I made a small modification so that the learner is a BayesianBoosting->RuleLearner, like so:
<operator name="Root" class="Process" expanded="yes">
    <description text="#ylt#p#ygt#The meta learning schemes used in this setup is a binary to multi class converter. This allows the learning of model for polynominal data sets (i.e. for data sets with more than two classes) by learners supporting only binominal classes. #ylt#/p#ygt# "/>
    <operator name="ExampleSource" class="ExampleSource">
        <parameter key="attributes" value="../data/iris.aml"/>
    </operator>
    <operator name="XValidation" class="XValidation" expanded="yes">
        <operator name="Binary2MultiClassLearner" class="Binary2MultiClassLearner" expanded="yes">
            <operator name="BayesianBoosting" class="BayesianBoosting" expanded="yes">
                <operator name="RuleLearner" class="RuleLearner">
                    <parameter key="pureness" value="0.6"/>
                    <parameter key="minimal_prune_benefit" value="0.1"/>
                </operator>
            </operator>
        </operator>
        <operator name="OperatorChain" class="OperatorChain" expanded="yes">
            <operator name="ModelApplier" class="ModelApplier">
                <list key="application_parameters">
                </list>
            </operator>
            <operator name="ClassificationPerformance" class="ClassificationPerformance">
                <parameter key="classification_error" value="true"/>
                <list key="class_weights">
                </list>
            </operator>
        </operator>
    </operator>
</operator>
I get the following message (although not on the first call):

G Aug 1, 2009 12:02:50 AM: [Fatal] UserError occured in 7th application of BayesianBoosting (BayesianBoosting)
G Aug 1, 2009 12:02:50 AM: [Fatal] Process failed: This learning scheme does not have sufficient capabilities for the given data set: polynominal label not supported


Am I doing this correctly? Any help appreciated.

(To date I've actually been pre-processing my data to essentially do my own version of a 1-vs-all, but not having to go through a manual step would be great)
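
For reference, the manual 1-vs-all step mentioned above can be sketched roughly as follows. This is only a minimal illustration, not what Binary2MultiClassLearner does internally: the function names and the placeholder label `"other"` are made up for the example.

```python
def one_vs_all_labels(labels, positive, rest="other"):
    """Keep `positive` as-is and collapse every other class into `rest`.

    `rest` is a hypothetical placeholder name chosen for this sketch.
    """
    return [lab if lab == positive else rest for lab in labels]


def one_vs_all_problems(labels):
    """Build one two-class label column per class found in `labels`."""
    classes = sorted(set(labels))
    return {cls: one_vs_all_labels(labels, cls) for cls in classes}


# Tiny hand-written sample using the three iris class names.
iris_labels = ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-setosa"]
problems = one_vs_all_problems(iris_labels)

for cls, binary_labels in problems.items():
    print(cls, binary_labels)
```

Each resulting column has exactly two distinct values, so a learner that only supports binominal labels can be trained once per class.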


--Q

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,525   Unicorn
    Hi,
    that seems to be a very strange interaction between the Binary2MultiClassLearner and the BayesianBoosting operator. No other learner shows this strange behavior, where all examples in the x-th cross-validation run belong to a single class. I will try to solve this problem next week.

    Greetings,
      Sebastian
  • fischer Member Posts: 439  Guru
    Hi Quarrel,

    we fixed this. In fact, this was a problem in the boosting model, which added an unused "all_other_values" label value that then confused the learner in the next iteration.

    Best,
    Simon
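
A toy illustration of how an unused label value can trip a capability check is sketched below. It assumes the check counts the values declared for the label rather than the values actually present in the data; that assumption, the `NominalLabel` class, and the value names `"positive"` and `"negative"` are invented for this sketch and are not RapidMiner code.

```python
class NominalLabel:
    """Toy stand-in for a nominal label: declared values plus the data column."""

    def __init__(self, declared_values, column):
        self.declared_values = list(declared_values)
        self.column = list(column)

    def is_binominal_by_declaration(self):
        # Counts every declared value, even ones no example uses.
        return len(self.declared_values) == 2

    def is_binominal_by_data(self):
        # Counts only the values that actually occur in the column.
        return len(set(self.column)) == 2

    def drop_unused_values(self):
        # Remove declared values that never appear in the data.
        used = set(self.column)
        self.declared_values = [v for v in self.declared_values if v in used]


# Two classes occur in the data, but a third value is declared and never used.
label = NominalLabel(
    declared_values=["positive", "negative", "all_other_values"],
    column=["positive", "negative", "negative", "positive"],
)

before = label.is_binominal_by_declaration()  # False: three declared values
label.drop_unused_values()
after = label.is_binominal_by_declaration()   # True: only two remain
print(before, after)
```

Under that assumption, the data is a valid two-class problem throughout, yet the declaration-based check rejects it until the stale value is removed.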
  • Quarrel Member Posts: 6 Contributor II
    Fantastic, thanks very much.

    I've had another bug that may be related - I'm having a little trouble finding a simple example that demonstrates it, but will mention it now and can write it up further if you don't think it is related.

    When using Adaboosting with a RuleLearner, very similar to the setup shown above, the reported confidences SOMETIMES (i.e. not always) seem to be completely incorrect. The predicted label may be A, but then the confidences for A, B, C might be 0.04, 0.34, 0.62 respectively - seemingly indicating the predicted label should have been C.
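
A quick way to spot such rows in exported predictions might look like the sketch below. It assumes each row is available as a predicted label plus a mapping from class name to confidence; the function name and the row layout are hypothetical, not a RapidMiner export format.

```python
def inconsistent_predictions(rows):
    """Return indices of rows whose predicted label is not the top-confidence class.

    Each row is (predicted_label, {class_name: confidence}). On ties, the
    first class with the maximal confidence is treated as the top one.
    """
    flagged = []
    for i, (predicted, confidences) in enumerate(rows):
        best = max(confidences, key=confidences.get)
        if predicted != best:
            flagged.append(i)
    return flagged


rows = [
    ("A", {"A": 0.04, "B": 0.34, "C": 0.62}),  # the mismatch described above
    ("C", {"A": 0.10, "B": 0.20, "C": 0.70}),  # consistent row
]
flagged = inconsistent_predictions(rows)
print(flagged)
```

Running this over a full result set would give a small list of row indices to attach to a bug report.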

    Do you think the bug you just found could be related?

    (hmm.. re-reading this I realise that I'm confusing my bayes and my ada boosting - I think I've only experienced the confidence bug with adaboosting, not bayes... So perhaps this won't be helpful at all)


    --Q
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,525   Unicorn
    Hi,
    if you could find an example process reproducing this error, please post it. We will then be able to test if this is a bug or a desired feature :)

    Greetings,
      Sebastian

    PS: Have you already registered for the alpha test of the next version? You seem to be a great bug finder and, much more important, reporter :)
  • Quarrel Member Posts: 6 Contributor II
    Thanks - I'll try to find a decent test case. Unfortunately my own data files tend to be rather large, but I will eventually find a simple example.

    I'll go look for where to register now :) Didn't realise there was an alpha test program. I was just looking into checking out the src and rebuilding the jar. Might be simpler if I leave it to you guys who know the src base.


    --Q