Can I do that?

Jorge Member Posts: 19 Maven
edited November 2018 in Help

I'm trying to start a new "project", but I have some questions that I don't know how to solve...

I have 5 attributes with nominal data (4 or 5 possible values each), and my "utopia" is...
- Assign a weight to each attribute and create a model with a learning operator. After training, this model should predict one of 3 different values (Fast, mid, slow) when I give it a combination of attribute values.

I tried to use the NaiveBayes algorithm, but I can't assign weights with it.
I thought about tree learning operators, but they don't use ALL the attributes (only 1 or 2)...

Any advice you can give me is welcome :-P

PS: Sorry for my bad English :-(



  • steffen Member Posts: 347 Maven
    Hello Jorge

    1. Assigning weights is possible, e.g. via InteractiveAttributeWeighting. Check out Preprocess->Attributes->Weighting in the operator tree.

    2. W-NaiveBayesUpdateable is a Naive Bayes algorithm from the Weka package and can handle weights (as far as I can see).


  • Jorge Member Posts: 19 Maven
    Thanks for your fast reply  :)

    I tried W-NaiveBayesUpdateable, but when I click Validate, this message appears in the console:

    G Feb 12, 2009 10:33:24 AM: [Warning] W-NaiveBayesUpdateable: W-NaiveBayesUpdateable: Deprecated: please use NaiveBayes instead.
    G Feb 12, 2009 10:33:24 AM: [Warning] Deprecations: 1 usage of deprecated operators.

    Could this cause problems in future versions of RapidMiner? Is NaiveBayes the better algorithm for this?

    InteractiveAttributeWeighting is perfect!! Thanks, but is there any preprocessing operator that lets me assign a different weight to the label values?
    E.g. fast is better than mid, and mid is better than slow (or can I only do that in the training stage?)

    Thanks again, you're helping me a lot :)

  • steffen Member Posts: 347 Maven
    Hello Jorge

    @Naive Bayes
    I guess Rapid-I wants you to use their own implementations. But in my (completely subjective) view, W-NaiveBayesUpdateable is better. I do not think there will be any problems in the future, because integrating Weka operators is pretty much standard in today's data mining tools.

    @weighting Sorry, I misunderstood you. You want to assign ExampleWeights, i.e. label weights, not AttributeWeights. I do not know what the best method is, but here are some suggestions:
    -> Use AttributeConstruction to create a weight attribute. The weights are used by W-NaiveBayesUpdateable; otherwise you can use something like WeightedBootstrapping to create a weighted sample. Note that the higher an example's weight, the more important it is for the model to classify that example correctly.
    Here is an example process (golf.aml is a dataset delivered with RapidMiner):

    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSource" class="ExampleSource">
            <parameter key="attributes" value="golf.aml"/>
        </operator>
        <operator name="AttributeConstruction" class="AttributeConstruction">
            <list key="function_descriptions">
              <parameter key="myweight" value="if(Play==&quot;yes&quot;,0.3,0.7)"/>
            </list>
        </operator>
        <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
            <parameter key="name" value="myweight"/>
            <parameter key="target_role" value="weight"/>
        </operator>
    </operator>
    Note that this can backfire. In the above example, the overall AUC is lower than without weighting ;)

    -> Take a look at the various validation operators, e.g. CostEvaluator as a performance evaluator for optimizing validation chains.
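
To make the effect of the weight expression if(Play=="yes",0.3,0.7) a bit more concrete, here is a minimal sketch in plain Python (not RapidMiner or Weka code). It assumes the golf data has 14 examples, 9 with Play = "yes" and 5 with Play = "no", and it assumes a weight-aware learner sums example weights where it would otherwise count examples. The function names are made up for this illustration.

```python
from collections import Counter

# Assumption: 14 golf examples, 9 labelled "yes" and 5 labelled "no".
labels = ["yes"] * 9 + ["no"] * 5


def example_weight(play: str) -> float:
    """Mirror of the process expression if(Play=="yes",0.3,0.7)."""
    return 0.3 if play == "yes" else 0.7


def class_priors(labels, weights):
    """Estimate class priors by summing example weights per class."""
    totals = Counter()
    for label, weight in zip(labels, weights):
        totals[label] += weight
    total = sum(totals.values())
    return {label: totals[label] / total for label in totals}


# Every example counts once: "yes" is the majority class.
unweighted = class_priors(labels, [1.0] * len(labels))

# With the example weights, "no" carries more total weight than "yes".
weighted = class_priors(labels, [example_weight(label) for label in labels])

print(unweighted)
print(weighted)
```

Under these assumptions the weighted prior for "no" ends up larger than the one for "yes", although "yes" is the majority class in the raw data. This is one way such weights can shift a model towards the heavier class.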

    Happy Mining


    PS: I am off now ...
  • Jorge Member Posts: 19 Maven
    You weren't wrong about the weighting.

    I want to assign weights both to the attributes and to the examples. I just forgot to tell you that in my first post  :P

    Thanks, I'm going to try everything you suggested.

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    If you want to assign a weight to the label in order to penalize prediction errors depending on their outcome, you could use the MetaCost operator. Or you could simply use the learner's confidences to decide whether it would be better to switch to a more "expensive" class, using the CostBasedThresholdLearner.
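
The general idea behind cost-based decisions like this can be sketched in plain Python. This is only an illustration of picking the class with the lowest expected cost, not the actual implementation of MetaCost or CostBasedThresholdLearner. The confidences and the cost matrix below are invented values, and the function name is hypothetical.

```python
def min_expected_cost_class(confidences, costs):
    """Return the class with the lowest expected cost.

    confidences: dict mapping class -> predicted probability
    costs: costs[predicted][actual] -> cost of predicting `predicted`
           when the true class is `actual`
    """
    expected = {
        predicted: sum(
            confidences[actual] * costs[predicted][actual]
            for actual in confidences
        )
        for predicted in confidences
    }
    best = min(expected, key=expected.get)
    return best, expected


# Hypothetical confidences from some learner for one example.
confidences = {"Fast": 0.5, "mid": 0.3, "slow": 0.2}

# Hypothetical costs: predicting "Fast" for a truly "slow" example is worst.
costs = {
    "Fast": {"Fast": 0, "mid": 1, "slow": 5},
    "mid": {"Fast": 1, "mid": 0, "slow": 1},
    "slow": {"Fast": 2, "mid": 1, "slow": 0},
}

best, expected = min_expected_cost_class(confidences, costs)
print(best, expected)
```

With these made-up numbers, "Fast" has the highest confidence, but "mid" has the lowest expected cost, so the cost-aware decision differs from the plain most-confident prediction.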

    If a tree does not use all of your attributes, it is possible that the other attributes are simply not important for determining the label. This is not a problem at all; it actually helps keep the model from overfitting.
