Can I do that?

Jorge · February 2009

Hi,

I'm starting a new "project", but I have some questions that I don't know how to solve...

I have 5 attributes with nominal data (4 or 5 possible values each) and my "utopia" is...
- Assign a weight to the attributes and create a model with a learning operator. After training, this model will have to predict one of 3 different values (fast, mid, slow) when I give it a combination of the attributes.

I tried to use the NaiveBayes algorithm, but I can't assign a weight.
I thought about tree learning operators, but they don't use ALL the attributes (only 1 or 2)...

Any advice you can give me will be welcome :-P

PS: Sorry for my bad English :-(

Thanks,
Jorge

steffen · February 2009

Hello Jorge

1. Assigning weights is possible, e.g. via InteractiveAttributeWeighting. Check out Preprocess->Attributes->Weighting in the operator tree.

2. W-NaiveBayesUpdateable is a Naive Bayes algorithm from the Weka package and can handle (as far as I can see) weights.

regards,

Steffen

Jorge · February 2009

Thanks for your fast reply.

I tried W-NaiveBayesUpdateable, but when I click on Validate, this message appears in the console:

G Feb 12, 2009 10:33:24 AM: [Warning] W-NaiveBayesUpdateable: W-NaiveBayesUpdateable: Deprecated: please use NaiveBayes instead.
G Feb 12, 2009 10:33:24 AM: [Warning] Deprecations: 1 usage of deprecated operators.

Could this cause problems in future versions of RapidMiner? Is NaiveBayes the better algorithm for this?

The InteractiveAttributeWeighting is perfect!! Thanks, but is there any preprocessing operator that lets me assign a different weight to the values?
E.g. fast is better than mid and mid is better than slow (or can I do that only in the training stage?)

Thanks again, you're helping me a lot.

Cheers,
Jorge

steffen · February 2009

Hello Jorge

@Naive Bayes
I guess Rapid-I wants you to use their implementations. But from my (completely subjective) point of view, W-NaiveBayesUpdateable is better. I do not think there will be any problems in the future, because integration of Weka operators is nearly state of the art in today's data mining tools.

@weighting
Uuuh... sorry, I got you wrong. You want to assign ExampleWeights, i.e. label weights, not AttributeWeights. I do not know what the best method is to do so, but here are some suggestions:
-> Use AttributeConstruction to create a weight attribute. The weights are used within W-NaiveBayesUpdateable; otherwise you can use something like WeightedBootstrapping to create a weighted sample. Note that the higher the weight, the more important it is for the model to classify the examples correctly... so...
Here is an example process (golf.aml is a dataset delivered with RapidMiner):


<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSource" class="ExampleSource">
        <parameter key="attributes" value="golf.aml"/>
    </operator>
    <operator name="AttributeConstruction" class="AttributeConstruction">
        <list key="function_descriptions">
            <parameter key="myweight" value="if(Play==&quot;yes&quot;,0.3,0.7)"/>
        </list>
    </operator>
    <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
        <parameter key="name" value="myweight"/>
        <parameter key="target_role" value="weight"/>
    </operator>
</operator>

Note that this can backfire: in the above example the overall AUC is lower than without weighting.

-> Take a look at the various validation operators, e.g. CostEvaluator as a performance evaluator for optimizing validation chains.
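
To illustrate why the example weights above can backfire, here is a minimal Python sketch (not RapidMiner code). It shows how summing example weights instead of counting examples shifts the class prior that a weight-aware Naive Bayes would typically estimate. It assumes the classic 14-row golf data with 9 "yes" and 5 "no" examples; the function names are made up for illustration.

```python
def example_weight(play):
    # Mirrors the expression if(Play=="yes",0.3,0.7) from the example process.
    return 0.3 if play == "yes" else 0.7


def weighted_class_priors(labels):
    # Sum the weight of every example per class, then normalize to 1.
    totals = {}
    for label in labels:
        totals[label] = totals.get(label, 0.0) + example_weight(label)
    grand_total = sum(totals.values())
    return {label: total / grand_total for label, total in totals.items()}


# Assumption: 9 "yes" and 5 "no" rows, as in the classic golf/weather data.
labels = ["yes"] * 9 + ["no"] * 5
priors = weighted_class_priors(labels)
print(priors)
```

With these weights the majority class "yes" ends up with the smaller prior (roughly 0.44 versus 0.56), so the model is pushed towards "no". That is one plausible reason for the lower AUC.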

Happy Mining

Steffen

PS: I am off now ...

Jorge · February 2009

About the weighting: you weren't wrong.

I want to assign a weight to the attributes and to the examples. But I forgot to tell you that in the first post :P

Thanks, I'm going to try everything you told me.

Cheers,
Jorge

land · February 2009

Hi,
If you want to assign a weight to the label in order to penalize prediction errors depending on their outcome, you could use the MetaCost operator. Or you could simply use the learner's confidence to decide whether it would be better to switch to a more "expensive" class, using the CostBasedThresholdLearner.
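
The idea behind cost-based decisions can be sketched in a few lines of Python. This is only an illustration of the general principle (pick the class with the lowest expected cost), not the actual implementation of MetaCost or the threshold learner. The cost values and confidences below are invented for the example.

```python
# Hypothetical cost matrix: COSTS[actual][predicted].
# Assumption for illustration: predicting "fast" for a truly "slow" case is the worst error.
COSTS = {
    "fast": {"fast": 0, "mid": 1, "slow": 2},
    "mid":  {"fast": 1, "mid": 0, "slow": 1},
    "slow": {"fast": 5, "mid": 2, "slow": 0},
}


def expected_costs(confidences, costs):
    # Expected cost of each possible prediction, averaged over the
    # learner's confidence for each actual class.
    return {
        predicted: sum(confidences[actual] * costs[actual][predicted]
                       for actual in confidences)
        for predicted in confidences
    }


def min_cost_class(confidences, costs):
    # Choose the prediction with the lowest expected cost.
    expected = expected_costs(confidences, costs)
    return min(expected, key=expected.get)


# Invented confidences from some learner for one example.
confidences = {"fast": 0.5, "mid": 0.3, "slow": 0.2}
most_confident = max(confidences, key=confidences.get)
cheapest = min_cost_class(confidences, COSTS)
print(most_confident, cheapest)
```

Here the most confident class is "fast", but the expected cost is lowest for "mid", so a cost-sensitive decision switches to "mid".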

If a tree does not use all of your attributes, then it's possible that the other attributes are simply not important for determining the label. This is not a problem at all; it actually helps prevent the model from overfitting.
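
A small Python sketch shows why a tree may ignore an attribute. It uses information gain, one common splitting criterion; the criterion configured in your tree operator may differ. The toy data and attribute names "a" and "b" are invented: "a" determines the label completely, while "b" carries no information about it.

```python
from collections import Counter
from math import log2


def entropy(labels):
    # Shannon entropy (in bits) of a list of class labels.
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, label="label"):
    # Entropy of the label minus the weighted entropy after splitting on the attribute.
    base = entropy([row[label] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[label] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Invented toy data for illustration only.
rows = [
    {"a": "x", "b": "p", "label": "fast"},
    {"a": "x", "b": "q", "label": "fast"},
    {"a": "y", "b": "p", "label": "slow"},
    {"a": "y", "b": "q", "label": "slow"},
]
gain_a = information_gain(rows, "a")
gain_b = information_gain(rows, "b")
print(gain_a, gain_b)
```

Splitting on "a" removes all uncertainty about the label, while splitting on "b" removes none, so a tree built on this data has no reason to use "b".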

Greetings,
Sebastian
