"Weighted Examples do not work out?"

xxhasan88xx Member Posts: 4 Contributor I
edited June 2019 in Help
Hi everyone,
I have a problem with class-imbalanced data and example weighting. My dataset has 184 positive examples and 2200 negative ones. I know that several solutions exist for this (e.g. sampling, weight attribute generation, cost-sensitive learning).

I generate an attribute "weight" with the operator "Generate Weight (Stratification)", which assigns a weight to every example. However, this does not change anything in my results: my decision tree is the same as before. The same problem occurs with rule learners, and it also occurs if I generate the weight attribute manually (with function expressions).

However, if I use a decision tree from the Weka extension (e.g. W-J48), it works, and the tree seems to apply the example weights.

So my question is: why doesn't the RapidMiner decision tree seem to handle the example weights? What am I doing wrong? Whatever weights I generate, they have no effect.

Thank you in advance.

Here you can see my process.
<process version="5.3.015">
  <operator activated="true" class="process" compatibility="5.3.015" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="5.3.015" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
        <parameter key="repository_entry" value="../MiningData/BaseMiningTable"/>
      </operator>
      <operator activated="true" class="generate_weight_stratification" compatibility="5.3.015" expanded="true" height="76" name="Generate Weight (2)" width="90" x="179" y="30">
        <parameter key="total_weight" value="10000.0"/>
      </operator>
      <operator activated="true" class="decision_tree" compatibility="5.3.015" expanded="true" height="76" name="Decision Tree (2)" width="90" x="313" y="30">
        <parameter key="minimal_leaf_size" value="5"/>
        <parameter key="minimal_gain" value="0.01"/>
        <parameter key="maximal_depth" value="5"/>
      </operator>
      <connect from_op="Retrieve" from_port="output" to_op="Generate Weight (2)" to_port="example set input"/>
      <connect from_op="Generate Weight (2)" from_port="example set output" to_op="Decision Tree (2)" to_port="training set"/>
      <connect from_op="Decision Tree (2)" from_port="model" to_port="result 1"/>
      <connect from_op="Decision Tree (2)" from_port="exampleSet" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
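
As a rough cross-check of the weights this process should produce, here is a minimal Python sketch. It assumes (this is not confirmed in the thread) that "Generate Weight (Stratification)" gives each class an equal share of `total_weight` and splits that share evenly among the examples of the class. The function name `stratified_weights` is hypothetical and is not part of RapidMiner.

```python
from collections import Counter


def stratified_weights(labels, total_weight=1.0):
    """Assumed behaviour: every class gets an equal share of total_weight,
    divided evenly among the examples of that class."""
    counts = Counter(labels)
    share_per_class = total_weight / len(counts)
    return [share_per_class / counts[label] for label in labels]


# Class counts taken from the post: 184 positive, 2200 negative examples.
labels = ["pos"] * 184 + ["neg"] * 2200

w = stratified_weights(labels, total_weight=10000.0)
w_pos, w_neg = w[0], w[-1]
print(round(w_pos, 3), round(w_neg, 3))   # per-example weight of each class
print(round(w_pos / w_neg, 3))            # relative weight of a positive example

# The same ratio results for any total weight, e.g. the value 1.0.
w1 = stratified_weights(labels, total_weight=1.0)
print(round(w1[0] / w1[-1], 3))
```

Under this assumption a positive example weighs roughly twelve times as much as a negative one (2200 / 184), and the ratio does not depend on `total_weight`, so a learner that respects the weights should see two classes of equal total weight.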

Answers

  • fras Member Posts: 93 Contributor II
    Weight attributes have a special role. This is why the RapidMiner tree ignores them.
    Use the operator "Set Role" to change the role to "regular".
  • xxhasan88xx Member Posts: 4 Contributor I

    Thanks, but unfortunately this is not the solution, because:

    1) I want the weight attribute to act as the weight (and not as a regular attribute)  :)
        The RapidMiner operators just don't apply the weights.

    2) The "Generate Weight (Stratification)" operator automatically sets the role to "weight", and this is what I need.

    3) Even if I use "Set Role" to set the role to "weight" explicitly, the problem still exists  :-\
  • xxhasan88xx Member Posts: 4 Contributor I
    Does nobody have a solution?

    Can anybody at least confirm that they have successfully used weighted examples with RapidMiner operators?
  • mafern76 Member Posts: 45 Contributor II
    Are you sure?

    For me, Decision Tree (Parallel) seems to be using weights.

    Why did you set the total weight to 10000? Could this be causing something? I just left it at 1.
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist
    I frequently use weights. At least in 6.X there is no problem that I know of.

    Cheers
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany