Decision Tree (Weight-Based) with Different Criteria

KianKian Member Posts: 8 Contributor I
edited December 2018 in Help

How can I build a weight-based decision tree with different criteria (such as gain_ratio, information_gain, gini_index, accuracy) in RapidMiner?

The Decision Tree (Weight-Based) operator does not offer this option.

Thanks

Answers

  • Thomas_OttThomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    To help me understand better, are you referring to feeding weighted examples into the Decision Tree operator? If yes, you can do that with something like a Weight by Stratification operator and feed the data into the Decision Tree operator. From there you can select gain_ratio, information_gain, etc.

     

    If you want to weight attributes by gain_ratio etc., you can use the Weight by Information Gain, Weight by Gini Index, and similar operators.
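
For readers unfamiliar with what these weighting schemes measure, here is a minimal Python sketch of the textbook definitions of information gain, gain ratio, and Gini gain for nominal attributes. It is not RapidMiner code; the function names and the toy data are assumptions made for illustration, and RapidMiner's operators may compute or normalize their weights differently.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of nominal values."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def gini(items):
    """Gini impurity of a list of nominal values."""
    total = len(items)
    return 1.0 - sum((n / total) ** 2 for n in Counter(items).values())


def partition(values, labels):
    """Group the labels by the value of one nominal attribute."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    return list(groups.values())


def information_gain(values, labels):
    """Label entropy minus the size-weighted entropy after the split."""
    total = len(labels)
    after = sum(len(g) / total * entropy(g) for g in partition(values, labels))
    return entropy(labels) - after


def gain_ratio(values, labels):
    """Information gain divided by the entropy of the attribute itself."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info > 0 else 0.0


def gini_gain(values, labels):
    """Label Gini impurity minus the size-weighted impurity after the split."""
    total = len(labels)
    after = sum(len(g) / total * gini(g) for g in partition(values, labels))
    return gini(labels) - after


# Toy data (hypothetical): "outlook" separates the labels, "windy" does not.
outlook = ["sunny", "sunny", "rain", "rain"]
windy = ["yes", "no", "yes", "no"]
play = ["no", "no", "yes", "yes"]

for name, column in [("outlook", outlook), ("windy", windy)]:
    print(name, information_gain(column, play),
          gain_ratio(column, play), gini_gain(column, play))
```

An attribute that separates the classes well scores higher under all three measures; the measures differ mainly in how they treat attributes with many distinct values.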

  • KianKian Member Posts: 8 Contributor I

    Thanks for the reply. I think I should explain more.

    I applied weighting to my main dataset and created result datasets from several weighting operators.

    So I have a main dataset and several datasets produced by the weighting operators; these datasets contain only the important attributes.

    In the second step, I want to build a weight-based decision tree with different criteria.

    Example: dataset resulting from gain ratio >> weight-based decision tree with, for example, the accuracy criterion, and so on.

    The Decision Tree operator offers 4 criteria, but the Decision Tree (Weight-Based) operator does not have this option!

  • Thomas_OttThomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Yes, that is correct. In the Decision Tree operator you can select 4 different criteria. Each Weight by <...> operator allows only one splitting scheme.

  • KianKian Member Posts: 8 Contributor I

    Are there any solutions?!

  • Thomas_OttThomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    You could do something like this. If you need to feed the weights from one Weight by <...> operator downstream, use the Weights to Data operator.

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.5.003">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.5.003" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="7.5.003" expanded="true" height="68" name="Retrieve Iris" width="90" x="45" y="34">
    <parameter key="repository_entry" value="//Samples/data/Iris"/>
    </operator>
    <operator activated="true" class="weight_by_information_gain" compatibility="7.5.003" expanded="true" height="82" name="Weight by Information Gain" width="90" x="179" y="34"/>
    <operator activated="true" class="weight_by_information_gain_ratio" compatibility="7.5.003" expanded="true" height="82" name="Weight by Information Gain Ratio" width="90" x="380" y="85"/>
    <operator activated="true" class="weight_by_gini_index" compatibility="7.5.003" expanded="true" height="82" name="Weight by Gini Index" width="90" x="581" y="187"/>
    <connect from_op="Retrieve Iris" from_port="output" to_op="Weight by Information Gain" to_port="example set"/>
    <connect from_op="Weight by Information Gain" from_port="weights" to_port="result 1"/>
    <connect from_op="Weight by Information Gain" from_port="example set" to_op="Weight by Information Gain Ratio" to_port="example set"/>
    <connect from_op="Weight by Information Gain Ratio" from_port="weights" to_port="result 2"/>
    <connect from_op="Weight by Information Gain Ratio" from_port="example set" to_op="Weight by Gini Index" to_port="example set"/>
    <connect from_op="Weight by Gini Index" from_port="weights" to_port="result 3"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    <portSpacing port="sink_result 4" spacing="0"/>
    </process>
    </operator>
    </process>
  • KianKian Member Posts: 8 Contributor I

    Thanks, dear Thomas, for your reply.

    There are solutions for weighting the dataset using operators, but my problem is building a weight-based decision tree with different criteria!

    My problem is that the "Decision Tree (Weight-Based)" operator has no option for the 4 criteria (accuracy, gini index, etc.).

  • Thomas_OttThomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Can you post a process? I'm not clear on what exactly you want to do. Thanks.

  • KianKian Member Posts: 8 Contributor I

    3.jpg is the process and 2.jpg is the subprocess of the decision tree. The problem is that the Decision Tree (Weight-Based) operator has no option for the criterion.

    2.jpg 12.2K
    3.jpg 19.6K
  • Thomas_OttThomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    I see now. I completely forgot about the Decision Tree (Weight-Based) operator. That operator will not let you select the splitting criterion; you have to use one of the Weight by <...> operators inside it.

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.5.003">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="generate_nominal_data" compatibility="7.5.003" expanded="true" height="68" name="Generate Nominal Data" width="90" x="179" y="34">
    <parameter key="number_of_attributes" value="3"/>
    <parameter key="number_of_values" value="3"/>
    </operator>
    <operator activated="true" class="multiply" compatibility="7.5.003" expanded="true" height="124" name="Multiply" width="90" x="313" y="34"/>
    <operator activated="true" class="decision_tree_weight_based" compatibility="7.5.003" expanded="true" height="68" name="Decision Tree with Gini" width="90" x="648" y="238">
    <parameter key="confidence" value="0.01"/>
    <process expanded="true">
    <operator activated="true" class="weight_by_gini_index" compatibility="7.5.003" expanded="true" height="82" name="Weight by Gini Index" width="90" x="482" y="34"/>
    <connect from_port="training set" to_op="Weight by Gini Index" to_port="example set"/>
    <connect from_op="Weight by Gini Index" from_port="weights" to_port="weights"/>
    <portSpacing port="source_training set" spacing="0"/>
    <portSpacing port="sink_weights" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="decision_tree_weight_based" compatibility="7.5.003" expanded="true" height="68" name="Decision Tree with Gain Ratio" width="90" x="648" y="136">
    <process expanded="true">
    <operator activated="true" class="weight_by_information_gain_ratio" compatibility="7.5.003" expanded="true" height="82" name="Weight by Information Gain Ratio" width="90" x="482" y="34"/>
    <connect from_port="training set" to_op="Weight by Information Gain Ratio" to_port="example set"/>
    <connect from_op="Weight by Information Gain Ratio" from_port="weights" to_port="weights"/>
    <portSpacing port="source_training set" spacing="0"/>
    <portSpacing port="sink_weights" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="decision_tree_weight_based" compatibility="7.5.003" expanded="true" height="68" name="Decision Tree with Info Gain" width="90" x="648" y="34">
    <process expanded="true">
    <operator activated="true" class="weight_by_information_gain" compatibility="7.5.003" expanded="true" height="82" name="Weight by Information Gain" width="90" x="313" y="34"/>
    <connect from_port="training set" to_op="Weight by Information Gain" to_port="example set"/>
    <connect from_op="Weight by Information Gain" from_port="weights" to_port="weights"/>
    <portSpacing port="source_training set" spacing="0"/>
    <portSpacing port="sink_weights" spacing="0"/>
    </process>
    </operator>
    <connect from_op="Generate Nominal Data" from_port="output" to_op="Multiply" to_port="input"/>
    <connect from_op="Multiply" from_port="output 1" to_op="Decision Tree with Info Gain" to_port="training set"/>
    <connect from_op="Multiply" from_port="output 2" to_op="Decision Tree with Gain Ratio" to_port="training set"/>
    <connect from_op="Multiply" from_port="output 3" to_op="Decision Tree with Gini" to_port="training set"/>
    <connect from_op="Decision Tree with Gini" from_port="model" to_port="result 3"/>
    <connect from_op="Decision Tree with Gain Ratio" from_port="model" to_port="result 2"/>
    <connect from_op="Decision Tree with Info Gain" from_port="model" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    <portSpacing port="sink_result 4" spacing="0"/>
    </process>
    </operator>
    </process>
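
The idea behind the process above, that the inner weighting step plays the role of the splitting criterion, can be sketched in plain Python. This is only an illustration under assumptions, not how RapidMiner implements the operator: `build_tree`, `gini_gain`, and `accuracy_weight` are hypothetical names, the accuracy-style score is a simple majority-vote measure and not a RapidMiner operator, and the sketch does no pruning.

```python
from collections import Counter


def group_labels(values, labels):
    """Group the labels by the value of one nominal attribute."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    return list(groups.values())


def gini_gain(values, labels):
    """Reduction in Gini impurity obtained by splitting on one attribute."""
    def gini(items):
        return 1.0 - sum((n / len(items)) ** 2 for n in Counter(items).values())
    groups = group_labels(values, labels)
    after = sum(len(g) / len(labels) * gini(g) for g in groups)
    return gini(labels) - after


def accuracy_weight(values, labels):
    """Hypothetical accuracy-style score: share of rows classified correctly
    when each attribute value predicts its majority label."""
    groups = group_labels(values, labels)
    correct = sum(Counter(g).most_common(1)[0][1] for g in groups)
    return correct / len(labels)


def build_tree(rows, labels, attributes, weight_fn):
    """Recursively split on the attribute that weight_fn scores highest."""
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority label
    best = max(attributes, key=lambda a: weight_fn([r[a] for r in rows], labels))
    remaining = [a for a in attributes if a != best]
    children = {}
    for value in set(r[best] for r in rows):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in pairs]
        sub_labels = [l for _, l in pairs]
        children[value] = build_tree(sub_rows, sub_labels, remaining, weight_fn)
    return (best, children)


# Toy data (hypothetical): "outlook" determines the label, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "rain", "windy": "yes"},
    {"outlook": "rain", "windy": "no"},
]
labels = ["stay", "stay", "play", "play"]

# Swapping the weighting function changes the criterion, not the tree builder.
tree_gini = build_tree(rows, labels, ["outlook", "windy"], gini_gain)
tree_acc = build_tree(rows, labels, ["outlook", "windy"], accuracy_weight)
print(tree_gini)
print(tree_acc)
```

The tree builder never needs to know which measure it is using, which mirrors why the operator itself has no criterion parameter: the choice is made by whatever weighting step is placed inside it.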