order of class names in confusion matrix

Pirehelo Member Posts: 12 Contributor I
edited November 2018 in Help

Hello fellow data analysts,

 

I was wondering whether I can change the order of classes in a confusion matrix. For instance, I have five classes that I want to appear in the confusion matrix in the following order: Good, Satisfactory, Fair, Poor, Very Poor (i.e., their order matters). Currently they are sorted this way: Very Poor, Fair, Satisfactory, Poor, Good. On what rationale does RapidMiner sort the labels like this? It is clearly not alphabetical. PS: the type of the label attribute is "polynominal."

 

 

Thank you in advance,

Answers

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,120  RM Data Scientist

    Hi,

     

    RapidMiner maps the values of polynominal attributes to integers in the background. In many cases this is not noteworthy for the user, but I would argue that in your case it might be, because it determines the order in which the classes appear.

     

    One could try Sort + Append with only one input. This usually forces a remapping of all indices. But that would yield an alphabetically sorted confusion matrix.
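
To illustrate the remapping idea, here is a minimal Python sketch. It is not RapidMiner code. It assumes (this is an assumption, not something confirmed in this thread) that each nominal value receives its integer index in the order in which it first appears in the data. `build_mapping` and `class_order` are hypothetical helper names.

```python
def build_mapping(values):
    """Assign each distinct value an integer index in order of first appearance.

    Assumption for illustration only: this mimics how a nominal-to-integer
    mapping might be built; it is not RapidMiner's actual implementation.
    """
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
    return mapping


def class_order(mapping):
    """Return the class names ordered by their integer index."""
    return sorted(mapping, key=mapping.get)


# Labels in the order they happen to occur in the data.
labels = ["Very Poor", "Fair", "Satisfactory", "Poor", "Good", "Fair", "Good"]

# Order implied by the original mapping (first appearance).
original_order = class_order(build_mapping(labels))

# Sorting the rows first and then rebuilding the mapping gives alphabetical order.
remapped_order = class_order(build_mapping(sorted(labels)))

print(original_order)
print(remapped_order)
```

Under that assumption, the class order simply reflects which label occurs first in the data, and rebuilding the mapping after a sort makes it alphabetical.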

     

    Best.

    Martin

    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,625   Unicorn

    You could rename your categories with an integer preceding the actual name, reflecting the order that you want, and then use @mschmitz's sorting suggestion after that. This is the only reliable way I have found for getting attributes with many nominal categories to appear in the order that I want when using RapidMiner. It's a bit frustrating to do it that way, but it will work.
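
The renaming trick can be sketched in Python as follows. This is only an illustration of the idea, not RapidMiner code; the `NN_name` prefix format and the variable names are hypothetical choices. The prefix is zero-padded so that alphabetical sorting still matches the intended order when there are ten or more classes.

```python
# Desired class order, taken from the question in this thread.
desired = ["Good", "Satisfactory", "Fair", "Poor", "Very Poor"]

# Zero-pad the numeric prefix so "10_..." does not sort before "2_...".
width = len(str(len(desired)))
rename = {
    name: f"{position:0{width}d}_{name}"
    for position, name in enumerate(desired, start=1)
}

# Example labels in arbitrary order.
labels = ["Very Poor", "Fair", "Satisfactory", "Poor", "Good", "Fair"]
renamed = [rename[label] for label in labels]

# Alphabetical order of the renamed classes now equals the desired order.
alphabetical = sorted(set(renamed))
print(alphabetical)
```

After the rename, an alphabetical remapping (such as the Sort + Append suggestion) lists the classes in the intended order.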

     

     

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • Pirehelo Member Posts: 12 Contributor I

    Thanks, @Telcontar120 and @mschmitz. It seems a bit frustrating, but I will try it for now. Best,

  • Pirehelo Member Posts: 12 Contributor I

    @mschmitz

    Could you please elaborate on the idea of "Sort + Append with only one input"? I did not get it. 

     

    Thanks,

  • Pirehelo Member Posts: 12 Contributor I

    Anyone? 

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,120  RM Data Scientist

    @Pirehelo,

    attached is an example. It orders the classes alphabetically or in reverse alphabetical order, depending on the sorting direction set in the Sort operator.

     

    Cheers,

    Martin

     

    <?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="8.0.001" expanded="true" height="68" name="Retrieve Sonar" width="90" x="112" y="136">
    <parameter key="repository_entry" value="//Samples/data/Sonar"/>
    </operator>
    <operator activated="true" class="sort" compatibility="8.0.001" expanded="true" height="82" name="Sort" width="90" x="246" y="136">
    <parameter key="attribute_name" value="class"/>
    <parameter key="sorting_direction" value="decreasing"/>
    </operator>
    <operator activated="true" class="append" compatibility="8.0.001" expanded="true" height="82" name="Append" width="90" x="380" y="136"/>
    <operator activated="true" class="concurrency:cross_validation" compatibility="8.0.001" expanded="true" height="145" name="Validation" width="90" x="514" y="136">
    <parameter key="sampling_type" value="stratified sampling"/>
    <process expanded="true">
    <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="8.0.001" expanded="true" height="82" name="Decision Tree" width="90" x="45" y="34"/>
    <connect from_port="training set" to_op="Decision Tree" to_port="training set"/>
    <connect from_op="Decision Tree" from_port="model" to_port="model"/>
    <portSpacing port="source_training set" spacing="0"/>
    <portSpacing port="sink_model" spacing="0"/>
    <portSpacing port="sink_through 1" spacing="0"/>
    <description align="left" color="green" colored="true" height="80" resized="true" width="248" x="37" y="137">In the training phase, a model is built on the current training data set. (90 % of data by default, 10 times)</description>
    </process>
    <process expanded="true">
    <operator activated="true" class="apply_model" compatibility="8.0.001" expanded="true" height="82" name="Apply Model" width="90" x="45" y="34">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="performance" compatibility="8.0.001" expanded="true" height="82" name="Performance" width="90" x="179" y="34"/>
    <connect from_port="model" to_op="Apply Model" to_port="model"/>
    <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
    <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
    <connect from_op="Performance" from_port="performance" to_port="performance 1"/>
    <connect from_op="Performance" from_port="example set" to_port="test set results"/>
    <portSpacing port="source_model" spacing="0"/>
    <portSpacing port="source_test set" spacing="0"/>
    <portSpacing port="source_through 1" spacing="0"/>
    <portSpacing port="sink_test set results" spacing="0"/>
    <portSpacing port="sink_performance 1" spacing="0"/>
    <portSpacing port="sink_performance 2" spacing="0"/>
    <description align="left" color="blue" colored="true" height="103" resized="true" width="315" x="38" y="137">The model created in the Training step is applied to the current test set (10 %).&lt;br/&gt;The performance is evaluated and sent to the operator results.</description>
    </process>
    <description align="center" color="transparent" colored="false" width="126">A cross-validation evaluating a decision tree model.</description>
    </operator>
    <connect from_op="Retrieve Sonar" from_port="output" to_op="Sort" to_port="example set input"/>
    <connect from_op="Sort" from_port="example set output" to_op="Append" to_port="example set 1"/>
    <connect from_op="Append" from_port="merged set" to_op="Validation" to_port="example set"/>
    <connect from_op="Validation" from_port="performance 1" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany