
order of class names in confusion matrix

Pirehelo Member Posts: 12 Contributor II
edited December 2018 in Help

Hello fellow data analysts,

 

I was wondering whether I can change the order of the classes in a confusion matrix. For instance, I have five classes that I want to appear in the confusion matrix in the following order: Good, Satisfactory, Fair, Poor, Very Poor (i.e., their order matters). At the moment they are sorted this way: Very Poor, Fair, Satisfactory, Poor, Good. On what basis does RapidMiner sort the labels like this? Obviously, it is not alphabetical. PS: the type of the label attribute is "polynominal".

Thank you in advance,

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,525 RM Data Scientist

    Hi,

     

    RapidMiner maps polynominal values to integer indices in the background. In most cases a user never needs to notice this, but I would argue that in your case it matters.

    One could try Sort followed by Append with only one input. This usually forces a remapping of all indices, but it would yield an alphabetically sorted confusion matrix.

     

    Best.

    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
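
To make the remapping idea concrete, here is a toy Python model, not RapidMiner code. It assumes that a nominal mapping hands out integer indices in order of first appearance and that the confusion matrix lists classes in index order. Both points are assumptions inferred from the behaviour described in this thread, and `build_mapping` and `class_order` are hypothetical helper names.

```python
def build_mapping(values):
    """Give each distinct value the next free integer index, in order of first appearance."""
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
    return mapping


def class_order(values):
    """Return the distinct values ordered by their (assumed) internal index."""
    mapping = build_mapping(values)
    return sorted(mapping, key=mapping.get)


# Hypothetical row order of the label column in the original data.
labels = ["Very Poor", "Fair", "Satisfactory", "Very Poor", "Poor", "Good", "Fair"]

# Unsorted data: the class order simply follows first appearance.
print(class_order(labels))
# ['Very Poor', 'Fair', 'Satisfactory', 'Poor', 'Good']

# Sorting first and rebuilding the mapping gives an alphabetical order ...
print(class_order(sorted(labels)))
# ['Fair', 'Good', 'Poor', 'Satisfactory', 'Very Poor']

# ... or a reverse-alphabetical one, depending on the sort direction.
print(class_order(sorted(labels, reverse=True)))
# ['Very Poor', 'Satisfactory', 'Poor', 'Good', 'Fair']
```

Under these assumptions, sorting alone can only produce an alphabetical or reverse-alphabetical class order, which is why a custom order needs an extra step.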
  • Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    You could rename your categories with an integer preceding the actual name, reflecting the order that you want, and then use @mschmitz's sorting suggestion. This is the only reliable way I have found to get attributes with many nominal categories to appear in the order I want in RapidMiner. It is a bit frustrating to do it that way, but it works.

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
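
The renaming trick can be sketched in Python as follows. This is only an illustration of the naming scheme, not a RapidMiner process. The prefix format (`"1 Good"`, `"2 Satisfactory"`, ...) and the helper name `prefixed_names` are assumptions made for this example. Zero-padding is added so that the alphabetical order still matches the intended order when there are ten or more categories.

```python
def prefixed_names(ordered_classes):
    """Map each class name to '<position> <name>', zero-padded to a common width."""
    width = len(str(len(ordered_classes)))
    return {
        name: f"{position:0{width}d} {name}"
        for position, name in enumerate(ordered_classes, start=1)
    }


# The order requested in the question.
desired_order = ["Good", "Satisfactory", "Fair", "Poor", "Very Poor"]

rename = prefixed_names(desired_order)

# An alphabetical sort of the renamed classes now reproduces the desired order.
print(sorted(rename.values()))
# ['1 Good', '2 Satisfactory', '3 Fair', '4 Poor', '5 Very Poor']
```

After renaming the label values this way, an alphabetical sort and remapping would place the classes in the intended order, at the cost of carrying the numeric prefix in the displayed names.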
  • Pirehelo Member Posts: 12 Contributor II

    @Telcontar120 and @mschmitz: this seems a bit frustrating, but I will try it for now. Best,

  • Pirehelo Member Posts: 12 Contributor II

    @mschmitz

    Could you please elaborate on the idea of "Sort + Append with only one input"? I did not get it. 

     

    Thanks,

  • Pirehelo Member Posts: 12 Contributor II

    Anyone? 

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,525 RM Data Scientist

    @Pirehelokan,

    Attached is an example. It orders the classes alphabetically or reverse-alphabetically, depending on the sorting direction set in the Sort operator.

     

    Cheers,

    Martin

     

    <?xml version="1.0" encoding="UTF-8"?>
    <process version="8.0.001">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="8.0.001" expanded="true" height="68" name="Retrieve Sonar" width="90" x="112" y="136">
            <parameter key="repository_entry" value="//Samples/data/Sonar"/>
          </operator>
          <operator activated="true" class="sort" compatibility="8.0.001" expanded="true" height="82" name="Sort" width="90" x="246" y="136">
            <parameter key="attribute_name" value="class"/>
            <parameter key="sorting_direction" value="decreasing"/>
          </operator>
          <operator activated="true" class="append" compatibility="8.0.001" expanded="true" height="82" name="Append" width="90" x="380" y="136"/>
          <operator activated="true" class="concurrency:cross_validation" compatibility="8.0.001" expanded="true" height="145" name="Validation" width="90" x="514" y="136">
            <parameter key="sampling_type" value="stratified sampling"/>
            <process expanded="true">
              <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="8.0.001" expanded="true" height="82" name="Decision Tree" width="90" x="45" y="34"/>
              <connect from_port="training set" to_op="Decision Tree" to_port="training set"/>
              <connect from_op="Decision Tree" from_port="model" to_port="model"/>
              <portSpacing port="source_training set" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
              <description align="left" color="green" colored="true" height="80" resized="true" width="248" x="37" y="137">In the training phase, a model is built on the current training data set. (90 % of data by default, 10 times)</description>
            </process>
            <process expanded="true">
              <operator activated="true" class="apply_model" compatibility="8.0.001" expanded="true" height="82" name="Apply Model" width="90" x="45" y="34">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="performance" compatibility="8.0.001" expanded="true" height="82" name="Performance" width="90" x="179" y="34"/>
              <connect from_port="model" to_op="Apply Model" to_port="model"/>
              <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
              <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
              <connect from_op="Performance" from_port="performance" to_port="performance 1"/>
              <connect from_op="Performance" from_port="example set" to_port="test set results"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_test set results" spacing="0"/>
              <portSpacing port="sink_performance 1" spacing="0"/>
              <portSpacing port="sink_performance 2" spacing="0"/>
              <description align="left" color="blue" colored="true" height="103" resized="true" width="315" x="38" y="137">The model created in the Training step is applied to the current test set (10 %).&lt;br/&gt;The performance is evaluated and sent to the operator results.</description>
            </process>
            <description align="center" color="transparent" colored="false" width="126">A cross-validation evaluating a decision tree model.</description>
          </operator>
          <connect from_op="Retrieve Sonar" from_port="output" to_op="Sort" to_port="example set input"/>
          <connect from_op="Sort" from_port="example set output" to_op="Append" to_port="example set 1"/>
          <connect from_op="Append" from_port="merged set" to_op="Validation" to_port="example set"/>
          <connect from_op="Validation" from_port="performance 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>