"which ROC curve comes together with which model?"

dan_agape Member Posts: 106 Maven
edited June 2019 in Help

Hi again

If you display the performance information regarding several classification models – how do you distinguish which ROC curve corresponds to which model? I do not see anything on the ROC curve windows identifying the corresponding model. Thanks for your suggestions/comments

Dan
Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    normally the result tabs show the names of the operators that created the respective result. So you might take a look there?

    Greetings,
      Sebastian
  • dan_agape Member Posts: 106 Maven
    Sebastian, thanks. However, I am still confused. Do you mean that the correspondence between a model and its performance vector (if you build and evaluate several models on the same canvas) is made via the names of the operators that built them? In this case the evaluation operator is the same for all of them, i.e. "Performance". Some numbers are also included in the tabs of these windows. Do you mean these numbers are to be used to tell which performance corresponds to which model?

    Greetings
    Dan
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Dan,
    now I'm confused :) Let's establish a common base to talk about: please replace the data source in your process with a Generate Data operator and then paste the process here in the code area. The code area can be inserted with the # button above.
    Then I will see what you see and probably will understand better what we are talking about :)

    Greetings,
      Sebastian
  • dan_agape Member Posts: 106 Maven
    Sebastian, here is a process building 3 models and their performances. Just to repeat the question: how do you know which performance corresponds to which model in the result windows?

    Thanks
    Dan
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.0.8" expanded="true" name="Process">
        <parameter key="logverbosity" value="3"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="1"/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <parameter key="parallelize_main_process" value="false"/>
        <process expanded="true" height="467" width="465">
          <operator activated="true" class="generate_churn_data" compatibility="5.0.8" expanded="true" height="60" name="Generate Churn Data" width="90" x="45" y="165">
            <parameter key="number_examples" value="1000"/>
            <parameter key="use_local_random_seed" value="false"/>
            <parameter key="local_random_seed" value="1992"/>
          </operator>
          <operator activated="true" class="multiply" compatibility="5.0.8" expanded="true" height="112" name="Multiply" width="90" x="179" y="165"/>
          <operator activated="true" class="split_validation" compatibility="5.0.8" expanded="true" height="112" name="Validation (2)" width="90" x="313" y="165">
            <parameter key="create_complete_model" value="false"/>
            <parameter key="split" value="1"/>
            <parameter key="split_ratio" value="0.7"/>
            <parameter key="training_set_size" value="100"/>
            <parameter key="test_set_size" value="-1"/>
            <parameter key="sampling_type" value="1"/>
            <parameter key="use_local_random_seed" value="false"/>
            <parameter key="local_random_seed" value="1992"/>
            <parameter key="parallelize_training" value="false"/>
            <parameter key="parallelize_testing" value="false"/>
            <process expanded="true" height="422" width="243">
              <operator activated="true" class="k_nn" compatibility="5.0.8" expanded="true" height="76" name="k-NN" width="90" x="112" y="30">
                <parameter key="k" value="5"/>
                <parameter key="weighted_vote" value="false"/>
                <parameter key="measure_types" value="0"/>
                <parameter key="mixed_measure" value="0"/>
                <parameter key="nominal_measure" value="0"/>
                <parameter key="numerical_measure" value="0"/>
                <parameter key="divergence" value="0"/>
                <parameter key="kernel_type" value="1"/>
                <parameter key="kernel_gamma" value="1.0"/>
                <parameter key="kernel_sigma1" value="1.0"/>
                <parameter key="kernel_sigma2" value="0.0"/>
                <parameter key="kernel_sigma3" value="2.0"/>
                <parameter key="kernel_degree" value="3.0"/>
                <parameter key="kernel_shift" value="1.0"/>
                <parameter key="kernel_a" value="1.0"/>
                <parameter key="kernel_b" value="0.0"/>
              </operator>
              <connect from_port="training" to_op="k-NN" to_port="training set"/>
              <connect from_op="k-NN" from_port="model" to_port="model"/>
              <portSpacing port="source_training" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true" height="422" width="243">
              <operator activated="true" class="apply_model" compatibility="5.0.8" expanded="true" height="76" name="Apply Model (2)" width="90" x="24" y="29">
                <list key="application_parameters"/>
                <parameter key="create_view" value="false"/>
              </operator>
              <operator activated="true" class="performance" compatibility="5.0.8" expanded="true" height="76" name="Performance (2)" width="90" x="108" y="123">
                <parameter key="use_example_weights" value="true"/>
              </operator>
              <connect from_port="model" to_op="Apply Model (2)" to_port="model"/>
              <connect from_port="test set" to_op="Apply Model (2)" to_port="unlabelled data"/>
              <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Performance (2)" to_port="labelled data"/>
              <connect from_op="Performance (2)" from_port="performance" to_port="averagable 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_averagable 1" spacing="0"/>
              <portSpacing port="sink_averagable 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="split_validation" compatibility="5.0.8" expanded="true" height="112" name="Validation" width="90" x="313" y="30">
            <parameter key="create_complete_model" value="false"/>
            <parameter key="split" value="1"/>
            <parameter key="split_ratio" value="0.7"/>
            <parameter key="training_set_size" value="100"/>
            <parameter key="test_set_size" value="-1"/>
            <parameter key="sampling_type" value="stratified sampling"/>
            <parameter key="use_local_random_seed" value="false"/>
            <parameter key="local_random_seed" value="1992"/>
            <parameter key="parallelize_training" value="false"/>
            <parameter key="parallelize_testing" value="false"/>
            <process expanded="true" height="422" width="243">
              <operator activated="true" class="decision_tree" compatibility="5.0.8" expanded="true" height="76" name="Decision Tree" width="90" x="45" y="30">
                <parameter key="criterion" value="gain_ratio"/>
                <parameter key="minimal_size_for_split" value="4"/>
                <parameter key="minimal_leaf_size" value="2"/>
                <parameter key="minimal_gain" value="0.1"/>
                <parameter key="maximal_depth" value="20"/>
                <parameter key="confidence" value="0.25"/>
                <parameter key="number_of_prepruning_alternatives" value="3"/>
                <parameter key="no_pre_pruning" value="false"/>
                <parameter key="no_pruning" value="false"/>
              </operator>
              <connect from_port="training" to_op="Decision Tree" to_port="training set"/>
              <connect from_op="Decision Tree" from_port="model" to_port="model"/>
              <portSpacing port="source_training" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true" height="422" width="279">
              <operator activated="true" class="apply_model" compatibility="5.0.8" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
                <list key="application_parameters"/>
                <parameter key="create_view" value="false"/>
              </operator>
              <operator activated="true" class="performance" compatibility="5.0.8" expanded="true" height="76" name="Performance" width="90" x="112" y="165">
                <parameter key="use_example_weights" value="true"/>
              </operator>
              <connect from_port="model" to_op="Apply Model" to_port="model"/>
              <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
              <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
              <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_averagable 1" spacing="0"/>
              <portSpacing port="sink_averagable 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="split_validation" compatibility="5.0.8" expanded="true" height="130" name="Validation (4)" width="90" x="313" y="300">
            <parameter key="create_complete_model" value="false"/>
            <parameter key="split" value="1"/>
            <parameter key="split_ratio" value="0.7"/>
            <parameter key="training_set_size" value="100"/>
            <parameter key="test_set_size" value="-1"/>
            <parameter key="sampling_type" value="1"/>
            <parameter key="use_local_random_seed" value="false"/>
            <parameter key="local_random_seed" value="1992"/>
            <parameter key="parallelize_training" value="false"/>
            <parameter key="parallelize_testing" value="false"/>
            <process expanded="true" height="422" width="243">
              <operator activated="true" class="naive_bayes" compatibility="5.0.8" expanded="true" height="76" name="Naive Bayes (2)" width="90" x="81" y="30">
                <parameter key="laplace_correction" value="true"/>
              </operator>
              <connect from_port="training" to_op="Naive Bayes (2)" to_port="training set"/>
              <connect from_op="Naive Bayes (2)" from_port="model" to_port="model"/>
              <portSpacing port="source_training" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true" height="422" width="279">
              <operator activated="true" class="apply_model" compatibility="5.0.8" expanded="true" height="76" name="Apply Model (4)" width="90" x="45" y="30">
                <list key="application_parameters"/>
                <parameter key="create_view" value="false"/>
              </operator>
              <operator activated="true" class="performance" compatibility="5.0.8" expanded="true" height="76" name="Performance (4)" width="90" x="112" y="165">
                <parameter key="use_example_weights" value="true"/>
              </operator>
              <connect from_port="model" to_op="Apply Model (4)" to_port="model"/>
              <connect from_port="test set" to_op="Apply Model (4)" to_port="unlabelled data"/>
              <connect from_op="Apply Model (4)" from_port="labelled data" to_op="Performance (4)" to_port="labelled data"/>
              <connect from_op="Performance (4)" from_port="performance" to_port="averagable 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_averagable 1" spacing="0"/>
              <portSpacing port="sink_averagable 2" spacing="0"/>
              <portSpacing port="sink_averagable 3" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Generate Churn Data" from_port="output" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Validation" to_port="training"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Validation (2)" to_port="training"/>
          <connect from_op="Multiply" from_port="output 3" to_op="Validation (4)" to_port="training"/>
          <connect from_op="Validation (2)" from_port="averagable 1" to_port="result 4"/>
          <connect from_op="Validation" from_port="model" to_port="result 1"/>
          <connect from_op="Validation" from_port="training" to_port="result 2"/>
          <connect from_op="Validation" from_port="averagable 1" to_port="result 3"/>
          <connect from_op="Validation (4)" from_port="model" to_port="result 5"/>
          <connect from_op="Validation (4)" from_port="averagable 1" to_port="result 6"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
          <portSpacing port="sink_result 4" spacing="108"/>
          <portSpacing port="sink_result 5" spacing="72"/>
          <portSpacing port="sink_result 6" spacing="18"/>
          <portSpacing port="sink_result 7" spacing="0"/>
        </process>
      </operator>
    </process>
  • haddock Member Posts: 849 Maven
    "Just to repeat the question: how do you know which performance corresponds to which model in the result windows?"
    Just to repeat the answer....
    "normally the result tabs show the names of the operators that created the respective result. So you might take a look there?"
    Like this...
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.0.8" expanded="true" name="Process">
        <process expanded="true" height="467" width="465">
          <operator activated="true" class="generate_churn_data" compatibility="5.0.8" expanded="true" height="60" name="Generate Churn Data" width="90" x="45" y="165">
            <parameter key="number_examples" value="1000"/>
          </operator>
          <operator activated="true" class="multiply" compatibility="5.0.8" expanded="true" height="112" name="Multiply" width="90" x="179" y="165"/>
          <operator activated="true" class="split_validation" compatibility="5.0.8" expanded="true" height="112" name="Validation (2)" width="90" x="313" y="165">
            <process expanded="true" height="422" width="243">
              <operator activated="true" class="k_nn" compatibility="5.0.8" expanded="true" height="76" name="k-NN" width="90" x="112" y="30">
                <parameter key="k" value="5"/>
              </operator>
              <connect from_port="training" to_op="k-NN" to_port="training set"/>
              <connect from_op="k-NN" from_port="model" to_port="model"/>
              <portSpacing port="source_training" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true" height="422" width="243">
              <operator activated="true" class="apply_model" compatibility="5.0.8" expanded="true" height="76" name="Apply Model (2)" width="90" x="24" y="29">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="performance" compatibility="5.0.8" expanded="true" height="76" name="B" width="90" x="108" y="123"/>
              <connect from_port="model" to_op="Apply Model (2)" to_port="model"/>
              <connect from_port="test set" to_op="Apply Model (2)" to_port="unlabelled data"/>
              <connect from_op="Apply Model (2)" from_port="labelled data" to_op="B" to_port="labelled data"/>
              <connect from_op="B" from_port="performance" to_port="averagable 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_averagable 1" spacing="0"/>
              <portSpacing port="sink_averagable 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="split_validation" compatibility="5.0.8" expanded="true" height="112" name="Validation" width="90" x="313" y="30">
            <parameter key="sampling_type" value="stratified sampling"/>
            <process expanded="true" height="422" width="243">
              <operator activated="true" class="decision_tree" compatibility="5.0.8" expanded="true" height="76" name="Decision Tree" width="90" x="45" y="30"/>
              <connect from_port="training" to_op="Decision Tree" to_port="training set"/>
              <connect from_op="Decision Tree" from_port="model" to_port="model"/>
              <portSpacing port="source_training" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true" height="422" width="279">
              <operator activated="true" class="apply_model" compatibility="5.0.8" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="performance" compatibility="5.0.8" expanded="true" height="76" name="A" width="90" x="112" y="165"/>
              <connect from_port="model" to_op="Apply Model" to_port="model"/>
              <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
              <connect from_op="Apply Model" from_port="labelled data" to_op="A" to_port="labelled data"/>
              <connect from_op="A" from_port="performance" to_port="averagable 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_averagable 1" spacing="0"/>
              <portSpacing port="sink_averagable 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="split_validation" compatibility="5.0.8" expanded="true" height="112" name="Validation (4)" width="90" x="313" y="300">
            <process expanded="true" height="422" width="243">
              <operator activated="true" class="naive_bayes" compatibility="5.0.8" expanded="true" height="76" name="Naive Bayes (2)" width="90" x="81" y="30"/>
              <connect from_port="training" to_op="Naive Bayes (2)" to_port="training set"/>
              <connect from_op="Naive Bayes (2)" from_port="model" to_port="model"/>
              <portSpacing port="source_training" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true" height="422" width="279">
              <operator activated="true" class="apply_model" compatibility="5.0.8" expanded="true" height="76" name="Apply Model (4)" width="90" x="45" y="30">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="performance" compatibility="5.0.8" expanded="true" height="76" name="C" width="90" x="112" y="165"/>
              <connect from_port="model" to_op="Apply Model (4)" to_port="model"/>
              <connect from_port="test set" to_op="Apply Model (4)" to_port="unlabelled data"/>
              <connect from_op="Apply Model (4)" from_port="labelled data" to_op="C" to_port="labelled data"/>
              <connect from_op="C" from_port="performance" to_port="averagable 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_averagable 1" spacing="0"/>
              <portSpacing port="sink_averagable 2" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Generate Churn Data" from_port="output" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Validation" to_port="training"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Validation (2)" to_port="training"/>
          <connect from_op="Multiply" from_port="output 3" to_op="Validation (4)" to_port="training"/>
          <connect from_op="Validation (2)" from_port="averagable 1" to_port="result 4"/>
          <connect from_op="Validation" from_port="model" to_port="result 1"/>
          <connect from_op="Validation" from_port="training" to_port="result 2"/>
          <connect from_op="Validation" from_port="averagable 1" to_port="result 3"/>
          <connect from_op="Validation (4)" from_port="model" to_port="result 5"/>
          <connect from_op="Validation (4)" from_port="averagable 1" to_port="result 6"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
          <portSpacing port="sink_result 4" spacing="108"/>
          <portSpacing port="sink_result 5" spacing="72"/>
          <portSpacing port="sink_result 6" spacing="18"/>
          <portSpacing port="sink_result 7" spacing="0"/>
        </process>
      </operator>
    </process>
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    as I said, the name of the generating operator is shown in the tabs. Since you did not rename your performance estimating operators they are simply numbered: Performance, Performance (2) and Performance (4). The tab names in the result perspectives are "PerformanceVector (Performance)", "PerformanceVector (Performance (2))"...
    So if you give your operators meaningful names, you will know which tab represents which result...

    Greetings,
      Sebastian
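
    For illustration, here is a minimal sketch of what a renamed evaluation operator could look like inside the testing subprocess of "Validation (2)". The name "Performance k-NN" is an assumed example and does not appear in the processes posted above; any descriptive name should work, as long as the connect elements reference the same name. Going by the tab naming described above, the corresponding tab would then presumably read "PerformanceVector (Performance k-NN)".

    ```xml
    <!-- Testing subprocess of "Validation (2)"; only the Performance operator's name differs from the posted process. -->
    <process expanded="true" height="422" width="243">
      <operator activated="true" class="apply_model" compatibility="5.0.8" expanded="true" height="76" name="Apply Model (2)" width="90" x="24" y="29">
        <list key="application_parameters"/>
      </operator>
      <!-- "Performance k-NN" is a hypothetical descriptive name chosen for this example. -->
      <operator activated="true" class="performance" compatibility="5.0.8" expanded="true" height="76" name="Performance k-NN" width="90" x="108" y="123"/>
      <connect from_port="model" to_op="Apply Model (2)" to_port="model"/>
      <connect from_port="test set" to_op="Apply Model (2)" to_port="unlabelled data"/>
      <!-- The connections must use the new operator name. -->
      <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Performance k-NN" to_port="labelled data"/>
      <connect from_op="Performance k-NN" from_port="performance" to_port="averagable 1"/>
      <portSpacing port="source_model" spacing="0"/>
      <portSpacing port="source_test set" spacing="0"/>
      <portSpacing port="source_through 1" spacing="0"/>
      <portSpacing port="sink_averagable 1" spacing="0"/>
      <portSpacing port="sink_averagable 2" spacing="0"/>
    </process>
    ```

    Renaming the operator in the GUI should produce an equivalent change in the XML, so editing the XML by hand is not required.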
  • dan_agape Member Posts: 106 Maven
    Sebastian thanks!
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    No Problem :)