Export results of different models
pijpopswouter
Member Posts: 1 Learner III
Hi,
I am new to RapidMiner and am trying to test several classification models, combining the test results in one clear overview.
My process is at the bottom, but first let me explain my setup in a bit more detail.
I have a training set and a test set. Both have polynominal labels and several real/integer attributes. I train several models on the same training set and apply each model to the test set. Training, applying the model and measuring performance all happen inside the subprocesses. For now I only have 3 models, but there will be more in the future. For each model I want to extract the performance (confusion matrix, accuracy, precision, recall, etc.) and put everything together in one overview so I can see which model performs best.
Note: the training and test data had to be split beforehand, because the data is preprocessed in MATLAB and some training and test data must be kept apart.
I have tried:
- the resultfile parameter of the process, but that file stays empty.
- Write as Text, which exported the confusion matrix and values like recall and precision, but only for one class.
- Write Performance, which gives an XML file with all possible results but needs external processing for each file/model.
So my question is: are there RapidMiner operators I have missed that can help me create a clear performance overview of several models?
In the future the process will also loop over 40 test and training sets, so extra tips or concerns about my process are welcome.
Thanks in advance & kind regards,
Wouter
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.008">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
<parameter key="logverbosity" value="off"/>
<parameter key="resultfile" value="C:\Users\Wouter\resultfile"/>
<parameter key="encoding" value="UTF-8"/>
<process expanded="true" height="852" width="1100">
<operator activated="true" class="retrieve" compatibility="5.2.008" expanded="true" height="60" name="Retrieve" width="90" x="45" y="75">
<parameter key="repository_entry" value="//TrainingData/R200_poly_Training_PREV_POST_RR"/>
</operator>
<operator activated="true" class="multiply" compatibility="5.2.008" expanded="true" height="112" name="Multiply" width="90" x="246" y="75"/>
<operator activated="true" class="retrieve" compatibility="5.2.008" expanded="true" height="60" name="Retrieve (2)" width="90" x="52" y="211">
<parameter key="repository_entry" value="//TestData/R200_poly_Test_PREV_POST_RR"/>
</operator>
<operator activated="true" class="multiply" compatibility="5.2.008" expanded="true" height="112" name="Multiply (2)" width="90" x="246" y="210"/>
<operator activated="true" class="subprocess" compatibility="5.2.008" expanded="true" height="94" name="Subprocess (3)" width="90" x="447" y="345">
<process expanded="true" height="852" width="1100">
<operator activated="true" class="weka:W-J48graft" compatibility="5.1.001" expanded="true" height="76" name="W-J48graft" width="90" x="45" y="30"/>
<operator activated="true" class="apply_model" compatibility="5.2.008" expanded="true" height="76" name="Apply Model (3)" width="90" x="447" y="75">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance" compatibility="5.2.008" expanded="true" height="76" name="Performance (3)" width="90" x="648" y="75"/>
<connect from_port="in 1" to_op="W-J48graft" to_port="training set"/>
<connect from_port="in 2" to_op="Apply Model (3)" to_port="unlabelled data"/>
<connect from_op="W-J48graft" from_port="model" to_op="Apply Model (3)" to_port="model"/>
<connect from_op="Apply Model (3)" from_port="labelled data" to_op="Performance (3)" to_port="labelled data"/>
<connect from_op="Performance (3)" from_port="performance" to_port="out 1"/>
<portSpacing port="source_in 1" spacing="0"/>
<portSpacing port="source_in 2" spacing="0"/>
<portSpacing port="source_in 3" spacing="0"/>
<portSpacing port="sink_out 1" spacing="0"/>
<portSpacing port="sink_out 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="subprocess" compatibility="5.2.008" expanded="true" height="94" name="Subprocess" width="90" x="447" y="75">
<process expanded="true" height="852" width="1100">
<operator activated="true" class="support_vector_machine_libsvm" compatibility="5.2.008" expanded="true" height="76" name="SVM" width="90" x="246" y="75">
<list key="class_weights"/>
</operator>
<operator activated="true" class="apply_model" compatibility="5.2.008" expanded="true" height="76" name="Apply Model" width="90" x="380" y="75">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance" compatibility="5.2.008" expanded="true" height="76" name="Performance" width="90" x="581" y="75"/>
<connect from_port="in 1" to_op="SVM" to_port="training set"/>
<connect from_port="in 2" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="SVM" from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Apply Model" from_port="model" to_port="out 2"/>
<connect from_op="Performance" from_port="performance" to_port="out 1"/>
<portSpacing port="source_in 1" spacing="0"/>
<portSpacing port="source_in 2" spacing="0"/>
<portSpacing port="source_in 3" spacing="0"/>
<portSpacing port="sink_out 1" spacing="0"/>
<portSpacing port="sink_out 2" spacing="0"/>
<portSpacing port="sink_out 3" spacing="90"/>
</process>
</operator>
<operator activated="true" class="write_performance" compatibility="5.2.008" expanded="true" height="60" name="Write Performance" width="90" x="648" y="75">
<parameter key="performance_file" value="C:\Users\Wouter\Documents\results.xml"/>
</operator>
<operator activated="true" class="subprocess" compatibility="5.2.008" expanded="true" height="94" name="Subprocess (2)" width="90" x="447" y="210">
<process expanded="true" height="852" width="1100">
<operator activated="true" class="weka:W-LADTree" compatibility="5.1.001" expanded="true" height="76" name="W-LADTree" width="90" x="112" y="30"/>
<operator activated="true" class="apply_model" compatibility="5.2.008" expanded="true" height="76" name="Apply Model (2)" width="90" x="380" y="75">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance" compatibility="5.2.008" expanded="true" height="76" name="Performance (2)" width="90" x="581" y="75"/>
<connect from_port="in 1" to_op="W-LADTree" to_port="training set"/>
<connect from_port="in 2" to_op="Apply Model (2)" to_port="unlabelled data"/>
<connect from_op="W-LADTree" from_port="model" to_op="Apply Model (2)" to_port="model"/>
<connect from_op="Apply Model (2)" from_port="labelled data" to_op="Performance (2)" to_port="labelled data"/>
<connect from_op="Apply Model (2)" from_port="model" to_port="out 2"/>
<connect from_op="Performance (2)" from_port="performance" to_port="out 1"/>
<portSpacing port="source_in 1" spacing="0"/>
<portSpacing port="source_in 2" spacing="0"/>
<portSpacing port="source_in 3" spacing="0"/>
<portSpacing port="sink_out 1" spacing="0"/>
<portSpacing port="sink_out 2" spacing="0"/>
<portSpacing port="sink_out 3" spacing="0"/>
</process>
</operator>
<operator activated="true" class="write_as_text" compatibility="5.2.008" expanded="true" height="76" name="Write as Text" width="90" x="634" y="207"/>
<connect from_op="Retrieve" from_port="output" to_op="Multiply" to_port="input"/>
<connect from_op="Multiply" from_port="output 1" to_op="Subprocess" to_port="in 1"/>
<connect from_op="Multiply" from_port="output 2" to_op="Subprocess (2)" to_port="in 1"/>
<connect from_op="Multiply" from_port="output 3" to_op="Subprocess (3)" to_port="in 1"/>
<connect from_op="Retrieve (2)" from_port="output" to_op="Multiply (2)" to_port="input"/>
<connect from_op="Multiply (2)" from_port="output 1" to_op="Subprocess" to_port="in 2"/>
<connect from_op="Multiply (2)" from_port="output 2" to_op="Subprocess (2)" to_port="in 2"/>
<connect from_op="Multiply (2)" from_port="output 3" to_op="Subprocess (3)" to_port="in 2"/>
<connect from_op="Subprocess (3)" from_port="out 1" to_port="result 2"/>
<connect from_op="Subprocess" from_port="out 1" to_op="Write Performance" to_port="input"/>
<connect from_op="Write Performance" from_port="through" to_port="result 3"/>
<connect from_op="Subprocess (2)" from_port="out 1" to_op="Write as Text" to_port="input 1"/>
<connect from_op="Write as Text" from_port="input 1" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
</process>
</operator>
</process>
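For reference, the per-class figures asked about above (precision and recall for every class, not just one) can be derived from a confusion matrix by hand. The following is only a minimal Python sketch, not a RapidMiner feature: the function name `per_class_metrics` is made up for illustration, and it assumes the matrix is laid out with true classes as rows and predicted classes as columns, which may differ from how a given tool prints it.

```python
def per_class_metrics(matrix, labels):
    """Return (accuracy, {label: {"precision": ..., "recall": ...}}).

    Assumption: matrix[i][j] is the number of examples whose true class
    is labels[i] and whose predicted class is labels[j].
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(labels)))
    report = {}
    for k, label in enumerate(labels):
        true_positives = matrix[k][k]
        predicted_as_k = sum(row[k] for row in matrix)  # column sum
        actually_k = sum(matrix[k])                     # row sum
        report[label] = {
            # Guard against classes that were never predicted or never occur.
            "precision": true_positives / predicted_as_k if predicted_as_k else 0.0,
            "recall": true_positives / actually_k if actually_k else 0.0,
        }
    accuracy = correct / total if total else 0.0
    return accuracy, report


if __name__ == "__main__":
    # Made-up counts for a three-class problem, for illustration only.
    counts = [[5, 1, 0],
              [2, 3, 1],
              [0, 0, 4]]
    accuracy, report = per_class_metrics(counts, ["a", "b", "c"])
    print("accuracy", accuracy)
    for label, values in report.items():
        print(label, values)
```

With a polynominal label there is one precision/recall pair per class, so an overview needs either one row per class or an averaged figure per model.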
Answers
Currently this is not easily possible. However, the next version of RapidMiner, expected to be released within the next 2-3 weeks, will feature a Performance to Data operator, which creates an example set from a performance vector in the format:
Precision 0.9
Recall 0.8
Accuracy 0.75
...
That should give you the possibility to create nice report/overview tables for all models.
Best regards,
Marius
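To illustrate the kind of overview table the name/value format above makes possible, here is a rough Python sketch that merges one name/value table per model into a single comparison table. It is not RapidMiner code: `build_overview` and `best_model` are hypothetical helper names, the model names are placeholders, and the numbers for `model_b` are invented (those for `model_a` reuse the example values from the answer).

```python
def build_overview(results):
    """Combine per-model {metric: value} dicts into one table.

    `results` maps a model name to its metrics. Metrics missing for a
    model are filled with None so every row has the same columns.
    """
    # Collect metric names in first-seen order so the columns stay stable.
    metrics = []
    for values in results.values():
        for name in values:
            if name not in metrics:
                metrics.append(name)
    header = ["Model"] + metrics
    rows = [[model] + [values.get(metric) for metric in metrics]
            for model, values in results.items()]
    return header, rows


def best_model(results, metric):
    """Return the name of the model with the highest value for `metric`."""
    scored = {model: values[metric]
              for model, values in results.items() if metric in values}
    return max(scored, key=scored.get)


if __name__ == "__main__":
    # Placeholder model names; model_b's values are made up for illustration.
    example = {
        "model_a": {"Precision": 0.9, "Recall": 0.8, "Accuracy": 0.75},
        "model_b": {"Precision": 0.7, "Recall": 0.85, "Accuracy": 0.72},
    }
    header, rows = build_overview(example)
    print("\t".join(header))
    for row in rows:
        print("\t".join(str(cell) for cell in row))
    print("best by Accuracy:", best_model(example, "Accuracy"))
```

The same merge works unchanged if the tables come from a loop over many training/test sets, as long as each result is stored under a distinct key (for example model name plus fold number).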