Exporting performance and images; optimizing

blueearth Member Posts: 42 Contributor II
edited November 2018 in Help
1-Are there any operators that export performances and accuracies directly into Excel files, and images into JPG files?
2-Can optimization operators connect directly to modeling operators and optimize their parameters automatically before they start processing the data?

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi,

    I have two short answers as a teaser, one hint and then the long answer: ;-)

    Short answers:
    1-performances yes, images partly
    2-yes

    Hint: next time, please open a new thread for each question; that keeps the forum structure cleaner and allows a more focused discussion.

    And now the long answers:
    You probably know the Log operator. With that operator, you can log values and parameters of previously executed operators. By default, you can't export the output of the Log operator, but you can use a trick: the Log to Data operator converts the output of the Log operator into an example set, which can then be exported to Excel with Write Excel. To export images and graphs, you should have a look at the Reporting extension.

    We have several optimization operators, depending on what you want to achieve: Optimize Parameters optimizes parameters of models, e.g. the margin width of an SVM or the depth and pruning options of a Decision Tree. The various Optimize Selection flavors in the operator group Data Transformation/Attribute Set Reduction/Selection/Optimization perform a feature selection.

    The common property of all these operators is that they contain an inner process (open it by double-clicking the operator after dragging it onto the grid), which usually trains a model, evaluates it (e.g. via X-Validation), and measures its performance. The performance is then passed to the optimization operator by connecting it to the "per" (performance) port on the right side of the optimization operator's inner process.
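
    In the process XML, this wiring appears as connect elements inside the optimization operator's inner process. A minimal fragment, using the operator and port names from the process posted below (Validation, Log, and the "performance" sink), might look like this:

    ```xml
    <!-- Fragment of the inner process of Optimize Parameters (Grid); not a complete process. -->
    <!-- The data entering the optimization operator is handed to the validation. -->
    <connect from_port="input 1" to_op="Validation" to_port="training"/>
    <!-- The averaged performance passes through the Log operator so it can be recorded. -->
    <connect from_op="Validation" from_port="averagable 1" to_op="Log" to_port="through 1"/>
    <!-- A connect without to_op ends at the inner process border: the "performance" port. -->
    <connect from_op="Log" from_port="through 1" to_port="performance"/>
    ```

    The Log operator in between is optional for the optimization itself; it is only there so that each tested parameter value and its performance get recorded.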

    Please have a look at the following process to see all mentioned operators working together.

    Best, Marius
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.002">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.2.002" expanded="true" name="Process">
        <process expanded="true" height="507" width="983">
          <operator activated="true" class="retrieve" compatibility="5.2.002" expanded="true" height="60" name="Retrieve" width="90" x="76" y="24">
            <parameter key="repository_entry" value="//Samples/data/Sonar"/>
          </operator>
          <operator activated="true" class="optimize_parameters_grid" compatibility="5.2.002" expanded="true" height="94" name="Optimize Parameters (Grid)" width="90" x="246" y="30">
            <list key="parameters">
              <parameter key="SVM.C" value="[0.001;1;10;logarithmic]"/>
            </list>
            <process expanded="true" height="507" width="983">
              <operator activated="true" class="x_validation" compatibility="5.2.002" expanded="true" height="112" name="Validation" width="90" x="112" y="30">
                <process expanded="true" height="507" width="466">
                  <operator activated="true" class="support_vector_machine" compatibility="5.2.002" expanded="true" height="112" name="SVM" width="90" x="112" y="30">
                    <parameter key="C" value="1.0"/>
                  </operator>
                  <connect from_port="training" to_op="SVM" to_port="training set"/>
                  <connect from_op="SVM" from_port="model" to_port="model"/>
                  <portSpacing port="source_training" spacing="0"/>
                  <portSpacing port="sink_model" spacing="0"/>
                  <portSpacing port="sink_through 1" spacing="0"/>
                </process>
                <process expanded="true" height="507" width="466">
                  <operator activated="true" class="apply_model" compatibility="5.2.002" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
                    <list key="application_parameters"/>
                  </operator>
                  <operator activated="true" class="performance" compatibility="5.2.002" expanded="true" height="76" name="Performance" width="90" x="246" y="30"/>
                  <connect from_port="model" to_op="Apply Model" to_port="model"/>
                  <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
                  <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
                  <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
                  <portSpacing port="source_model" spacing="0"/>
                  <portSpacing port="source_test set" spacing="0"/>
                  <portSpacing port="source_through 1" spacing="0"/>
                  <portSpacing port="sink_averagable 1" spacing="0"/>
                  <portSpacing port="sink_averagable 2" spacing="0"/>
                </process>
              </operator>
              <operator activated="true" class="log" compatibility="5.2.002" expanded="true" height="76" name="Log" width="90" x="380" y="30">
                <list key="log">
                  <parameter key="C" value="operator.SVM.parameter.C"/>
                  <parameter key="performance" value="operator.Validation.value.performance"/>
                  <parameter key="stdev" value="operator.Validation.value.deviation"/>
                </list>
              </operator>
              <connect from_port="input 1" to_op="Validation" to_port="training"/>
              <connect from_op="Validation" from_port="averagable 1" to_op="Log" to_port="through 1"/>
              <connect from_op="Log" from_port="through 1" to_port="performance"/>
              <portSpacing port="source_input 1" spacing="0"/>
              <portSpacing port="source_input 2" spacing="0"/>
              <portSpacing port="sink_performance" spacing="0"/>
              <portSpacing port="sink_result 1" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" breakpoints="after" class="log_to_data" compatibility="5.2.002" expanded="true" height="94" name="Log to Data" width="90" x="447" y="30">
            <parameter key="log_name" value="Log"/>
          </operator>
          <operator activated="true" class="write_excel" compatibility="5.2.002" expanded="true" height="76" name="Write Excel" width="90" x="648" y="30">
            <parameter key="excel_file" value="C:\Users\mhelf\Desktop\test.xls"/>
          </operator>
          <connect from_op="Retrieve" from_port="output" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
          <connect from_op="Optimize Parameters (Grid)" from_port="performance" to_op="Log to Data" to_port="through 1"/>
          <connect from_op="Log to Data" from_port="exampleSet" to_op="Write Excel" to_port="input"/>
          <connect from_op="Write Excel" from_port="through" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
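
    One detail of the process above is worth noting: the Log to Data operator carries the attribute breakpoints="after", so the run pauses once that operator has finished and only continues (and writes the Excel file) after execution is resumed. If an uninterrupted run is preferred, a sketch of the same operator without the breakpoint, with all other attributes unchanged, would be:

    ```xml
    <!-- Same Log to Data operator as above, but without breakpoints="after", -->
    <!-- so the process should run through to Write Excel without pausing. -->
    <operator activated="true" class="log_to_data" compatibility="5.2.002" expanded="true" height="94" name="Log to Data" width="90" x="447" y="30">
      <parameter key="log_name" value="Log"/>
    </operator>
    ```

    Keeping the breakpoint is still useful when the logged table should be inspected before it is exported.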
  • blueearth Member Posts: 42 Contributor II
    Thanks for your complete answer. I tried the Log to Data operator on clustering and it worked, but I have a problem: the exported file does not give me the range, statistics, etc. Some columns are missing in the exported file, and the Excel file is not usable.
    Is there anything that needs to be changed?
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Can you please post your process as described here? -> http://rapid-i.com/rapidforum/index.php/topic,4782.0.html
    Then I can have a look at your setup to see what may have to be changed.

    Best, Marius
  • blueearth Member Posts: 42 Contributor II
    Before the code, I have another question. I copied the code you gave me, and it gave me different C parameters with their performances. What I meant was: is there any way for the operators to choose the best parameters and transfer them to the modeling operator, so that I get the best performance directly without running two processes (one for choosing the parameters and another one to get the result)?
    And thanks, I solved the clustering problem. Now I have another issue.


    Here is the code of my decision tree validation. I don't know what I should do to get the accuracy performance. I tried all of the performance values, but none of them is the accuracy.
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.001">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="5.2.001" expanded="true" name="Process">
       <process expanded="true" height="389" width="701">
         <operator activated="true" class="retrieve" compatibility="5.2.001" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
           <parameter key="repository_entry" value="//zebra/Results/Attribute Weighting/Chi Squared"/>
         </operator>
         <operator activated="true" class="x_validation" compatibility="5.2.001" expanded="true" height="112" name="DT Gain Ratio" width="90" x="179" y="75">
           <parameter key="use_local_random_seed" value="true"/>
           <process expanded="true" height="389" width="263">
             <operator activated="true" class="parallel:decision_tree_parallel" compatibility="5.1.000" expanded="true" height="76" name="Decision Tree" width="90" x="86" y="30">
               <parameter key="criterion" value="accuracy"/>
               <parameter key="minimal_gain" value="Infinity"/>
               <parameter key="number_of_threads" value="2"/>
             </operator>
             <connect from_port="training" to_op="Decision Tree" to_port="training set"/>
             <connect from_op="Decision Tree" from_port="model" to_port="model"/>
             <portSpacing port="source_training" spacing="0"/>
             <portSpacing port="sink_model" spacing="0"/>
             <portSpacing port="sink_through 1" spacing="0"/>
           </process>
           <process expanded="true" height="389" width="263">
             <operator activated="true" class="apply_model" compatibility="5.2.001" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
               <list key="application_parameters"/>
             </operator>
             <operator activated="true" class="performance" compatibility="5.2.001" expanded="true" height="76" name="Performance" width="90" x="154" y="30"/>
             <connect from_port="model" to_op="Apply Model" to_port="model"/>
             <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
             <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
             <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
             <portSpacing port="source_model" spacing="0"/>
             <portSpacing port="source_test set" spacing="0"/>
             <portSpacing port="source_through 1" spacing="0"/>
             <portSpacing port="sink_averagable 1" spacing="0"/>
             <portSpacing port="sink_averagable 2" spacing="0"/>
           </process>
         </operator>
         <operator activated="true" class="log" compatibility="5.2.001" expanded="true" height="76" name="Log" width="90" x="313" y="75">
           <list key="log">
             <parameter key="Gain ratio" value="operator.DT Gain Ratio.value.performance"/>
             <parameter key="1" value="operator.DT Gain Ratio.value.performance1"/>
             <parameter key="f" value="operator.DT Gain Ratio.value.performance2"/>
             <parameter key="ff" value="operator.DT Gain Ratio.value.performance3"/>
             <parameter key="ffff" value="operator.DT Gain Ratio.value.performance"/>
           </list>
         </operator>
         <operator activated="true" breakpoints="after" class="log_to_data" compatibility="5.2.001" expanded="true" height="94" name="Log to Data" width="90" x="447" y="30">
           <parameter key="log_name" value="Log"/>
         </operator>
         <operator activated="true" class="write_excel" compatibility="5.2.001" expanded="true" height="76" name="Write Excel" width="90" x="581" y="30">
           <parameter key="excel_file" value="C:\Users\Amir Hossein\Documents\test.xls"/>
         </operator>
         <connect from_op="Retrieve" from_port="output" to_op="DT Gain Ratio" to_port="training"/>
         <connect from_op="DT Gain Ratio" from_port="averagable 1" to_op="Log" to_port="through 1"/>
         <connect from_op="Log" from_port="through 1" to_op="Log to Data" to_port="through 1"/>
         <connect from_op="Log to Data" from_port="exampleSet" to_op="Write Excel" to_port="input"/>
         <connect from_op="Write Excel" from_port="through" to_port="result 1"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
       </process>
     </operator>
    </process>

    Why does the process pause itself when I run the optimization operator, so that I have to press play again for it to continue and write the Excel file?

    I'm sorry, I'm a new user of RapidMiner, so I'm not an expert in its operators.
    Thanks for your help.