"Cost and Gamma SVM"

DocMusherDocMusher Member Posts: 243   Unicorn
edited June 8 in Help
Hi,
What is the best and easiest way to make a cost vs. gamma chart (in color)?
Cheers
Sven

Answers

  • mschmitzmschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,049  RM Data Scientist
    Hi Sven,

    you can use a Log operator to log C, gamma, and performance. Then you can use the standard charts on the log (in the Results view).

    Attached is a process on Iris. I also use this Optimize process as a building block.

    Cheers,
    Martin

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.4.000">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="6.4.000" expanded="true" name="Process">
       <process expanded="true">
         <operator activated="true" class="retrieve" compatibility="6.4.000" expanded="true" height="60" name="Retrieve Iris" width="90" x="112" y="120">
           <parameter key="repository_entry" value="//Samples/data/Iris"/>
         </operator>
         <operator activated="true" class="optimize_parameters_grid" compatibility="6.4.000" expanded="true" height="94" name="Optimize Parameters (Grid)" width="90" x="246" y="120">
           <list key="parameters">
             <parameter key="SVM.C" value="[1e-3;10;4;logarithmic]"/>
             <parameter key="SVM.gamma" value="[1e-3;10;4;logarithmic]"/>
           </list>
           <process expanded="true">
             <operator activated="true" class="x_validation" compatibility="5.1.002" expanded="true" height="112" name="Validation" width="90" x="45" y="30">
               <parameter key="sampling_type" value="2"/>
               <process expanded="true">
                 <operator activated="true" class="support_vector_machine_libsvm" compatibility="6.4.000" expanded="true" height="76" name="SVM" width="90" x="112" y="30">
                   <parameter key="gamma" value="10.0"/>
                   <parameter key="C" value="10.0"/>
                   <list key="class_weights"/>
                 </operator>
                 <connect from_port="training" to_op="SVM" to_port="training set"/>
                 <connect from_op="SVM" from_port="model" to_port="model"/>
                 <portSpacing port="source_training" spacing="0"/>
                 <portSpacing port="sink_model" spacing="0"/>
                 <portSpacing port="sink_through 1" spacing="0"/>
               </process>
               <process expanded="true">
                 <operator activated="true" class="apply_model" compatibility="6.4.000" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
                   <list key="application_parameters"/>
                 </operator>
                 <operator activated="true" class="performance" compatibility="6.4.000" expanded="true" height="76" name="Performance" width="90" x="179" y="30"/>
                 <connect from_port="model" to_op="Apply Model" to_port="model"/>
                 <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
                 <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
                 <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
                 <portSpacing port="source_model" spacing="0"/>
                 <portSpacing port="source_test set" spacing="0"/>
                 <portSpacing port="source_through 1" spacing="0"/>
                 <portSpacing port="sink_averagable 1" spacing="0"/>
                 <portSpacing port="sink_averagable 2" spacing="0"/>
               </process>
               <description align="center" color="transparent" colored="false" width="126">A cross-validation evaluating an SVM model.</description>
             </operator>
             <operator activated="true" class="log" compatibility="6.4.000" expanded="true" height="76" name="Log" width="90" x="179" y="75">
               <list key="log">
                 <parameter key="C" value="operator.SVM.parameter.C"/>
                 <parameter key="gamma" value="operator.SVM.parameter.gamma"/>
                 <parameter key="Performance" value="operator.Validation.value.performance"/>
               </list>
             </operator>
             <connect from_port="input 1" to_op="Validation" to_port="training"/>
             <connect from_op="Validation" from_port="averagable 1" to_op="Log" to_port="through 1"/>
             <connect from_op="Log" from_port="through 1" to_port="performance"/>
             <portSpacing port="source_input 1" spacing="0"/>
             <portSpacing port="source_input 2" spacing="0"/>
             <portSpacing port="sink_performance" spacing="0"/>
             <portSpacing port="sink_result 1" spacing="0"/>
           </process>
           <description align="center" color="transparent" colored="false" width="126">Optimize C and Gamma of a radial SVM using optimize by Grid</description>
         </operator>
         <connect from_op="Retrieve Iris" from_port="output" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
         <connect from_op="Optimize Parameters (Grid)" from_port="performance" to_port="result 1"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
       </process>
     </operator>
    </process>
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
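
For readers who want to see the shape of the logged table outside RapidMiner, here is a minimal Python sketch. It assumes that the grid specification `[1e-3;10;4;logarithmic]` yields five log-spaced values per parameter (four steps between 1e-3 and 10), which matches the five bins mentioned later in this thread. The function names are hypothetical, and the Performance column is filled with a placeholder value, not real cross-validation results.

```python
import numpy as np
import pandas as pd


def log_grid(low, high, steps):
    """Return steps + 1 logarithmically spaced values from low to high."""
    return np.logspace(np.log10(low), np.log10(high), steps + 1)


def placeholder_log_table(c_values, gamma_values):
    """Build a table shaped like the Log output: C, gamma, Performance.

    Performance is a placeholder (0.0), not a measured accuracy, so the
    example runs without training any model.
    """
    rows = [
        {"C": c, "gamma": g, "Performance": 0.0}
        for c in c_values
        for g in gamma_values
    ]
    return pd.DataFrame(rows)


c_values = log_grid(1e-3, 10, 4)
gamma_values = log_grid(1e-3, 10, 4)
log_table = placeholder_log_table(c_values, gamma_values)

# One row per (C, gamma) combination; pivoting gives a C-by-gamma matrix.
matrix = log_table.pivot(index="C", columns="gamma", values="Performance")
print(matrix.shape)
```

Each cell of `matrix` corresponds to one grid point, which is the layout a colored C vs. gamma chart needs.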
  • DocMusherDocMusher Member Posts: 243   Unicorn
    Dear Martin,
    as usual, a clear and perfect solution.
    Thanks, Have a nice weekend
    Cheers
    Sven
  • DocMusherDocMusher Member Posts: 243   Unicorn
    Hi,
    I suppose the following chart is impossible to generate in RM?

    http://scikit-learn.org/0.10/_images/plot_svm_parameters_selection_1.png
    Cheers
    Sven
  • mschmitzmschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,049  RM Data Scientist
    Hi,

    a similar chart is possible. Marius did this once and told me how to do it. Sadly I forgot it :/

    Cheers,
    Martin
  • mschmitzmschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,049  RM Data Scientist
    Hi Sven!
    If you need this plot for your paper, I could program a quick Python snippet for it, even though it might feel like code club: https://www.youtube.com/watch?v=a6FhAQsjRuk
  • DocMusherDocMusher Member Posts: 243   Unicorn
    Dear Martin,
    That would be very kind of you. I looked at https://plot.ly/python/heatmaps/ but had to admit that I am only an anesthesiologist, which means a human being with less brain volume than real data scientists!
    Cheers
    Sven 
  • mschmitzmschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,049  RM Data Scientist
    Hi,

    attached is a process doing it on Iris. I do not know how to get rid of the blue background color... I might need to think about it. It is a bit strange for me to write this kind of code again.

    Be sure to have the Python extension and matplotlib installed. I personally use Anaconda, which is available for Windows and Mac.
    Please be careful running this on a server, because it opens a dialog and the process ends only when you close it.

    Cheers,
    Martin

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.4.000">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="6.4.000" expanded="true" name="Process">
       <process expanded="true">
         <operator activated="true" class="retrieve" compatibility="6.4.000" expanded="true" height="60" name="Retrieve Iris" width="90" x="112" y="30">
           <parameter key="repository_entry" value="//Samples/data/Iris"/>
         </operator>
         <operator activated="true" class="optimize_parameters_grid" compatibility="6.4.000" expanded="true" height="112" name="Optimize Parameters (Grid)" width="90" x="246" y="30">
           <list key="parameters">
             <parameter key="SVM.C" value="[1e-3;10;4;logarithmic]"/>
             <parameter key="SVM.gamma" value="[1e-3;10;4;logarithmic]"/>
           </list>
           <process expanded="true">
             <operator activated="true" class="x_validation" compatibility="5.1.002" expanded="true" height="112" name="Validation" width="90" x="45" y="30">
               <parameter key="sampling_type" value="2"/>
               <process expanded="true">
                 <operator activated="true" class="support_vector_machine_libsvm" compatibility="6.4.000" expanded="true" height="76" name="SVM" width="90" x="112" y="30">
                   <parameter key="gamma" value="10.0"/>
                   <parameter key="C" value="10.0"/>
                   <list key="class_weights"/>
                 </operator>
                 <connect from_port="training" to_op="SVM" to_port="training set"/>
                 <connect from_op="SVM" from_port="model" to_port="model"/>
                 <portSpacing port="source_training" spacing="0"/>
                 <portSpacing port="sink_model" spacing="0"/>
                 <portSpacing port="sink_through 1" spacing="0"/>
               </process>
               <process expanded="true">
                 <operator activated="true" class="apply_model" compatibility="6.4.000" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
                   <list key="application_parameters"/>
                 </operator>
                 <operator activated="true" class="performance" compatibility="6.4.000" expanded="true" height="76" name="Performance" width="90" x="179" y="30"/>
                 <connect from_port="model" to_op="Apply Model" to_port="model"/>
                 <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
                 <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
                 <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
                 <portSpacing port="source_model" spacing="0"/>
                 <portSpacing port="source_test set" spacing="0"/>
                 <portSpacing port="source_through 1" spacing="0"/>
                 <portSpacing port="sink_averagable 1" spacing="0"/>
                 <portSpacing port="sink_averagable 2" spacing="0"/>
               </process>
               <description align="center" color="transparent" colored="false" width="126">A cross-validation evaluating an SVM model.</description>
             </operator>
             <operator activated="true" class="log" compatibility="6.4.000" expanded="true" height="76" name="Log" width="90" x="179" y="75">
               <list key="log">
                 <parameter key="C" value="operator.SVM.parameter.C"/>
                 <parameter key="gamma" value="operator.SVM.parameter.gamma"/>
                 <parameter key="Performance" value="operator.Validation.value.performance"/>
               </list>
             </operator>
             <connect from_port="input 1" to_op="Validation" to_port="training"/>
             <connect from_op="Validation" from_port="averagable 1" to_op="Log" to_port="through 1"/>
             <connect from_op="Log" from_port="through 1" to_port="performance"/>
             <portSpacing port="source_input 1" spacing="0"/>
             <portSpacing port="source_input 2" spacing="0"/>
             <portSpacing port="sink_performance" spacing="0"/>
             <portSpacing port="sink_result 1" spacing="0"/>
             <portSpacing port="sink_result 2" spacing="0"/>
           </process>
           <description align="center" color="transparent" colored="false" width="126">Optimize C and Gamma of a radial SVM using optimize by Grid</description>
         </operator>
         <operator activated="true" class="log_to_data" compatibility="6.4.000" expanded="true" height="94" name="Log to Data" width="90" x="380" y="30">
           <parameter key="log_name" value="Log"/>
         </operator>
         <operator activated="true" class="python_scripting:execute_python" compatibility="6.4.000" expanded="true" height="76" name="Execute Python" width="90" x="514" y="30">
           <parameter key="script" value="import pandas as pd&#10;import matplotlib.pyplot as plt&#10;import numpy as np&#10;&#10;def rm_main(data):&#10;&#10;    # y axis: log10(C), x axis: log10(gamma), color: performance&#10;    y =  np.log10(data.iloc[:][&quot;C&quot;])&#10;    x = np.log10(data.iloc[:][&quot;gamma&quot;])&#10;    z = data.iloc[:][&quot;Performance&quot;]&#10;&#10;    plt.title(&quot;Radial SVM Performance&quot;,fontsize=25)&#10;    print(x, y, z)&#10;    plt.hist2d(x,y,weights=z,vmin=z.min(),vmax=z.max())&#10;    plt.colorbar()&#10;    plt.xlabel(&quot;log10(gamma)&quot;)&#10;    plt.ylabel(&quot;log10(C)&quot;)&#10;&#10;    plt.show()&#10;"/>
         </operator>
         <connect from_op="Retrieve Iris" from_port="output" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
         <connect from_op="Optimize Parameters (Grid)" from_port="result 1" to_op="Log to Data" to_port="through 1"/>
         <connect from_op="Log to Data" from_port="exampleSet" to_op="Execute Python" to_port="input 1"/>
         <connect from_op="Execute Python" from_port="output 1" to_port="result 1"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
       </process>
     </operator>
    </process>
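
Because `plt.show()` blocks until the window is closed, one way to avoid the problem on a server is to write the figure to a file instead. The sketch below is an assumption about how that could look, not part of the attached process: it selects matplotlib's non-interactive `Agg` backend and calls `savefig`. The function name `save_heatmap`, the output file name, and the demo data (made-up performance values) are hypothetical; inside Execute Python one would call such a function from `rm_main`.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no window is opened
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def save_heatmap(data, path, bins=5):
    """Draw the C/gamma heatmap and write it to `path` instead of showing it."""
    x = np.log10(data["gamma"])
    y = np.log10(data["C"])
    z = data["Performance"]

    fig, ax = plt.subplots()
    _, _, _, image = ax.hist2d(x, y, bins=bins, weights=z)
    fig.colorbar(image, ax=ax)
    ax.set_xlabel("log10(gamma)")
    ax.set_ylabel("log10(C)")
    ax.set_title("Radial SVM Performance")
    fig.savefig(path)
    plt.close(fig)  # release the figure so nothing stays open
    return path


# Placeholder data with the same columns as the log (values are made up).
grid = np.logspace(-3, 1, 5)
demo = pd.DataFrame(
    [(c, g, 0.5) for c in grid for g in grid],
    columns=["C", "gamma", "Performance"],
)
out_path = save_heatmap(demo, os.path.join(tempfile.gettempdir(), "svm_heatmap.png"))
```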
  • DocMusherDocMusher Member Posts: 243   Unicorn
    Hi,
    Martin,
    Works perfectly with Iris. Now I'll try my own log file.
    You made my day!
    Thanks
    Sven :)
  • JEdwardJEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 563   Unicorn
    Nice!

    You can also save yourself the Python by using Advanced Charts.

    Set the Domain to : C
    Set the Numerical Axis to : gamma
    Set the Color dimension to : Performance

    Click on domain dimension & set it to Logarithmic
    Click on numerical axis & set it to Logarithmic

    This will now give you little coloured dots, but really bigger dots would be nicer.
    Click on Numerical Axis -> Series: gamma & then click on Format -> Configure
    In this setting you can change from circle to square and then update the size of the square to whatever size looks good.
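
For comparison, the same idea (C on one axis, gamma on the other, both logarithmic, performance as color, square markers) can be sketched in matplotlib. This is only a rough analogue of the Advanced Charts steps above, not a reproduction of that chart. The function name, marker size, and demo values are assumptions, and the `Agg` backend is used so that no window opens.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen; switch backend to view interactively
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def scatter_chart(data, marker_size=400):
    """Scatter C against gamma on log axes, colored by Performance."""
    fig, ax = plt.subplots()
    points = ax.scatter(
        data["C"],
        data["gamma"],
        c=data["Performance"],
        marker="s",      # squares instead of circles
        s=marker_size,   # larger markers so the colors are easy to read
    )
    ax.set_xscale("log")
    ax.set_yscale("log")
    ax.set_xlabel("C")
    ax.set_ylabel("gamma")
    colorbar = fig.colorbar(points, ax=ax)
    colorbar.set_label("Performance")
    return fig, ax


# Placeholder data; the performance values are made up for illustration.
grid = np.logspace(-3, 1, 5)
demo = pd.DataFrame(
    [(c, g, 0.5) for c in grid for g in grid],
    columns=["C", "gamma", "Performance"],
)
fig, ax = scatter_chart(demo)
```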
  • DocMusherDocMusher Member Posts: 243   Unicorn
    Hi,
    I'll have a look. Nice to have alternatives!
    Have a nice day.
    Sven
  • DocMusherDocMusher Member Posts: 243   Unicorn
    Hi,
    The background disappears if you define the number of bins. For Iris the number of bins is 5:
    plt.hist2d(x,y,bins=5,weights=z,vmin=z.min(),vmax=z.max())
    Information from: http://matplotlib.org/api/pyplot_api.html

    Cheers
    Sven
  • mschmitzmschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,049  RM Data Scientist
    So much for the "I am only an anesthesiologist" thing.

    Well done, thanks for sharing. Since you can pass macros to python, you might automate it by counting the number of different C/gamma values :)
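
As an alternative to passing macros, the number of bins can also be counted inside the script itself. The sketch below is one possible way to do that with pandas, assuming the log table has the columns `C`, `gamma`, and `Performance`. The helper name `count_bins` and the demo data (made-up performance values) are hypothetical.

```python
import numpy as np
import pandas as pd


def count_bins(data):
    """Return (xbins, ybins): the number of distinct gamma and C values."""
    xbins = data["gamma"].nunique()
    ybins = data["C"].nunique()
    return xbins, ybins


# Placeholder grid with 5 values for C and 3 for gamma (values are made up).
grid_c = np.logspace(-3, 1, 5)
grid_gamma = np.logspace(-3, 1, 3)
demo = pd.DataFrame(
    [(c, g, 0.5) for c in grid_c for g in grid_gamma],
    columns=["C", "gamma", "Performance"],
)

xbins, ybins = count_bins(demo)

# One bin per grid value, so every cell holds exactly one performance value.
hist, xedges, yedges = np.histogram2d(
    np.log10(np.asarray(demo["gamma"])),
    np.log10(np.asarray(demo["C"])),
    bins=[xbins, ybins],
    weights=np.asarray(demo["Performance"]),
)
```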
  • mschmitzmschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,049  RM Data Scientist
    Hi,

    the process attached extracts the xbins and ybins automatically from the process. Might be useful :)

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.4.000">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="6.4.000" expanded="true" name="Process">
       <process expanded="true">
         <operator activated="true" class="retrieve" compatibility="6.4.000" expanded="true" height="60" name="Retrieve Iris" width="90" x="112" y="30">
           <parameter key="repository_entry" value="//Samples/data/Iris"/>
         </operator>
         <operator activated="true" class="optimize_parameters_grid" compatibility="6.4.000" expanded="true" height="112" name="Optimize Parameters (Grid)" width="90" x="246" y="30">
           <list key="parameters">
             <parameter key="SVM.C" value="[1e-3;10;20;logarithmic]"/>
             <parameter key="SVM.gamma" value="[1e-3;10;20;logarithmic]"/>
           </list>
           <process expanded="true">
             <operator activated="true" class="x_validation" compatibility="5.1.002" expanded="true" height="112" name="Validation" width="90" x="45" y="30">
               <parameter key="sampling_type" value="2"/>
               <process expanded="true">
                 <operator activated="true" class="support_vector_machine_libsvm" compatibility="6.4.000" expanded="true" height="76" name="SVM" width="90" x="112" y="30">
                   <parameter key="gamma" value="10.0"/>
                   <parameter key="C" value="10.0"/>
                   <list key="class_weights"/>
                 </operator>
                 <connect from_port="training" to_op="SVM" to_port="training set"/>
                 <connect from_op="SVM" from_port="model" to_port="model"/>
                 <portSpacing port="source_training" spacing="0"/>
                 <portSpacing port="sink_model" spacing="0"/>
                 <portSpacing port="sink_through 1" spacing="0"/>
               </process>
               <process expanded="true">
                 <operator activated="true" class="apply_model" compatibility="6.4.000" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
                   <list key="application_parameters"/>
                 </operator>
                 <operator activated="true" class="performance" compatibility="6.4.000" expanded="true" height="76" name="Performance" width="90" x="179" y="30"/>
                 <connect from_port="model" to_op="Apply Model" to_port="model"/>
                 <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
                 <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
                 <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
                 <portSpacing port="source_model" spacing="0"/>
                 <portSpacing port="source_test set" spacing="0"/>
                 <portSpacing port="source_through 1" spacing="0"/>
                 <portSpacing port="sink_averagable 1" spacing="0"/>
                 <portSpacing port="sink_averagable 2" spacing="0"/>
               </process>
               <description align="center" color="transparent" colored="false" width="126">A cross-validation evaluating an SVM model.</description>
             </operator>
             <operator activated="true" class="log" compatibility="6.4.000" expanded="true" height="76" name="Log" width="90" x="179" y="75">
               <list key="log">
                 <parameter key="C" value="operator.SVM.parameter.C"/>
                 <parameter key="gamma" value="operator.SVM.parameter.gamma"/>
                 <parameter key="Performance" value="operator.Validation.value.performance"/>
               </list>
             </operator>
             <connect from_port="input 1" to_op="Validation" to_port="training"/>
             <connect from_op="Validation" from_port="averagable 1" to_op="Log" to_port="through 1"/>
             <connect from_op="Log" from_port="through 1" to_port="performance"/>
             <portSpacing port="source_input 1" spacing="0"/>
             <portSpacing port="source_input 2" spacing="0"/>
             <portSpacing port="sink_performance" spacing="0"/>
             <portSpacing port="sink_result 1" spacing="0"/>
             <portSpacing port="sink_result 2" spacing="0"/>
           </process>
           <description align="center" color="transparent" colored="false" width="126">Optimize C and Gamma of a radial SVM using optimize by Grid</description>
         </operator>
         <operator activated="true" class="log_to_data" compatibility="6.4.000" expanded="true" height="94" name="Log to Data" width="90" x="380" y="30">
           <parameter key="log_name" value="Log"/>
         </operator>
         <operator activated="true" class="multiply" compatibility="6.4.000" expanded="true" height="112" name="Multiply" width="90" x="514" y="30"/>
         <operator activated="true" class="aggregate" compatibility="6.4.000" expanded="true" height="76" name="Aggregate (2)" width="90" x="648" y="165">
           <list key="aggregation_attributes">
             <parameter key="gamma" value="average"/>
           </list>
           <parameter key="group_by_attributes" value="gamma"/>
         </operator>
         <operator activated="true" class="extract_macro" compatibility="6.4.000" expanded="true" height="60" name="Extract Macro (2)" width="90" x="782" y="165">
           <parameter key="macro" value="xbins"/>
           <list key="additional_macros"/>
         </operator>
         <operator activated="true" class="aggregate" compatibility="6.4.000" expanded="true" height="76" name="Aggregate" width="90" x="648" y="75">
           <list key="aggregation_attributes">
             <parameter key="C" value="average"/>
           </list>
           <parameter key="group_by_attributes" value="C"/>
         </operator>
         <operator activated="true" class="extract_macro" compatibility="6.4.000" expanded="true" height="60" name="Extract Macro" width="90" x="782" y="75">
           <parameter key="macro" value="ybins"/>
           <list key="additional_macros"/>
         </operator>
         <operator activated="true" class="python_scripting:execute_python" compatibility="6.4.000" expanded="true" height="76" name="Execute Python" width="90" x="916" y="30">
           <parameter key="script" value="import pandas as pd&#10;import matplotlib.pyplot as plt&#10;import numpy as np&#10;&#10;def rm_main(data):&#10;&#10;    # y axis: log10(C), x axis: log10(gamma), color: performance&#10;    y =  np.log10(data.iloc[:][&quot;C&quot;])&#10;    x = np.log10(data.iloc[:][&quot;gamma&quot;])&#10;    z = data.iloc[:][&quot;Performance&quot;]&#10;    xbins = %{xbins} # From process&#10;    ybins = %{ybins} # From process&#10;&#10;    plt.title(&quot;Radial SVM Performance&quot;,fontsize=25)&#10;    print(x, y, z)&#10;    hist, xbins, ybins = np.histogram2d(x,y,weights=z,bins=[xbins,ybins])&#10;&#10;    #choose either none or gaussian for interpolation&#10;    plt.imshow(hist.T,interpolation=&quot;gaussian&quot;, origin='lower')&#10;    plt.colorbar()&#10;    plt.xlabel(&quot;log10(gamma)&quot;)&#10;    plt.ylabel(&quot;log10(C)&quot;)&#10;    plt.show()"/>
         </operator>
         <connect from_op="Retrieve Iris" from_port="output" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
         <connect from_op="Optimize Parameters (Grid)" from_port="result 1" to_op="Log to Data" to_port="through 1"/>
         <connect from_op="Log to Data" from_port="exampleSet" to_op="Multiply" to_port="input"/>
         <connect from_op="Multiply" from_port="output 1" to_op="Execute Python" to_port="input 1"/>
         <connect from_op="Multiply" from_port="output 2" to_op="Aggregate" to_port="example set input"/>
         <connect from_op="Multiply" from_port="output 3" to_op="Aggregate (2)" to_port="example set input"/>
         <connect from_op="Aggregate (2)" from_port="example set output" to_op="Extract Macro (2)" to_port="example set"/>
         <connect from_op="Aggregate" from_port="example set output" to_op="Extract Macro" to_port="example set"/>
         <connect from_op="Execute Python" from_port="output 1" to_port="result 1"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
       </process>
     </operator>
    </process>
  • DocMusherDocMusher Member Posts: 243   Unicorn
    Hi,
    Thanks Martin, it worked excellently for me!
    Cheers

    Sven
  • DocMusherDocMusher Member Posts: 243   Unicorn
    Hi,
    Could the axes in this chart show the bin numbers rather than log10(C) and log10(gamma)?
    I provide a link to the two charts I get from the two processes Martin provided.
    Although I prefer the smoothed colors, I think the axis values are not the log10(C) and log10(gamma) values.
    Any suggestions?
    Cheers
    Sven
    https://www.dropbox.com/s/knrrummnwztsxlu/SVMCgammacompare.docx?dl=0
  • mschmitzmschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,049  RM Data Scientist
    Oh,

    you are completely right. Sorry for that. The Python script below uses the correct min and max values for the axes. For a scientific publication I would not use the gaussian interpolation, because it adds information that might not be there.

    Cheers,
    Martin

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.4.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="6.4.000" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="6.4.000" expanded="true" height="60" name="Retrieve Iris" width="90" x="112" y="30">
            <parameter key="repository_entry" value="//Samples/data/Iris"/>
          </operator>
          <operator activated="true" class="optimize_parameters_grid" compatibility="6.4.000" expanded="true" height="112" name="Optimize Parameters (Grid)" width="90" x="246" y="30">
            <list key="parameters">
              <parameter key="SVM.C" value="[1e-3;10;20;logarithmic]"/>
              <parameter key="SVM.gamma" value="[1e-3;10;20;logarithmic]"/>
            </list>
            <process expanded="true">
              <operator activated="true" class="x_validation" compatibility="5.1.002" expanded="true" height="112" name="Validation" width="90" x="45" y="30">
                <parameter key="sampling_type" value="2"/>
                <process expanded="true">
                  <operator activated="true" class="support_vector_machine_libsvm" compatibility="6.4.000" expanded="true" height="76" name="SVM" width="90" x="112" y="30">
                    <parameter key="gamma" value="10.0"/>
                    <parameter key="C" value="10.0"/>
                    <list key="class_weights"/>
                  </operator>
                  <connect from_port="training" to_op="SVM" to_port="training set"/>
                  <connect from_op="SVM" from_port="model" to_port="model"/>
                  <portSpacing port="source_training" spacing="0"/>
                  <portSpacing port="sink_model" spacing="0"/>
                  <portSpacing port="sink_through 1" spacing="0"/>
                </process>
                <process expanded="true">
                  <operator activated="true" class="apply_model" compatibility="6.4.000" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
                    <list key="application_parameters"/>
                  </operator>
                  <operator activated="true" class="performance" compatibility="6.4.000" expanded="true" height="76" name="Performance" width="90" x="179" y="30"/>
                  <connect from_port="model" to_op="Apply Model" to_port="model"/>
                  <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
                  <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
                  <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
                  <portSpacing port="source_model" spacing="0"/>
                  <portSpacing port="source_test set" spacing="0"/>
                  <portSpacing port="source_through 1" spacing="0"/>
                  <portSpacing port="sink_averagable 1" spacing="0"/>
                  <portSpacing port="sink_averagable 2" spacing="0"/>
                </process>
                <description align="center" color="transparent" colored="false" width="126">A cross-validation evaluating an SVM model.</description>
              </operator>
              <operator activated="true" class="log" compatibility="6.4.000" expanded="true" height="76" name="Log" width="90" x="179" y="75">
                <list key="log">
                  <parameter key="C" value="operator.SVM.parameter.C"/>
                  <parameter key="gamma" value="operator.SVM.parameter.gamma"/>
                  <parameter key="Performance" value="operator.Validation.value.performance"/>
                </list>
              </operator>
              <connect from_port="input 1" to_op="Validation" to_port="training"/>
              <connect from_op="Validation" from_port="averagable 1" to_op="Log" to_port="through 1"/>
              <connect from_op="Log" from_port="through 1" to_port="performance"/>
              <portSpacing port="source_input 1" spacing="0"/>
              <portSpacing port="source_input 2" spacing="0"/>
              <portSpacing port="sink_performance" spacing="0"/>
              <portSpacing port="sink_result 1" spacing="0"/>
              <portSpacing port="sink_result 2" spacing="0"/>
            </process>
            <description align="center" color="transparent" colored="false" width="126">Optimize C and Gamma of a radial SVM using optimize by Grid</description>
          </operator>
          <operator activated="true" class="log_to_data" compatibility="6.4.000" expanded="true" height="94" name="Log to Data" width="90" x="380" y="30">
            <parameter key="log_name" value="Log"/>
          </operator>
          <operator activated="true" class="multiply" compatibility="6.4.000" expanded="true" height="112" name="Multiply" width="90" x="514" y="30"/>
          <operator activated="true" class="aggregate" compatibility="6.4.000" expanded="true" height="76" name="Aggregate (2)" width="90" x="648" y="165">
            <list key="aggregation_attributes">
              <parameter key="gamma" value="average"/>
            </list>
            <parameter key="group_by_attributes" value="gamma"/>
          </operator>
          <operator activated="true" class="extract_macro" compatibility="6.4.000" expanded="true" height="60" name="Extract Macro (2)" width="90" x="782" y="165">
            <parameter key="macro" value="xbins"/>
            <list key="additional_macros"/>
          </operator>
          <operator activated="true" class="aggregate" compatibility="6.4.000" expanded="true" height="76" name="Aggregate" width="90" x="648" y="75">
            <list key="aggregation_attributes">
              <parameter key="C" value="average"/>
            </list>
            <parameter key="group_by_attributes" value="C"/>
          </operator>
          <operator activated="true" class="extract_macro" compatibility="6.4.000" expanded="true" height="60" name="Extract Macro" width="90" x="782" y="75">
            <parameter key="macro" value="ybins"/>
            <list key="additional_macros"/>
          </operator>
          <operator activated="true" class="python_scripting:execute_python" compatibility="6.4.000" expanded="true" height="76" name="Execute Python" width="90" x="916" y="30">
            <parameter key="script" value="import pandas as pd&#10;import matplotlib.pyplot as plt&#10;import numpy as np&#10;&#10;def rm_main(data):&#10;&#10;    # y axis: log10(C), x axis: log10(gamma), color: performance&#10;    y =  np.log10(data.iloc[:][&quot;C&quot;])&#10;    x = np.log10(data.iloc[:][&quot;gamma&quot;])&#10;    z = data.iloc[:][&quot;Performance&quot;]&#10;    xbins = %{xbins} # From process&#10;    ybins = %{ybins} # From process&#10;&#10;    plt.title(&quot;Radial SVM Performance&quot;,fontsize=25)&#10;    hist, xbins, ybins = np.histogram2d(x,y,weights=z,bins=[xbins,ybins])&#10;    #choose either none or gaussian for interpolation &#10;    plt.imshow( hist.T,&#10;                interpolation=&quot;gaussian&quot;, &#10;                origin='lower',&#10;                extent=[xbins.min(),xbins.max(),ybins.min(),ybins.max()])&#10;    plt.colorbar()&#10;    plt.xlabel(&quot;log10(gamma)&quot;)&#10;    plt.ylabel(&quot;log10(C)&quot;)&#10;    plt.show()"/>
          </operator>
          <connect from_op="Retrieve Iris" from_port="output" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
          <connect from_op="Optimize Parameters (Grid)" from_port="result 1" to_op="Log to Data" to_port="through 1"/>
          <connect from_op="Log to Data" from_port="exampleSet" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Execute Python" to_port="input 1"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Aggregate" to_port="example set input"/>
          <connect from_op="Multiply" from_port="output 3" to_op="Aggregate (2)" to_port="example set input"/>
          <connect from_op="Aggregate (2)" from_port="example set output" to_op="Extract Macro (2)" to_port="example set"/>
          <connect from_op="Aggregate" from_port="example set output" to_op="Extract Macro" to_port="example set"/>
          <connect from_op="Execute Python" from_port="output 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
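
Some background on the axis issue: without an `extent`, `imshow` labels the axes with pixel (cell) indices, which is why the earlier chart showed bin numbers. The sketch below is a variation for a publication-style plot, not the attached process. It pivots the log table directly, turns interpolation off, and pads the extent by half a grid step so each grid value sits at the center of its cell. It assumes an evenly spaced logarithmic grid with at least two values per parameter. The function names and demo values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen; switch backend to view interactively
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def centered_extent(values):
    """Return (low, high) so evenly spaced `values` sit at cell centers."""
    values = np.sort(np.asarray(values, dtype=float))
    half_step = (values[1] - values[0]) / 2.0
    return values[0] - half_step, values[-1] + half_step


def plot_grid(data):
    """Show Performance on a log10(gamma) by log10(C) grid, no smoothing."""
    logged = pd.DataFrame({
        "log_C": np.log10(data["C"]).round(6),
        "log_gamma": np.log10(data["gamma"]).round(6),
        "Performance": data["Performance"],
    })
    # Rows are log10(C), columns are log10(gamma); pivot sorts both ascending.
    matrix = logged.pivot(index="log_C", columns="log_gamma", values="Performance")
    x_low, x_high = centered_extent(matrix.columns)
    y_low, y_high = centered_extent(matrix.index)

    fig, ax = plt.subplots()
    image = ax.imshow(
        matrix.to_numpy(),
        origin="lower",
        interpolation="none",  # show the measured cells only
        extent=[x_low, x_high, y_low, y_high],
        aspect="auto",
    )
    fig.colorbar(image, ax=ax)
    ax.set_xlabel("log10(gamma)")
    ax.set_ylabel("log10(C)")
    return matrix, ax


# Placeholder data; the performance values are made up for illustration.
grid = np.logspace(-3, 1, 5)
demo = pd.DataFrame(
    [(c, g, 0.5) for c in grid for g in grid],
    columns=["C", "gamma", "Performance"],
)
matrix, ax = plot_grid(demo)
```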