How do I extract the optimal value of a parameter (so I can log it)?
I have a simple process that looks remarkably similar to the one I'm posting below.
Basically, I want to find the optimal cut-off point for each variable and then log its name, its performance AND the cutoff point. I was successful in the first two but not in the last one.
How can I get the optimal parameter from "Optimize Parameters (Grid)" so I can log it?
Process Below:
<?xml version="1.0" encoding="UTF-8"?><process version="7.5.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.5.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="7.5.001" expanded="true" height="68" name="Retrieve Iris" width="90" x="112" y="238">
<parameter key="repository_entry" value="//Samples/data/Iris"/>
</operator>
<operator activated="true" class="filter_examples" compatibility="7.5.001" expanded="true" height="103" name="Filter Examples" width="90" x="313" y="238">
<parameter key="invert_filter" value="true"/>
<list key="filters_list">
<parameter key="filters_entry_key" value="label.equals.Iris-setosa"/>
</list>
</operator>
<operator activated="true" class="concurrency:loop_attributes" compatibility="7.5.001" expanded="true" height="103" name="Loop Attributes" width="90" x="514" y="238">
<process expanded="true">
<operator activated="true" class="select_attributes" compatibility="7.5.001" expanded="true" height="82" name="Select Attributes" width="90" x="112" y="34">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="%{loop_attribute}"/>
</operator>
<operator activated="true" class="optimize_parameters_grid" compatibility="7.5.001" expanded="true" height="103" name="Optimize Parameters (Grid)" width="90" x="380" y="34">
<list key="parameters">
<parameter key="Set Macro.value" value="[0.0;8;8;linear]"/>
</list>
<process expanded="true">
<operator activated="true" class="set_macro" compatibility="7.5.001" expanded="true" height="82" name="Set Macro" width="90" x="179" y="85">
<parameter key="macro" value="cutoff"/>
<parameter key="value" value="5"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="7.5.001" expanded="true" height="82" name="Generate Attributes" width="90" x="380" y="85">
<list key="function_descriptions">
<parameter key="prediction" value="if(eval(%{loop_attribute}) &gt; eval(%{cutoff}),&quot;Iris-virginica&quot;,&quot;Iris-versicolor&quot;)"/>
</list>
</operator>
<operator activated="true" class="set_role" compatibility="7.5.001" expanded="true" height="82" name="Set Role" width="90" x="581" y="85">
<parameter key="attribute_name" value="prediction"/>
<parameter key="target_role" value="prediction"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="performance" compatibility="7.5.001" expanded="true" height="82" name="Performance" width="90" x="782" y="85"/>
<connect from_port="input 1" to_op="Set Macro" to_port="through 1"/>
<connect from_op="Set Macro" from_port="through 1" to_op="Generate Attributes" to_port="example set input"/>
<connect from_op="Generate Attributes" from_port="example set output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="performance"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
</process>
</operator>
<operator activated="true" class="log" compatibility="7.5.001" expanded="true" height="103" name="Log" width="90" x="715" y="34">
<list key="log">
<parameter key="variable" value="operator.Loop Attributes.value.attribute_name"/>
<parameter key="performance" value="operator.Optimize Parameters (Grid).value.performance"/>
<parameter key="cutoff" value="operator.Optimize Parameters (Grid).parameter.parameters"/>
</list>
</operator>
<connect from_port="input 1" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
<connect from_op="Optimize Parameters (Grid)" from_port="performance" to_op="Log" to_port="through 1"/>
<connect from_op="Optimize Parameters (Grid)" from_port="parameter" to_op="Log" to_port="through 2"/>
<connect from_op="Log" from_port="through 1" to_port="output 1"/>
<connect from_op="Log" from_port="through 2" to_port="output 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
<portSpacing port="sink_output 2" spacing="0"/>
<portSpacing port="sink_output 3" spacing="0"/>
</process>
</operator>
<connect from_op="Retrieve Iris" from_port="output" to_op="Filter Examples" to_port="example set input"/>
<connect from_op="Filter Examples" from_port="example set output" to_op="Loop Attributes" to_port="input 1"/>
<connect from_op="Loop Attributes" from_port="output 1" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
Thanks in advance for any help,
CarlosQ
Best Answer
sgenzer (Community Manager)
Maybe the new "Parameter Set to ExampleSet" operator would be handy here?
<?xml version="1.0" encoding="UTF-8"?><process version="7.5.003">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.5.003" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="7.5.003" expanded="true" height="68" name="Retrieve Iris" width="90" x="45" y="34">
<parameter key="repository_entry" value="//Samples/data/Iris"/>
</operator>
<operator activated="true" class="filter_examples" compatibility="7.5.003" expanded="true" height="103" name="Filter Examples" width="90" x="179" y="34">
<parameter key="invert_filter" value="true"/>
<list key="filters_list">
<parameter key="filters_entry_key" value="label.equals.Iris-setosa"/>
</list>
</operator>
<operator activated="true" class="concurrency:loop_attributes" compatibility="7.5.003" expanded="true" height="103" name="Loop Attributes" width="90" x="313" y="34">
<process expanded="true">
<operator activated="true" class="select_attributes" compatibility="7.5.003" expanded="true" height="82" name="Select Attributes" width="90" x="112" y="34">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="%{loop_attribute}"/>
</operator>
<operator activated="true" class="optimize_parameters_grid" compatibility="7.5.003" expanded="true" height="103" name="Optimize Parameters (Grid)" width="90" x="380" y="34">
<list key="parameters">
<parameter key="Set Macro.value" value="[0.0;8;8;linear]"/>
</list>
<process expanded="true">
<operator activated="true" class="set_macro" compatibility="7.5.003" expanded="true" height="82" name="Set Macro" width="90" x="179" y="85">
<parameter key="macro" value="cutoff"/>
<parameter key="value" value="8.0"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="7.5.003" expanded="true" height="82" name="Generate Attributes" width="90" x="380" y="85">
<list key="function_descriptions">
<parameter key="prediction" value="if(eval(%{loop_attribute}) &gt; eval(%{cutoff}),&quot;Iris-virginica&quot;,&quot;Iris-versicolor&quot;)"/>
</list>
</operator>
<operator activated="true" class="set_role" compatibility="7.5.003" expanded="true" height="82" name="Set Role" width="90" x="581" y="85">
<parameter key="attribute_name" value="prediction"/>
<parameter key="target_role" value="prediction"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="performance" compatibility="7.5.003" expanded="true" height="82" name="Performance" width="90" x="782" y="85"/>
<connect from_port="input 1" to_op="Set Macro" to_port="through 1"/>
<connect from_op="Set Macro" from_port="through 1" to_op="Generate Attributes" to_port="example set input"/>
<connect from_op="Generate Attributes" from_port="example set output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="performance"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
</process>
</operator>
<operator activated="true" class="converters:parameter_set_2_example_set" compatibility="0.3.001" expanded="true" height="103" name="Parameter Set to ExampleSet" width="90" x="514" y="187">
<parameter key="Try to estimate type of parameters" value="true"/>
</operator>
<operator activated="true" class="log" compatibility="7.5.003" expanded="true" height="82" name="Log" width="90" x="715" y="34">
<list key="log">
<parameter key="variable" value="operator.Loop Attributes.value.attribute_name"/>
<parameter key="performance" value="operator.Optimize Parameters (Grid).value.performance"/>
<parameter key="cutoff" value="operator.Optimize Parameters (Grid).parameter.parameters"/>
</list>
</operator>
<connect from_port="input 1" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
<connect from_op="Optimize Parameters (Grid)" from_port="performance" to_op="Log" to_port="through 1"/>
<connect from_op="Optimize Parameters (Grid)" from_port="parameter" to_op="Parameter Set to ExampleSet" to_port="parameters"/>
<connect from_op="Parameter Set to ExampleSet" from_port="exampleSet" to_port="output 2"/>
<connect from_op="Log" from_port="through 1" to_port="output 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
<portSpacing port="sink_output 2" spacing="126"/>
<portSpacing port="sink_output 3" spacing="21"/>
</process>
</operator>
<operator activated="true" class="append" compatibility="7.5.003" expanded="true" height="82" name="Append" width="90" x="514" y="136"/>
<connect from_op="Retrieve Iris" from_port="output" to_op="Filter Examples" to_port="example set input"/>
<connect from_op="Filter Examples" from_port="example set output" to_op="Loop Attributes" to_port="input 1"/>
<connect from_op="Loop Attributes" from_port="output 1" to_port="result 1"/>
<connect from_op="Loop Attributes" from_port="output 2" to_op="Append" to_port="example set 1"/>
<connect from_op="Append" from_port="merged set" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
Scott
Answers
If you are using the Optimize Parameters operator, its output port labeled "parameter" delivers the optimized parameter values, which are shown in a text output panel.
Thank you Brian. That's what I thought, but it does not work. When I log the parameter port, I don't get the optimal value of the macro I'm optimizing. I get the last column of the table below. I would like the values of 6, 3, 5 and 2 (the optimal values).
Your log for the macro is not set correctly.
Something like this perhaps?
Thanks Thomas. I tried that too. The problem is that it returns the last value of the macro and not the optimal. The grid goes from 0 to 8 in steps of 1. It always returns 8.
Thanks a million Scott. Your code does the trick.
I adapted your code and now I have the table I wanted. It's kind of convoluted (Parameter Set to ExampleSet + Extract Macro), but it does the trick (good enough for me).
The process I was asking about in this question was part of a larger process. In the first part I would try different values for a parameter (using Optimize Parameters (Grid)), and after finding the optimum I would run a second part. For a while I thought the macro "cutoff" would hold the optimal value after the optimization, but it holds the last value tried. Should the optimal value be the default behavior instead? It took me a while to discover that it is not.
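For anyone who hits the same surprise, here is a minimal Python sketch of the idea (not RapidMiner code, and not how the operator is implemented internally). It assumes a plain grid search over the cutoffs 0 to 8 and uses a small made-up sample rather than the Iris data; the names `grid_search`, `accuracy` and `samples` are hypothetical. It shows why a value set inside the loop ends up as the last grid value, while the optimum has to be tracked separately.

```python
from typing import Callable, Iterable, Optional, Tuple


def grid_search(values: Iterable[float],
                score: Callable[[float], float]
                ) -> Tuple[Optional[float], float, Optional[float]]:
    """Return (best_value, best_score, last_value) for a simple grid search."""
    best_value = None
    best_score = float("-inf")
    last_value = None
    for value in values:
        last_value = value            # what a macro set inside the loop ends up holding
        current = score(value)
        if current > best_score:      # strict '>' keeps the first of any tied values
            best_value, best_score = value, current
    return best_value, best_score, last_value


# Hypothetical toy data: (measurement, label) pairs, NOT taken from the Iris set.
samples = [(4.5, "Iris-versicolor"), (4.9, "Iris-versicolor"),
           (5.1, "Iris-virginica"), (5.8, "Iris-virginica"),
           (6.3, "Iris-virginica")]


def accuracy(cutoff: float) -> float:
    """Accuracy of the rule 'value > cutoff -> virginica, else versicolor'."""
    hits = sum(
        ("Iris-virginica" if x > cutoff else "Iris-versicolor") == label
        for x, label in samples
    )
    return hits / len(samples)


grid = [float(i) for i in range(9)]   # 0, 1, ..., 8, as in the grid from the thread
best_cutoff, best_accuracy, last_cutoff = grid_search(grid, accuracy)
print(best_cutoff, best_accuracy, last_cutoff)
```

With this toy sample the best cutoff is 5.0, while the last value tried is 8.0. That is the same pattern I saw when logging the macro: the value left behind is the last one from the grid, so the optimum has to be read from the optimizer's own output.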
Thanks to all of you (Brian, Thomas, Scott) for your time and effort.
My final code below: