"Get and set values using Groovy Script"
Dear All,
I want to learn how to use Groovy Script.
How do I get the example values for a particular attribute,
and how do I set the example values for a particular attribute?
The example by Ingo, pasted below, only shows how to iterate over all attributes and all examples.
This is nice, but what if you only want to iterate over all examples of attribute 1,
calculate the running sum, and place the result in attribute 2?
How to do this?
For example, you start with:
att1 att2
0.1 NaN
0.2 NaN
0.3 NaN
0.4 NaN
And the result will be:
att1 att2
0.1 0.1
0.2 0.3
0.3 0.6
0.4 1.0
Best regards,
Wesesl
ExampleSet exampleSet = operator.getInput(ExampleSet.class);
exampleSet.recalculateAllAttributeStatistics();
for (Attribute attribute : exampleSet.getAttributes()) {
double mean = exampleSet.getStatistics(attribute, Statistics.AVERAGE);
String name = attribute.getName();
for (Example example : exampleSet) {
example[name] = example[name] - mean;
}
}
return exampleSet;
Answers
I guess you don't have the tutorial for extension development at hand. It starts with an example of using the script operator. Since the other existing documentation is very sparse, I built a little process for your example (just containing other values). I added a third attribute to demonstrate the creation of new attributes. The whole process is appended to the end of the post; here is just the content of the script operator:
I hope this makes things a bit clearer.
Best regards
Matthias
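The script from this answer is not preserved in the thread, so the following is not Matthias's code. It is only a minimal sketch of the running-sum task from the question, assuming the Map-style `example[name]` access that the other scripts in this thread use. The list of maps is a stand-in for the input example set so that the sketch runs in plain Groovy, and it assumes att2 already exists, as in the question's table.

```groovy
// Stand-in for the input ExampleSet so the sketch runs in plain Groovy.
// Inside the script operator you would start with: ExampleSet exampleSet = input[0];
List<Map<String, Double>> exampleSet = [
    ["att1": 0.1d, "att2": Double.NaN],
    ["att1": 0.2d, "att2": Double.NaN],
    ["att1": 0.3d, "att2": Double.NaN],
    ["att1": 0.4d, "att2": Double.NaN]
]

String source = "att1"   // attribute to read from
String target = "att2"   // attribute to write to (assumed to exist already)

double runningSum = 0.0d
for (example in exampleSet) {
    runningSum += example[source]   // get the att1 value of this example
    example[target] = runningSum    // set the att2 value of this example
}

// Inside the script operator the last line would be: return exampleSet;
println exampleSet.collect { it[target] }
```

Only one attribute is read and one is written, so there is no need for the outer loop over all attributes from Ingo's example. The values are doubles, so compare results with a small tolerance rather than with exact equality.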
Thank you for the help above, very useful.
I have a simple script (shown below) that adds 300 to each numerical example. However, when I run it, the script produces the correct output but also changes the results of all the previous blocks in my process. Is it possible to make the script operator work in one direction only, i.e. not affect previous results in the process?
Many Thanks,
David
ExampleSet exampleSet2 = input[0];
Attributes attributes = exampleSet2.getAttributes();
Attribute att2 = attributes.get("Midterm Exam");
String name = att2.getName();
for (Example example : exampleSet2) {
example[name] = example[name] + 300;
}
return exampleSet2;
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.011">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.1.011" expanded="true" name="Process">
<process expanded="true" height="417" width="480">
<operator activated="true" class="read_excel" compatibility="5.1.011" expanded="true" height="60" name="Read Excel" width="90" x="45" y="30">
<parameter key="excel_file" value="\\SBSSRV\Users\david.gibbons\My Documents\MarkA.xls"/>
<parameter key="imported_cell_range" value="A1:B13"/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<list key="data_set_meta_data_information">
<parameter key="0" value="Midterm Exam.true.integer.attribute"/>
<parameter key="1" value="Final Exam.true.integer.attribute"/>
</list>
</operator>
<operator activated="true" class="multiply" compatibility="5.1.011" expanded="true" height="94" name="Multiply" width="90" x="180" y="30"/>
<operator activated="true" class="execute_script" compatibility="5.1.011" expanded="true" height="76" name="Script" width="90" x="380" y="210">
<parameter key="script" value=" ExampleSet exampleSet2 = input[0]; Attributes attributes = exampleSet2.getAttributes(); Attribute att2 = attributes.get(&quot;Midterm Exam&quot;); String name = att2.getName(); for (Example example : exampleSet2) { example[name] = example[name] + 300; } return exampleSet2; "/>
</operator>
<connect from_op="Read Excel" from_port="output" to_op="Multiply" to_port="input"/>
<connect from_op="Multiply" from_port="output 1" to_port="result 1"/>
<connect from_op="Multiply" from_port="output 2" to_op="Script" to_port="input 1"/>
<connect from_op="Script" from_port="output 1" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
I think that this RapidMiner description which I found:
"The process logic which RapidMiner uses is not "linear", but recursive. We dont apply operators linearly, one after another."
explains my query a bit. Could anybody expand on this description please?
Thanks a lot,
David
You can, however, create a deep copy of an example set prior to your script with the Materialize operator. That way the changes won't get propagated backwards.
Best, Marius
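As a rough illustration of why a copy helps, here is a plain-Groovy analogy (not RapidMiner code, and the values are made up). Two variables that point at the same rows see each other's in-place changes, while a copy taken beforehand does not. The list of maps is only a stand-in for an example set; inside a process, the Materialize operator mentioned above would play the role of the copy step.

```groovy
String name = "Midterm Exam"

// Stand-in rows for the data that reaches the script operator.
List<Map<String, Double>> original = [
    ["Midterm Exam": 50.0d],
    ["Midterm Exam": 70.0d]
]

// The script from the question: changes the rows in place.
def addBonus = { rows ->
    for (example in rows) {
        example[name] = example[name] + 300
    }
}

// Case 1: run the script on a deep copy (every row is duplicated first).
def copy = original.collect { new LinkedHashMap(it) }
addBonus(copy)
def afterCopy = original.collect { it[name] }     // original rows are untouched

// Case 2: run the script on a second reference to the same rows.
def sameRows = original
addBonus(sameRows)
def afterShared = original.collect { it[name] }   // original rows have changed

println "after script on copy:        ${afterCopy}"
println "after script on shared rows: ${afterShared}"
```

Case 2 corresponds to what David observed: the script writes into data that earlier results still refer to.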
That explanation makes sense. I will be more careful with the operators. By any chance, is there a way to turn off the default setting (meta-data only passed) for a process?
Could you please let me know whether it is possible to select the value of an attribute for one specific example?
For example, the value of the first example for the attribute "Midterm Exam".
Attributes attributes = exampleSet.getAttributes();
Attribute exam = attributes.get("Mideterm Exam");
float ff = exam.getElementAt[0];
Many Thanks.
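One possible way to do this, sketched under the same assumption as the other scripts in this thread (Map-style `example[name]` access): the value belongs to an example, not to the attribute, so first pick the example and then look up the attribute by name. The list of maps and its values are made up so that the sketch runs in plain Groovy. As far as I know, `ExampleSet` also offers `getExample(int)` for access by index, but I have not checked that against version 5.1.

```groovy
// Stand-in for the input ExampleSet; inside the script operator use input[0].
List<Map<String, Double>> exampleSet = [
    ["Midterm Exam": 72.0d, "Final Exam": 84.0d],
    ["Midterm Exam": 65.0d, "Final Exam": 71.0d]
]
String name = "Midterm Exam"

// First example: take it from the iterator, then read the attribute by name.
def firstExample = exampleSet.iterator().next()
double firstValue = firstExample[name]

// Any other example: walk the set and stop at the wanted position (0-based).
int wanted = 1
double nthValue = Double.NaN
int index = 0
for (example in exampleSet) {
    if (index == wanted) {
        nthValue = example[name]
        break
    }
    index++
}

println "first = ${firstValue}, second = ${nthValue}"
```

Note that the attribute name has to match exactly; a misspelled name such as "Mideterm Exam" will not find the attribute.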