[SOLVED] Normalize rows and columns by the same ratio
DerGaertner
Member Posts: 3 Contributor I
Hello,
I want to normalize my table (range transformation), but the zero values end up different. Before the transformation the first row contains only zeros; after the transformation I noticed that it contains values like:
0.030 ; 0.028 ; 0.031
Of course there are some negative values in the source table, so zero is no longer zero. I want the value transformation to be bijective and the same for every row and every column.
I could write a Groovy script to fix this, but I hope there is another way to do it.
Thanks for your help!
Edit:
Maybe this example demonstrates my problem.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.013">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.3.013" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="generate_data" compatibility="5.3.013" expanded="true" height="60" name="Generate Data" width="90" x="112" y="75"/>
<operator activated="true" class="execute_script" compatibility="5.3.013" expanded="true" height="76" name="Execute Script" width="90" x="246" y="75">
<parameter key="script" value="ExampleSet exampleSet = input[0]; Attributes attributes = exampleSet.getAttributes(); int count = 0; Attribute att1 = attributes.get(&quot;att1&quot;); Attribute att2 = attributes.get(&quot;att2&quot;); Attribute att3 = attributes.get(&quot;att3&quot;); Attribute att4 = attributes.get(&quot;att4&quot;); Attribute att5 = attributes.get(&quot;att5&quot;); exampleSet.getExample(0).setValue(att1, count); exampleSet.getExample(0).setValue(att2, count); exampleSet.getExample(0).setValue(att3, count); exampleSet.getExample(0).setValue(att4, count); exampleSet.getExample(0).setValue(att5, count); return exampleSet;"/>
</operator>
<operator activated="true" breakpoints="before" class="normalize" compatibility="5.3.013" expanded="true" height="94" name="Normalize" width="90" x="380" y="75">
<parameter key="method" value="range transformation"/>
</operator>
<connect from_op="Generate Data" from_port="output" to_op="Execute Script" to_port="input 1"/>
<connect from_op="Execute Script" from_port="output 1" to_op="Normalize" to_port="example set input"/>
<connect from_op="Normalize" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
Answers
What exactly do you mean by "different zeros"? Applying the range transformation will of course shift your zero to somewhere near 0.5, since all original values are distributed in a range from approximately -10 to 10. The normalization parameters are fitted individually for each attribute, based on that attribute's own distribution. If you want to scale all of them using the same function, you may have a look at operators like "Generate Attributes", where you can define a custom function and apply it to different attributes.
Cheers,
Helge
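To illustrate the point about per-attribute parameters, here is a minimal Python sketch (not RapidMiner code). It assumes a small hypothetical table whose first row is all zeros and scales each column with its own minimum and maximum. The function name `range_transform` and the sample values are made up for illustration.

```python
def range_transform(values, lo=0.0, hi=1.0):
    """Min-max scale one column to [lo, hi] using that column's own min and max."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # Constant column: every value maps to the lower bound.
        return [lo for _ in values]
    return [lo + (v - vmin) / (vmax - vmin) * (hi - lo) for v in values]


# Hypothetical data: the first row is all zeros, the ranges differ per column.
table = {
    "att1": [0.0, -10.0, 10.0],
    "att2": [0.0, -8.0, 12.0],
    "att3": [0.0, -5.0, 5.0],
}

# Each column is normalized independently, as a per-attribute range transformation does.
normalized = {name: range_transform(col) for name, col in table.items()}

# Where the original zeros ended up in each column.
zeros_after = {name: col[0] for name, col in normalized.items()}
print(zeros_after)
```

Because `att2` spans -8 to 12 while `att1` spans -10 to 10, the original zero lands on a different value in each of those columns, which matches the effect described in the question.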
This is my Groovy code, which does exactly what I need. Here "max" and "min" are the global maximum and minimum. "Generate Attributes" could do the same, but I would have to do this for every attribute, and my table has 2500 of them.
Thanks
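The Groovy code from this post is not preserved in the thread. As a rough illustration of the idea it describes (one global minimum and maximum shared by all attributes), a Python sketch might look like the following. This is not the poster's script; the function name `global_range_transform` and the sample table are assumptions.

```python
def global_range_transform(table, lo=0.0, hi=1.0):
    """Scale every column with the same global min and max taken over the whole table."""
    all_values = [v for col in table.values() for v in col]
    gmin, gmax = min(all_values), max(all_values)
    if gmax == gmin:
        # Constant table: every value maps to the lower bound.
        return {name: [lo for _ in col] for name, col in table.items()}
    span = gmax - gmin
    return {
        name: [lo + (v - gmin) / span * (hi - lo) for v in col]
        for name, col in table.items()
    }


# Hypothetical data: the first row is all zeros, the ranges differ per column.
table = {
    "att1": [0.0, -10.0, 10.0],
    "att2": [0.0, -8.0, 12.0],
    "att3": [0.0, -5.0, 5.0],
}

scaled = global_range_transform(table)

# All original zeros now map to one and the same value.
first_row = [col[0] for col in scaled.values()]
print(first_row)
```

Since every cell goes through the same affine function, equal inputs give equal outputs regardless of column, and the mapping is invertible as long as the global maximum and minimum differ. Zero does not stay at zero, but it lands on the same value in every column.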
You can even use the standard normalization. Creative misuse of the pivoting functionality can do the trick.
Cheers,
Helge
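The process attached to this answer is not preserved in the thread. One plausible reading of the pivoting trick is: de-pivot the table so that all attribute values sit in a single column, normalize that one column with the standard range transformation, then pivot back to the wide layout. The Python sketch below only illustrates that assumed sequence; the helper names `depivot`, `normalize_long`, and `pivot` are hypothetical and do not correspond to RapidMiner operator APIs.

```python
def depivot(table):
    """Wide -> long: one (row_index, attribute_name, value) record per cell."""
    return [(i, name, v) for name, col in table.items() for i, v in enumerate(col)]


def normalize_long(records):
    """Range-transform the single value column of the long table to [0, 1]."""
    values = [v for _, _, v in records]
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    if span == 0:
        return [(i, name, 0.0) for i, name, _ in records]
    return [(i, name, (v - vmin) / span) for i, name, v in records]


def pivot(records):
    """Long -> wide: rebuild one list per attribute, ordered by row index."""
    n_rows = max(i for i, _, _ in records) + 1
    table = {}
    for i, name, v in records:
        table.setdefault(name, [None] * n_rows)[i] = v
    return table


# Hypothetical data: the first row is all zeros, the ranges differ per column.
wide = {"att1": [0.0, -10.0, 10.0], "att2": [0.0, -8.0, 12.0]}

result = pivot(normalize_long(depivot(wide)))
print(result)
```

Because the long table has only one value column, a per-attribute normalization of it is effectively a global normalization of the original table, so no custom script or per-attribute expression is needed.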