How to log-transform an entire example set?

wanglu2014 Member Posts: 19 Contributor II
edited November 2018 in Help

There is a similar question here: http://community.rapidminer.com/t5/Developer-Forum/How-to-use-Log-transform-and-missing-values/m-p/40602. My new question, however, is how to log-transform an entire example set, that is, almost all of the attributes. Could you kindly offer a suggestion?

Answers

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Hello @wanglu2014 - do you mean taking log() of a single attribute? If so, you can use a function expression such as log(att1) in the Generate Attributes operator.

     

    Scott

  • wanglu2014 Member Posts: 19 Contributor II
    Thanks for your timely reply. However, do I have to write log(column1), log(column2), and so on, one by one?
  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

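    Ah, OK, you want to take log() of each attribute. In that case, use Loop Attributes and apply log() to each attribute in turn. The process below stores the example set with Remember, and in every iteration it recalls the set, overwrites the current attribute with its log via Generate Attributes, and stores the set again. Like this: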

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.6.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="generate_data" compatibility="7.6.001" expanded="true" height="68" name="Generate Data" width="90" x="45" y="85"/>
    <operator activated="true" breakpoints="after" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes" width="90" x="179" y="85">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="label"/>
    <parameter key="invert_selection" value="true"/>
    <parameter key="include_special_attributes" value="true"/>
    </operator>
    <operator activated="true" class="remember" compatibility="7.6.001" expanded="true" height="68" name="Remember" width="90" x="313" y="85">
    <parameter key="name" value="foo"/>
    </operator>
    <operator activated="true" class="concurrency:loop_attributes" compatibility="7.6.001" expanded="true" height="82" name="Loop Attributes" width="90" x="447" y="85">
    <parameter key="enable_parallel_execution" value="false"/>
    <process expanded="true">
    <operator activated="true" class="recall" compatibility="7.6.001" expanded="true" height="68" name="Recall" width="90" x="112" y="136">
    <parameter key="name" value="foo"/>
    </operator>
    <operator activated="true" class="generate_attributes" compatibility="7.6.001" expanded="true" height="82" name="Generate Attributes" width="90" x="246" y="136">
    <list key="function_descriptions">
    <parameter key="%{loop_attribute}" value="log(eval(%{loop_attribute}))"/>
    </list>
    </operator>
    <operator activated="true" class="remember" compatibility="7.6.001" expanded="true" height="68" name="Remember (2)" width="90" x="447" y="136">
    <parameter key="name" value="foo"/>
    </operator>
    <connect from_op="Recall" from_port="result" to_op="Generate Attributes" to_port="example set input"/>
    <connect from_op="Generate Attributes" from_port="example set output" to_op="Remember (2)" to_port="store"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="recall" compatibility="7.6.001" expanded="true" height="68" name="Recall (2)" width="90" x="581" y="85">
    <parameter key="name" value="foo"/>
    </operator>
    <connect from_op="Generate Data" from_port="output" to_op="Select Attributes" to_port="example set input"/>
    <connect from_op="Select Attributes" from_port="example set output" to_op="Remember" to_port="store"/>
    <connect from_op="Remember" from_port="stored" to_op="Loop Attributes" to_port="input 1"/>
    <connect from_op="Recall (2)" from_port="result" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
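
For readers who want to see the logic of the loop outside RapidMiner, here is a minimal Python sketch of the same idea: visit every attribute and overwrite it with its logarithm. It is an illustration only, not part of the process above. The function name log_transform, the skip parameter, and the sample rows are made up for this example. It uses math.log10 on the assumption that log() in RapidMiner expressions is base 10 (swap in math.log for the natural logarithm), and it marks non-positive values as missing (None), which is a choice of this sketch rather than documented RapidMiner behaviour. Note also that the process above removes the label with Select Attributes, whereas the sketch simply leaves it untouched.

```python
import math


def log_transform(rows, skip=("label",)):
    """Return a copy of rows with log10 applied to every numeric column not in skip.

    rows is a list of dicts (one dict per example). Hypothetical helper,
    written only to illustrate the loop-over-attributes idea.
    """
    out = []
    for row in rows:
        new_row = {}
        for name, value in row.items():
            if name in skip or not isinstance(value, (int, float)):
                # Leave skipped and non-numeric columns unchanged.
                new_row[name] = value
            elif value > 0:
                new_row[name] = math.log10(value)
            else:
                # log is undefined for zero and negative values;
                # this sketch treats the result as missing.
                new_row[name] = None
        out.append(new_row)
    return out


rows = [
    {"att1": 10.0, "att2": 1000.0, "label": 1.5},
    {"att1": 1.0, "att2": -3.0, "label": 2.5},
]
result = log_transform(rows)
```

The sketch returns new rows instead of modifying the input, which mirrors how the process above always works on the stored copy of the example set.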

    Scott

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist

    Hi,

     

    Maybe the Generate Function Set operator is a solution.

     

    Best,

    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany