I want to loop over pairs of attributes and create a new one from an operation on each pair

erikapastorelli Member Posts: 1 Learner I
edited December 2018 in Help

Hi everyone! I have a problem in RapidMiner 5. Attached is my dataset.

From the fourth column onward, I have to generate a new attribute from each pair of columns: x-1 and x-0, y-1 and y-0, and so on.

For example, A2A.MI-1 and A2A.MI-0 have to become a new attribute containing the quotient of the two.

How can I solve this? I tried looping over the attributes and using a script, but I failed.

Thank you very much


    lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn

    Hi @erikapastorelli,


    Here is an example process that uses the Execute Python operator to do what you want:

    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="read_excel" compatibility="8.1.001" expanded="true" height="68" name="Read Excel" width="90" x="45" y="34">
    <parameter key="excel_file" value="C:\Users\Lionel\Documents\Formations_DataScience\Rapidminer\Tests_Rapidminer\Iterate_Attributes\Iterate_Attributes.xlsx"/>
    <parameter key="imported_cell_range" value="A1:I4"/>
    <parameter key="first_row_as_names" value="false"/>
    <list key="annotations">
    <parameter key="0" value="Name"/>
    <list key="data_set_meta_data_information">
    <parameter key="0" value="label.true.integer.attribute"/>
    <parameter key="1" value="Date.true.date_time.attribute"/>
    <parameter key="2" value="test.true.polynominal.attribute"/>
    <parameter key="3" value="A2A\.MI-1.true.real.attribute"/>
    <parameter key="4" value="A2A\.MI-0.true.real.attribute"/>
    <parameter key="5" value="AGL\.MI-1.true.real.attribute"/>
    <parameter key="6" value="AGL\.MI-0.true.real.attribute"/>
    <parameter key="7" value="ATL\.MI-1.true.real.attribute"/>
    <parameter key="8" value="ATL\.MI-0.true.real.attribute"/>
    <operator activated="true" class="python_scripting:execute_python" compatibility="7.4.000" expanded="true" height="82" name="Execute Python" width="90" x="246" y="34">
    <parameter key="script" value="import pandas as pd&#10;&#10;# rm_main is a mandatory function, &#10;# the number of arguments has to be the number of input ports (can be none)&#10;def rm_main(data):&#10;&#10;    # fix the column count before any new column is appended&#10;    n_cols = len(data.columns)&#10;&#10;    # pairs start at the fourth column (index 3): numerator at column, denominator at column + 1&#10;    for column in range(3, n_cols - 1, 2):&#10;        data[&quot;Quotient_&quot; + str(column)] = data.iloc[:, column] / data.iloc[:, column + 1]&#10;&#10;    # connect 1 output port to see the results&#10;    return data"/>
    <connect from_op="Read Excel" from_port="output" to_op="Execute Python" to_port="input 1"/>
    <connect from_op="Execute Python" from_port="output 1" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="90"/>
    <portSpacing port="sink_result 2" spacing="0"/>

    Here is the Excel file with an extract of your dataset (which I used to create this process):



    I hope it helps,






    NB: There is one problem: after the Python script runs, the date attribute is set to "?", and I don't know why.

    NB2: I think this Python script could be improved to give more explicit names to the generated attributes.
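
    As a rough sketch of what more explicit names could look like, the script below names each new attribute after the prefix shared by the pair (for example A2A.MI_quotient). The helper add_pair_quotients, its parameters, and the "_quotient" suffix are my own choices for illustration, not part of RapidMiner. It assumes the pairs start at the fourth column, that each pair is ordered x-1 then x-0, and that the quotient wanted is x-1 divided by x-0, as in the process above.

```python
import pandas as pd


def add_pair_quotients(data, first_pair_col=3, suffix_num="-1"):
    """Add one quotient column per (x-1, x-0) pair, named after the shared prefix.

    Hypothetical helper: assumes the pairs start at `first_pair_col` and that
    each pair is stored as two adjacent columns, numerator first.
    """
    result = data.copy()
    n_cols = len(data.columns)  # fixed before any column is added
    for i in range(first_pair_col, n_cols - 1, 2):
        num_name = data.columns[i]
        den_name = data.columns[i + 1]
        # Strip the "-1" suffix to recover the shared prefix, e.g. "A2A.MI"
        if num_name.endswith(suffix_num):
            prefix = num_name[: -len(suffix_num)]
        else:
            prefix = num_name
        result[prefix + "_quotient"] = data[num_name] / data[den_name]
    return result


# rm_main is the entry point called by the Execute Python operator
def rm_main(data):
    return add_pair_quotients(data)
```

    Working on column names rather than positions in the new attribute names means the output stays readable if the number of pairs changes. I have not checked whether this affects the date attribute problem mentioned above.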

