The RapidMiner community is on read-only mode until further notice. Technical support via cases will continue to work as is. For any urgent licensing related requests from Students/Faculty members, please use the Altair academic forum here.

Using dynamic macro variables based on column content

denis_wolfrathdenis_wolfrath Member Posts: 4 Contributor I
edited November 2018 in Help

Dear all,

I'd like to know if there is a possibility to use macros by refering to them via column content. The selection, which predefined macro shall be used, is depending on the content of a specific column.

 

A simplified example

Input table

country base_value result_value
GER 10 ?
USA 5 ?

 

Available macro variables

m_multiplier_GER: 2

m_multiplier_USA: 3

 

I want to calculate result_value by multiplying the base_value with the value of the macro matching the country in its very own name.

 

Expected result

country base_value result_value
GER 10 20
USA 5 15

 

In an "Generate Attributes" operator, I'd need something like [result_valuebase_value * %{m_multiplier_country@}] where "@" represents some sort of escape character to be able to include the column content in the macro name.

 

My current solution is to put the macro information in a table and merge it via the country column. However this produces a lot of duplicated information in my table, which I have to carry through my process and remove afterwards. The process contains of multiple steps that each need different macro information each.<\p>

 

I hope to find a more elegant solution that allows me to use the macros directly.

 

Thanks for the help

 Denis Wolfrath

Best Answer

  • lionelderkrikorlionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Solution Accepted

    Hi @denis_wolfrath,

     

    To improve my knowledge on macros and as a personnal challenge, here the process

    using macros to perform your task (thanks to the ressource of @mschmitz  How to Use Macros). It's far-fetched but it works :

    <?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="read_excel" compatibility="8.0.001" expanded="true" height="68" name="Read Excel" width="90" x="45" y="34">
    <parameter key="excel_file" value="C:\Users\Lionel\Documents\Formations_DataScience\Rapidminer\Tests_Rapidminer\Dynamic_Macros.xlsx"/>
    <parameter key="imported_cell_range" value="A1:C3"/>
    <parameter key="first_row_as_names" value="false"/>
    <list key="annotations">
    <parameter key="0" value="Name"/>
    </list>
    <list key="data_set_meta_data_information">
    <parameter key="0" value="Country.true.polynominal.attribute"/>
    <parameter key="1" value="base_value.true.integer.attribute"/>
    <parameter key="2" value="result value.true.attribute_value.attribute"/>
    </list>
    </operator>
    <operator activated="true" class="set_macros" compatibility="8.0.001" expanded="true" height="82" name="Set Macros (2)" width="90" x="179" y="34">
    <list key="macros">
    <parameter key="country_macro" value="Country"/>
    <parameter key="m_multiplier_GER" value="2"/>
    <parameter key="m_multiplier_USA" value="3"/>
    </list>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="8.0.001" expanded="true" height="82" name="Select Attributes (2)" width="90" x="313" y="34">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attribute" value="base_value"/>
    <parameter key="attributes" value="Country|base_value"/>
    </operator>
    <operator activated="true" class="generate_attributes" compatibility="8.0.001" expanded="true" height="82" name="Generate Attributes (6)" width="90" x="447" y="34">
    <list key="function_descriptions">
    <parameter key="result_value" value="eval(concat(str(base_value),&quot;*&quot;,eval((concat(&quot;%{&quot;,concat(&quot;m_multiplier_&quot;,eval(%{country_macro})),&quot;}&quot;)),NOMINAL)),REAL)"/>
    </list>
    </operator>
    <connect from_op="Read Excel" from_port="output" to_op="Set Macros (2)" to_port="through 1"/>
    <connect from_op="Set Macros (2)" from_port="through 1" to_op="Select Attributes (2)" to_port="example set input"/>
    <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Generate Attributes (6)" to_port="example set input"/>
    <connect from_op="Generate Attributes (6)" from_port="example set output" to_port="result 3"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    <portSpacing port="sink_result 4" spacing="0"/>
    </process>
    </operator>
    </process>

    Here your fictive example set : 

    https://drive.google.com/open?id=1KdToSAP1j7Q5lwLTlLvhBRLgitQ2BUh_

     

     Best regards, 

     

    Lionel

     

Answers

  • MartinLiebigMartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,525 RM Data Scientist

    Hi Denis,

     

    my way of solving it would be a table with the multiplies. Like

     

    Country   multiplier

    USA          3

    Germany 5

    ...

     

    you can join this table on your data and then simply do the calculations without any macros needed.

     

    Best,

    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • denis_wolfrathdenis_wolfrath Member Posts: 4 Contributor I

    @lionelderkrikor

    many thanks, your way solves my problem perfectly.

     

    Using concat to build the macro statements during execution opens even more posibilities.

     

    Kind regards

     Denis Wolfrath

  • MartinLiebigMartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,525 RM Data Scientist

    Hi @denis_wolfrath,

     

    this indeed gives you even more flexibility if you want to. It even allows you to use different functions for different values etc. Note that this might slow down the execution, since some of our internal optimizers cannot work on it. Since Generate Attributes is usually running only seconds it should not make a huge difference though.

     

    Best,

    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
Sign In or Register to comment.