Passing column names as variables to other subprocess

Karan_Kansal Member Posts: 9 Contributor II
edited November 2018 in Help
Hi,

I am trying to use a process within another process by using the Execute Process block. For example, process A should be run within the parent process B, but the input data in process B is variable and can have different structures (column names, rows, data types, etc.). Can I pass the column names as variables to process A so that I need not change the column names every time I want to use process A within another process? Basically, I am looking for something similar to column mapping in Informatica when using mapplets.


Any ideas on how this can be done (if it is possible at all)?

Thanks

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,510 RM Data Scientist
    Yes! Of course you can. This is one of the main use cases of Execute Process.

    You can simply hand over a macro using the macros option of Execute Process. In the executed process, you can then use this macro wherever the column name is needed.
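
    A minimal sketch of what this could look like in process XML. The macro name colname, the column name Test_ID, and the repository path are placeholders chosen for illustration, and the Select Attributes operator is just one example of an operator that takes a column name as a parameter. Macros are referenced with the %{name} syntax.

    ```xml
    <!-- Parent process B: hand the column name over as a macro -->
    <operator activated="true" class="execute_process" name="Execute Process">
      <!-- placeholder repository path of process A -->
      <parameter key="process_location" value="//path/to/Process A"/>
      <list key="macros">
        <!-- key = macro name (placeholder), value = column name to pass -->
        <parameter key="colname" value="Test_ID"/>
      </list>
    </operator>

    <!-- Process A: use the macro wherever a column name is expected -->
    <operator activated="true" class="select_attributes" name="Select Attributes">
      <parameter key="attribute_filter_type" value="single"/>
      <parameter key="attribute" value="%{colname}"/>
    </operator>
    ```

    With this setup, only the value in the parent's macros list has to change when process A is reused on data with different column names.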

    ~Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • Karan_Kansal Member Posts: 9 Contributor II
    Thanks Martin

    I found the macros option in the Execute Process block, but I am not able to use it properly. It gives an error message saying:

    The macro is not defined in the context of the included process.

    Can you help me solve this?

    Below is the XML code:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.4.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="6.4.000" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="6.4.000" expanded="true" height="60" name="Retrieve HCP Data" width="90" x="45" y="30">
            <parameter key="repository_entry" value="//CCM Suggestion Engine/HCP Data"/>
          </operator>
          <operator activated="true" class="retrieve" compatibility="6.4.000" expanded="true" height="60" name="Retrieve Test data" width="90" x="45" y="165">
            <parameter key="repository_entry" value="//CCM Suggestion Engine/Test data"/>
          </operator>
          <operator activated="true" class="join" compatibility="6.4.000" expanded="true" height="76" name="Join" width="90" x="179" y="75">
            <parameter key="use_id_attribute_as_key" value="false"/>
            <list key="key_attributes">
              <parameter key="Metric ID" value="Metric ID"/>
            </list>
          </operator>
          <operator activated="true" class="rename_by_replacing" compatibility="6.4.000" expanded="true" height="76" name="Rename by Replacing" width="90" x="313" y="165">
            <parameter key="replace_what" value=" "/>
            <parameter key="replace_by" value="_"/>
          </operator>
          <operator activated="true" class="execute_process" compatibility="6.4.000" expanded="true" height="60" name="Execute Process" width="90" x="581" y="165">
            <parameter key="process_location" value="//Reusable Components/Processes/Filter"/>
            <list key="macros">
              <parameter key="value" value="Test_ID"/>
            </list>
          </operator>
          <connect from_op="Retrieve HCP Data" from_port="output" to_op="Join" to_port="left"/>
          <connect from_op="Retrieve Test data" from_port="output" to_op="Join" to_port="right"/>
          <connect from_op="Join" from_port="join" to_op="Rename by Replacing" to_port="example set input"/>
          <connect from_op="Rename by Replacing" from_port="example set output" to_op="Execute Process" to_port="input 1"/>
          <connect from_op="Execute Process" from_port="result 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,510 RM Data Scientist
    Hi,

    Are you familiar with the concept of macros? You need to define the macro in the process you execute. To do this, you need to activate the Context view.

    ~Martin
  • Karan_Kansal Member Posts: 9 Contributor II
    Hi Martin,

    I am new to RapidMiner and still in the process of understanding how it works. In my opinion macros are similar to variables created on the fly that can be used throughout the process. I don't know if my understanding is correct or not. I have already read through the operator documents but couldn't understand much. Can you provide me a link to some reference material for the same?

    Thanks
  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,995 RM Engineering
    Hi,

    Let me quickly explain what you are experiencing:

    You checked the "fail for unknown macros" parameter of the "Execute Process" operator. That means it will ensure that the macro names you've given exist in the process context of your selected process. The process context is sort of a global store for various things bound to a given process. This includes macros. So if you have not defined macros for the process in its context, you will get the error you are seeing. Of course, you don't need to define macros in said context - they can be dynamic as well, i.e. only used in operator parameters. But in that case verification does not work, so you need to uncheck the parameter.

    If you do either of these things, it will work as expected.
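
    The two options could be sketched in process XML roughly as follows. This is an illustration, not taken from the original processes: the macro name value and its content Test_ID mirror the macros list in the XML posted above, while the layout of the macro element in the context and the parameter key fail_for_unknown_macros are assumptions about how RapidMiner stores these settings.

    ```xml
    <!-- Option 1: in the executed process (the "Filter" process),
         declare the macro in the process context -->
    <context>
      <input/>
      <output/>
      <macros>
        <macro>
          <key>value</key>
          <!-- default value, replaced by whatever the parent passes in -->
          <value>Test_ID</value>
        </macro>
      </macros>
    </context>

    <!-- Option 2: in the parent process, switch off the verification
         on the Execute Process operator (assumed parameter key) -->
    <operator activated="true" class="execute_process" compatibility="6.4.000" expanded="true" name="Execute Process">
      <parameter key="process_location" value="//Reusable Components/Processes/Filter"/>
      <list key="macros">
        <parameter key="value" value="Test_ID"/>
      </list>
      <parameter key="fail_for_unknown_macros" value="false"/>
    </operator>
    ```

    Declaring the macro in the context keeps the verification active, so a misspelled macro name in the parent is still reported as an error.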

    Regards,
    Marco
  • Karan_Kansal Member Posts: 9 Contributor II

    Thanks Marco,

    Unchecking "fail for unknown macros" did the trick. Can this be taken further by dynamically passing columns to the child process? For example, in the child process I have to sum two columns and create a new attribute, but the column names are unknown. I have the input data in the parent process and need to specify which two columns are to be summed in the child process. Can I do this by somehow referencing the columns and passing them to the child process through the Execute Process block?
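
    One way this might be set up, sketched here with the same macro mechanism and not confirmed anywhere in this thread: pass one macro per column and reference both in a Generate Attributes expression. The macro names col_a and col_b, the column names, and the new attribute name sum are hypothetical. The sketch also assumes the column names contain no spaces or special characters (as after the Rename by Replacing step above), since the macro values are inserted into the expression as plain text.

    ```xml
    <!-- Parent process: name the two columns to be summed -->
    <operator activated="true" class="execute_process" name="Execute Process">
      <parameter key="process_location" value="//Reusable Components/Processes/Filter"/>
      <list key="macros">
        <!-- hypothetical macro names and column names -->
        <parameter key="col_a" value="First_Column"/>
        <parameter key="col_b" value="Second_Column"/>
      </list>
    </operator>

    <!-- Child process: build the new attribute from the two macros -->
    <operator activated="true" class="generate_attributes" name="Generate Attributes">
      <list key="function_descriptions">
        <!-- after macro substitution this reads First_Column+Second_Column -->
        <parameter key="sum" value="%{col_a}+%{col_b}"/>
      </list>
    </operator>
    ```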

  • Karan_Kansal Member Posts: 9 Contributor II

    Thanks Martin and Marco for all the help.