‎09-06-2017 10:59 AM

In complex processes, or in projects with several processes, you often need to parametrize them using variables. Process variables in RapidMiner are called macros. Macros are a powerful asset that can be used to fully operationalize your analytics processes. Macros store what can be called primitive types. To store objects that run through a connection, you typically use the Remember and Recall operators instead.

 

How to Set Macros

In general, there are two ways to set macros. The first way is using the context panel, the other is using operators.

The Context Panel

You can activate the panel by going to View -> Show Panel and selecting the Context panel. A common place for this panel is next to the Parameter panel. In the Context panel you can add new macros by clicking the small "+" button.

Macros1.png

If you think of a single process as a function in a programming language, this panel lets you define the arguments of that function.
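
In the process XML, macros defined in the Context panel should end up in the <macros> element of the <context> block, which is the element that appears empty as <macros/> in a process without context macros. A minimal sketch, assuming a hypothetical macro named myMacro with the value foo:

```xml
<context>
  <input/>
  <output/>
  <macros>
    <!-- myMacro and foo are placeholder names used only for illustration -->
    <macro>
      <key>myMacro</key>
      <value>foo</value>
    </macro>
  </macros>
</context>
```

If in doubt, define a macro in the Context panel and compare with the XML view of your own process.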

 

Best Practice: We recommend starting macro names with a lowercase letter and then using camel case (for example, myMacro), which makes macros easier to identify.

 

Generating Macros using Operators

 

Besides setting macros in the Context panel, you can also set and modify macros using operators. If you search for "Macro" in the operator tree, you will find several operators that handle macros. We will discuss the three most important ones.

Macros2.png

Set Macro sets a macro to a constant value, very much like the Context panel does.
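
As a rough sketch, a Set Macro operator could look like this in the process XML. The macro name myMacro and the value foo are placeholders, and the parameter keys macro and value are assumptions that you should verify in the XML view of your RapidMiner version:

```xml
<!-- Sketch only: sets the hypothetical macro myMacro to the constant value foo -->
<operator activated="true" class="set_macro" compatibility="7.2.002-SNAPSHOT" expanded="true" name="Set Macro">
  <parameter key="macro" value="myMacro"/>
  <parameter key="value" value="foo"/>
</operator>
```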

 

Generate Macro lets you generate a macro using the interface you know from Generate Attributes. With this operator you can, for example, generate a macro based on the current date (using the date_now() function).
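
The date example could be sketched as follows. The macro name currentDate is hypothetical, and the list key function_descriptions is assumed to be the same one that Generate Attributes uses:

```xml
<!-- Sketch only: stores the result of date_now() in the hypothetical macro currentDate -->
<operator activated="true" class="generate_macro" compatibility="7.2.002-SNAPSHOT" expanded="true" name="Generate Macro">
  <list key="function_descriptions">
    <parameter key="currentDate" value="date_now()"/>
  </list>
</operator>
```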

 

Extract Macro extracts a macro from an example set. Commonly used options are extracting the number of examples, statistics such as an average or a maximum, or even single cell values of the example set.
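
A minimal sketch of extracting the number of examples into a macro. The macro name numberOfExamples is a placeholder, and the parameter keys macro and macro_type are assumptions to check against your own exported process:

```xml
<!-- Sketch only: writes the row count of the incoming example set to numberOfExamples -->
<operator activated="true" class="extract_macro" compatibility="7.2.002-SNAPSHOT" expanded="true" name="Extract Macro">
  <parameter key="macro" value="numberOfExamples"/>
  <parameter key="macro_type" value="number_of_examples"/>
</operator>
```

The operator needs an example set at its input port, so place it after the operator that delivers the data.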

How to Use Macros in General Operators

To use a macro anywhere in your process, type %{myMacro}; it will be replaced by the current value of the macro. This is a direct textual replacement and works in any value field in your process.

Macros3.png
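
For example, a macro can supply part of a repository path. The sketch below assumes a hypothetical macro named dataSet; the Retrieve operator and its repository_entry parameter are the same ones used in the example process later in this article:

```xml
<!-- If dataSet holds Labor-Negotiations, the entry resolves to //Samples/data/Labor-Negotiations -->
<operator activated="true" class="retrieve" compatibility="7.2.002-SNAPSHOT" expanded="true" name="Retrieve">
  <parameter key="repository_entry" value="//Samples/data/%{dataSet}"/>
</operator>
```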

 

 

How to Use Macros in Generate Attributes

In Generate Attributes and Generate Macro you have more options than just the %-notation, namely:

 

%-Notation

 

%{myMacro} inserts the current macro value as a string. If you have a string like foo stored in your macro you can do operations like

 

concat(%{myMacro},"bar")

prefix(%{myMacro},1)

 

and so on. Keep in mind that the value is always interpreted as a string. If you store a 1 in your macro,

 

concat(%{myMacro},"bar")

 

returns 1bar. Operations like

 

%{myMacro} + 1

 

do not work.

 

Eval

The eval() function evaluates the string stored in myMacro as an expression. If you have a 1 stored in your macro, you can do

 

eval(%{myMacro})+1

 

and you get 2.

 

You can also put whole expressions into the macro. If you store sqrt(2) in your macro and calculate

 

eval(%{myMacro})

 

you get back a 1.41....
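
In the process XML, the two notations could sit side by side in one Generate Attributes operator. This is a sketch assuming a hypothetical macro myMacro that holds the value 1; the attribute names asString and asNumber are placeholders. Note that double quotes inside an XML attribute value have to be escaped as &quot;:

```xml
<operator activated="true" class="generate_attributes" compatibility="7.2.002-SNAPSHOT" expanded="true" name="Generate Attributes">
  <list key="function_descriptions">
    <!-- %-notation: the macro is used as a string, so this yields 1bar -->
    <parameter key="asString" value="concat(%{myMacro},&quot;bar&quot;)"/>
    <!-- eval(): the macro content is evaluated first, so this yields 2 -->
    <parameter key="asNumber" value="eval(%{myMacro})+1"/>
  </list>
</operator>
```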

 

#-Notation

The #{attribute_macro} notation is in principle a shortcut for writing eval(%{attribute_macro}), which allows you to access the values of the attribute whose name is stored in the macro.
But there are two important differences between the two:
* #{} will fail when the macro does not contain a valid attribute name
on the other hand
* eval(%{attribute_macro}) will evaluate whatever is contained in the macro, which might fail, e.g., if the attribute name contains a "-" (a name like working-hours is then read as working minus hours)

 

The difference between the notations is shown in this process:

 

<?xml version="1.0" encoding="UTF-8"?><process version="7.2.002-SNAPSHOT">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.2.002-SNAPSHOT" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="7.2.002-SNAPSHOT" expanded="true" height="68" name="Labor-Negotiations" width="90" x="45" y="34">
<parameter key="repository_entry" value="//Samples/data/Labor-Negotiations"/>
</operator>
<operator activated="true" class="set_macros" compatibility="7.2.002-SNAPSHOT" expanded="true" height="82" name="Set Macros" width="90" x="179" y="34">
<list key="macros">
<parameter key="attribute_macro" value="working-hours"/>
</list>
</operator>
<operator activated="true" breakpoints="after" class="select_attributes" compatibility="7.2.002-SNAPSHOT" expanded="true" height="82" name="Select Attributes" width="90" x="380" y="34">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="working-hours"/>
</operator>
<operator activated="true" class="multiply" compatibility="7.2.002-SNAPSHOT" expanded="true" height="103" name="Multiply" width="90" x="514" y="34"/>
<operator activated="true" class="handle_exception" compatibility="7.2.002-SNAPSHOT" expanded="true" height="82" name="Handle Exception" width="90" x="715" y="136">
<parameter key="exception_macro" value="execpt"/>
<process expanded="true">
<operator activated="true" class="generate_attributes" compatibility="7.2.002-SNAPSHOT" expanded="true" height="82" name="Generate Attributes (4)" width="90" x="246" y="34">
<list key="function_descriptions">
<parameter key="res1" value="#{attribute_macro} +4"/>
<parameter key="res2" value="eval(%{attribute_macro}) +4"/>
</list>
</operator>
<connect from_port="in 1" to_op="Generate Attributes (4)" to_port="example set input"/>
<connect from_op="Generate Attributes (4)" from_port="example set output" to_port="out 1"/>
<portSpacing port="source_in 1" spacing="0"/>
<portSpacing port="source_in 2" spacing="0"/>
<portSpacing port="sink_out 1" spacing="0"/>
<portSpacing port="sink_out 2" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" class="generate_attributes" compatibility="7.2.002-SNAPSHOT" expanded="true" height="82" name="Generate Attributes (5)" width="90" x="246" y="34">
<list key="function_descriptions">
<parameter key="res2" value="%{execpt}"/>
</list>
</operator>
<connect from_port="in 1" to_op="Generate Attributes (5)" to_port="example set input"/>
<connect from_op="Generate Attributes (5)" from_port="example set output" to_port="out 1"/>
<portSpacing port="source_in 1" spacing="0"/>
<portSpacing port="source_in 2" spacing="0"/>
<portSpacing port="sink_out 1" spacing="0"/>
<portSpacing port="sink_out 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="generate_attributes" compatibility="7.2.002-SNAPSHOT" expanded="true" height="82" name="Generate Attributes (3)" width="90" x="715" y="34">
<list key="function_descriptions">
<parameter key="res1" value="#{attribute_macro} +4"/>
</list>
</operator>
<connect from_op="Labor-Negotiations" from_port="output" to_op="Set Macros" to_port="through 1"/>
<connect from_op="Set Macros" from_port="through 1" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Multiply" to_port="input"/>
<connect from_op="Multiply" from_port="output 1" to_op="Generate Attributes (3)" to_port="example set input"/>
<connect from_op="Multiply" from_port="output 2" to_op="Handle Exception" to_port="in 1"/>
<connect from_op="Handle Exception" from_port="out 1" to_port="result 2"/>
<connect from_op="Generate Attributes (3)" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>

 

Macros and Execute Process

You can use Macros to parametrize ex

Provided Macros

There are some macros already present which can be used throughout the process:

  • %{process_name}: will be replaced by the name of the process (without path and extension)
  • %{process_file}: will be replaced by the file name of the process (with extension)
  • %{process_path}: will be replaced by the complete absolute path of the process file
  • %{execution_count}: will be replaced by the number of times the current operator was applied.
  • %{operator_name}: will be replaced by the name of the current operator.
  • %{t}: will be replaced by the current time
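
Because these macros behave like any other macro in a value field, they can be combined into new values. The following sketch builds a label from the process name and the current time. The macro name runLabel is hypothetical, and the Set Macro parameter keys macro and value are assumptions to verify in your version:

```xml
<!-- Sketch only: runLabel becomes the process name, an underscore, and the current time -->
<operator activated="true" class="set_macro" compatibility="7.2.002-SNAPSHOT" expanded="true" name="Set Macro">
  <parameter key="macro" value="runLabel"/>
  <parameter key="value" value="%{process_name}_%{t}"/>
</operator>
```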

Advanced Use Cases for Macros

 

 

 

--------------------------------------------------------------------------
Head of Data Science Services at RapidMiner