"R extension - how to get started"

marie Member Posts: 3 Contributor I
edited June 2019 in Help

Hey there,

I am really new to RapidMiner and R, and I did not find anything on the internet about how to get started with the R extension in RapidMiner that really breaks it down to the basics. So I just tried some very simple things, like the max of a column.

The script is the following:

rm_main = function(data)
{
max($Temperature)
return(data)
}

and the error message is:

The script could not be parsed. Please check your R script.

[1] "script.R:5:5: unexpected '$' (....)"

 

Do you know how to solve it?

Or do you know of a resource where one can learn how to get started with the R extension in RapidMiner with just basic knowledge?

Thanks in advance

Marie

 

Ah, and here is the XML:

<?xml version="1.0" encoding="UTF-8"?><process version="7.2.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.2.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="7.2.001" expanded="true" height="68" name="Retrieve Golf" width="90" x="45" y="85">
<parameter key="repository_entry" value="//Samples/data/Golf"/>
</operator>
<operator activated="true" class="r_scripting:execute_r" compatibility="7.2.000" expanded="true" height="82" name="Execute R" width="90" x="246" y="85">
<parameter key="script" value="rm_main = function(data)&#10;{&#10; max($Temperature)&#10;}&#10;"/>
</operator>
<connect from_op="Retrieve Golf" from_port="output" to_op="Execute R" to_port="input 1"/>
<connect from_op="Execute R" from_port="output 1" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
Best Answer

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
    Solution Accepted

    Hi Marie,

     

    That means the R script returned an object that RapidMiner can't visualize in the Results tab. This is why I added the print statement, so you can see the max temperature in the Log view. Take a look at the sample tutorial processes that ship with the Execute R operator: right-click the operator and click on its description. There will be a link for "Jump to Tutorial Processes."

     

    There are about 4 different R examples that explain a bit about how you can embed your scripts inside RapidMiner. Good luck!
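
If the goal is to see the value in the Results view rather than the log, one option is to return a table instead of a bare number. The following is only a minimal sketch: it assumes that Execute R converts a returned data frame into an example set (the tutorial processes are the place to confirm this), and both the column name `max_temperature` and the stand-in `golf_sample` data are hypothetical.

```r
# Sketch only: assumes Execute R turns a returned data frame into a table
# that the Results view can display.
rm_main = function(data)
{
  # Wrap the single maximum in a one-row data frame.
  # "max_temperature" is a hypothetical column name chosen for illustration.
  result <- data.frame(max_temperature = max(data$Temperature))
  print(result)   # also shows the value in the Log view
  return(result)
}

# Stand-in for the data RapidMiner would pass in (values are made up)
golf_sample <- data.frame(Temperature = c(85, 80, 83))
out <- rm_main(golf_sample)
```

Inside RapidMiner only the `rm_main` definition would go into the script parameter; the last two lines exist so the sketch can be tried in a plain R session.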

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Hi,

     

    Working with the Execute R operator is pretty straightforward once you understand how RM delivers the data to the function. See your sample script, modified below.

     

    RM sends its data to the Execute R script and translates it via the data.table package. The raw data comes in as "data" via function(data). From there I assign it to a data frame with golf <- data and then extract the maximum of the Temperature column via output <- max(golf$Temperature). Your original script failed to parse because $Temperature had no object in front of it.

     

    Then I return the output as an object.

     

    I added a print statement so you can see the results in your LOG view.
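
For readability, here is the script from the XML below with the escaping removed, plus a small stand-in data frame so it can be tried in a plain R session. The `golf_sample` values are made up for illustration and are not the actual Golf sample data.

```r
# The script parameter from the process XML, unescaped
rm_main = function(data)
{
  # RapidMiner hands the input example set to the function as "data"
  golf <- data

  # "$" needs an object on its left-hand side: golf$Temperature, not $Temperature
  output <- max(golf$Temperature)

  # Writes the structure of the result to the log
  print(str(output))

  return(output)
}

# Stand-in for the data RapidMiner would pass in (hypothetical values)
golf_sample <- data.frame(Temperature = c(85, 80, 83))
result <- rm_main(golf_sample)
```

Only the `rm_main` definition belongs in the Execute R operator; the last two lines stand in for what RapidMiner does when it calls the function.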

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.2.002">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.2.002" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="7.2.002" expanded="true" height="68" name="Retrieve Golf" width="90" x="45" y="85">
    <parameter key="repository_entry" value="//Samples/data/Golf"/>
    </operator>
    <operator activated="true" class="r_scripting:execute_r" compatibility="7.2.000" expanded="true" height="82" name="Execute R" width="90" x="179" y="85">
    <parameter key="script" value="rm_main = function(data)&#10;{&#10;golf &lt;- data&#10;&#10;output &lt;- max(golf$Temperature)&#10;&#10;print(str(output))&#10;&#10;return(output)&#10;}&#10;"/>
    </operator>
    <connect from_op="Retrieve Golf" from_port="output" to_op="Execute R" to_port="input 1"/>
    <connect from_op="Execute R" from_port="output 1" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
  • marie Member Posts: 3 Contributor I

    Hey Thomas_Ott,

    thank you very much for your quick reply. 

    What you write seems very logical.

    But when I copied the XML, all I get in the Results view is:

    File

    Memory buffered file

     

    What does that mean?

    With kind regards

    Marie

     
