"R scripts"

frankiefrankie Member Posts: 26 Contributor II
edited May 2019 in Help

Pardon my ignorance, but I can't seem to import a dataset and do a simple computation on it using the "Execute R script" operator.

- First, can I connect any RM dataset directly to the R operator?
- Is the logic behind the input that if I name it "MyData", I can refer to any variable from this dataset with the normal "MyData$variable_name" syntax, i.e. the way it would be done in R?
- I have tried this, but the very brief tutorial video is hard to follow and I cannot work out what I'm doing wrong. All I get is the error "The data delivered by R in the variable {0} was not in the correct format for importing as an ExampleSet"

Thanks in advance,


    landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Frankie,
    that's what the operator is designed for.

    You can export any data set, but R may see the data differently. For example, a time attribute will be exported as milliseconds since 1970, I think, so you have to be careful with those.
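To illustrate the point about time attributes, here is a minimal sketch of turning such values back into R date-times. It assumes the numbers are milliseconds since the Unix epoch (1970-01-01 UTC), the usual Java convention; the `ms` vector is made-up sample data, so check the assumption against your own export.

```r
# Hypothetical timestamps as they might arrive in R:
# numeric milliseconds since 1970-01-01 00:00:00 UTC (an assumption).
ms <- c(0, 86400000, 1262304000000)

# as.POSIXct() expects seconds, so divide by 1000 first.
ts <- as.POSIXct(ms / 1000, origin = "1970-01-01", tz = "UTC")

print(format(ts, "%Y-%m-%d %H:%M:%S"))
```

If the converted dates look decades off, the epoch assumption is probably wrong for your data.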
    The data is then exported as a data frame under the variable name you enter. If you name it "MyData" you can use the normal methods of accessing a data frame called "MyData" in R.
    If you import data back from the R script, the variable name you give must refer to a data frame containing only vectors and factors. If the variable is called "MyImport", you can also define "MyImport.label", which may refer to the name of the column that is used as the label.
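The two rules above (input arrives as a data frame, output must be a data frame of vectors and factors) might look like this in a script. This is only a sketch: the `MyData` frame is built by hand here as a stand-in for what the operator would supply, the column names are illustrative, and the `MyImport.label` line follows the description in this reply without having been verified against the extension.

```r
# Stand-in for the data frame RapidMiner would provide under the name "MyData".
MyData <- data.frame(
  a1 = c(5.1, 4.9, 4.7),
  a2 = c(3.5, 3.0, 3.2),
  label = factor(c("Iris-setosa", "Iris-setosa", "Iris-setosa"))
)

# Columns are accessed exactly as in a normal R session.
mean_a1 <- mean(MyData$a1)

# Build the result as a plain data frame holding only vectors and factors.
MyImport <- data.frame(
  a1 = MyData$a1,
  a2 = MyData$a2,
  label = MyData$label
)
MyImport$a1_centered <- MyImport$a1 - mean_a1

# Assumption taken from the reply above: a companion variable naming the label column.
MyImport.label <- "label"

# Sanity check before handing the data back: every column is a vector or a factor.
stopifnot(all(sapply(MyImport, function(col) is.atomic(col) || is.factor(col))))
str(MyImport)
```

Returning a list, a matrix, or a data frame with nested columns instead is one plausible cause of the "not in the correct format" error quoted in the question.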

    frankiefrankie Member Posts: 26 Contributor II
    Could someone please provide a simple example of how I can use R code on a RapidMiner dataset? An example is so much easier to understand, so please, if somebody could find the time.

    I've been trying to build a simple process that:

    1. Retrieves the Iris dataset that comes bundled with RM
    2. Uses R code to sum two of the variables, say a1+a2

    IngoRMIngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder

    Sure, no problem. Here you go:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <operator activated="true" class="process" compatibility="5.0.8" expanded="true" name="Process">
        <process expanded="true" height="296" width="413">
          <operator activated="true" class="retrieve" compatibility="5.0.8" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
            <parameter key="repository_entry" value="//Samples/data/Iris"/>
          </operator>
          <operator activated="true" class="multiply" compatibility="5.0.8" expanded="true" height="94" name="Multiply" width="90" x="179" y="30"/>
          <operator activated="true" class="generate_attributes" compatibility="5.0.8" expanded="true" height="76" name="Generate Attributes" width="90" x="313" y="120">
            <list key="function_descriptions">
              <parameter key="Sum" value="a1+a2"/>
            </list>
          </operator>
          <operator activated="true" class="r:execute_script_r" compatibility="5.0.1" expanded="true" height="76" name="Execute Script (R)" width="90" x="313" y="30">
            <parameter key="script" value="x &lt;- as.data.frame(c(data, data[1] + data[2]))&#10;colnames(x)[6] = &quot;Sum&quot;"/>
            <enumeration key="inputs">
              <parameter key="name_of_variable" value="data"/>
            </enumeration>
            <list key="results">
              <parameter key="x" value="Data Table"/>
            </list>
          </operator>
          <connect from_op="Retrieve" from_port="output" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Execute Script (R)" to_port="input 1"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Generate Attributes" to_port="example set input"/>
          <connect from_op="Generate Attributes" from_port="example set output" to_port="result 2"/>
          <connect from_op="Execute Script (R)" from_port="output 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="72"/>
          <portSpacing port="sink_result 3" spacing="0"/>
        </process>
      </operator>
    </process>
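For readers who find the XML-escaped `script` parameter hard to read, here it is as plain R, run against a small hand-made stand-in for the Iris data. The stand-in frame and its five-column layout (a1 to a4 plus label) are assumptions made for illustration; in the process itself the operator supplies `data`, and the hard-coded index 6 only works if the incoming frame has exactly five columns.

```r
# Stand-in for the Iris ExampleSet; in the real process the operator provides "data".
# The column layout below is assumed, not taken from the RapidMiner sample.
data <- data.frame(
  a1 = c(5.1, 7.0, 6.3),
  a2 = c(3.5, 3.2, 3.3),
  a3 = c(1.4, 4.7, 6.0),
  a4 = c(0.2, 1.4, 2.5),
  label = factor(c("Iris-setosa", "Iris-versicolor", "Iris-virginica"))
)

# The script from the process, unescaped: append the sum of the first two
# columns as a sixth column, then rename that column to "Sum".
x <- as.data.frame(c(data, data[1] + data[2]))
colnames(x)[6] = "Sum"

# An equivalent form that does not depend on column positions.
x2 <- data
x2$Sum <- data$a1 + data$a2

print(x)
```

The second form is arguably easier to maintain, since it keeps working if the incoming data frame gains or loses a column.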