Why is date data missing from the output of Execute R?

zeno_mas Member Posts: 4 Contributor I
edited November 2018 in Help

Hi 

I am trying to pass a data table to Execute R and get it back with additional attributes generated by R. But when I pass the data table to Execute R and look at the output from Execute R, I find that the Date attribute is missing.

1. Save the data in the local repository with the date data type.

2. Simply multiply the data (one copy goes directly to the output, the other is passed to Execute R).

3. A simple do-nothing Execute R script.

4. Output from the R script.

5. Output directly from Multiply.

Could anyone advise me on how to get the data table back unchanged from the Execute R script?

 

Thanks.
Rapidminer_ExecuteR.png

Answers

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    hello @zeno_mas - could you please post your process so we can take a look at it?  Please use the </> tool above.

     

    Thanks.

    Scott

  • zeno_mas Member Posts: 4 Contributor I

    <?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.6.001" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="7.6.001" expanded="true" height="68" name="Retrieve Data" width="90" x="45" y="34">
    <parameter key="repository_entry" value="../data/AMZN_Historical_dt"/>
    </operator>
    <operator activated="true" class="multiply" compatibility="7.6.001" expanded="true" height="103" name="Multiply (3)" width="90" x="179" y="34"/>
    <operator activated="true" class="r_scripting:execute_r" compatibility="7.2.000" expanded="true" height="82" name="Execute R" width="90" x="313" y="85">
    <parameter key="script" value="# rm_main is a mandatory function, &#10;# the number of arguments has to be the number of input ports (can be none)&#10;rm_main = function(data)&#10;{&#10;&#9;return(data)&#10;}&#10;"/>
    </operator>
    <connect from_op="Retrieve Data" from_port="output" to_op="Multiply (3)" to_port="input"/>
    <connect from_op="Multiply (3)" from_port="output 1" to_port="result 1"/>
    <connect from_op="Multiply (3)" from_port="output 2" to_op="Execute R" to_port="input 1"/>
    <connect from_op="Execute R" from_port="output 1" to_port="result 2"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    </process>
    </operator>
    </process>

    @sgenzer Thanks for the quick reply.

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Try converting your date column from a RapidMiner Date type to Polynominal type. 

     

    Sometimes when converting from RM > R, the date times get wonky. 
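
    A minimal sketch of how the R side of this workaround might look, assuming the converted column is called `date` (a hypothetical name) and arrives in R as text in yyyy-MM-dd form. It parses the text into an R Date for the work inside the script and turns it back into text before returning, so no date class has to cross the RM/R boundary:

```r
# Sketch only: 'date' is an assumed column name and yyyy-MM-dd an assumed format.
rm_main = function(data)
{
	# The nominal column may arrive as character or factor; normalise to character first.
	parsed <- as.Date(as.character(data$date), format = "%Y-%m-%d")

	# Example of date work done inside R: days elapsed since the earliest date.
	data$days_from_start <- as.numeric(parsed - min(parsed))

	# Hand the date back as text so the output keeps the column.
	data$date <- format(parsed, "%Y-%m-%d")
	return(list(data))
}
```

    The `days_from_start` attribute is only there to show that the parsed column behaves like a real date inside the script.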

  • zeno_mas Member Posts: 4 Contributor I

    Thank you for your suggestion @Thomas_Ott.

    Yep, that is a workable workaround. In fact, I started with that, and inside Execute R the column can still be detected as a date data type.

    Do you think it is worth reporting this as an issue to the RM team?

     

    Rgds,

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    hi @zeno_mas - just curious.  What are you trying to do in R that cannot be done with RapidMiner operators?

     

    Scott

  • imarkou Member Posts: 6 Contributor I

    Hi @sgenzer,

     

    I know the post is old but I had a similar problem.

    After running a simple R script where the input example set contains a Date time attribute, I get the following error:

    Exception: com.rapidminer.operator.OperatorException
    Message: Script terminated abnormally.
    Stack trace:

    com.rapidminer.extension.rscripting.operator.scripting.AbstractScriptRunner.run(AbstractScriptRunner.java:166)
    com.rapidminer.extension.rscripting.operator.scripting.AbstractScriptingLanguageOperator.doWork(AbstractScriptingLanguageOperator.java:90)
    com.rapidminer.extension.rscripting.operator.scripting.r.RScriptingOperator.doWork(RScriptingOperator.java:73)
    com.rapidminer.operator.Operator.execute(Operator.java:1025)
    com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:77)
    com.rapidminer.operator.ExecutionUnit$2.run(ExecutionUnit.java:812)
    com.rapidminer.operator.ExecutionUnit$2.run(ExecutionUnit.java:807)
    java.security.AccessController.doPrivileged(Native Method)
    com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:807)
    com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:428)
    com.rapidminer.operator.Operator.execute(Operator.java:1025)
    com.rapidminer.Process.execute(Process.java:1322)
    com.rapidminer.Process.run(Process.java:1297)
    com.rapidminer.Process.run(Process.java:1183)
    com.rapidminer.Process.run(Process.java:1136)
    com.rapidminer.Process.run(Process.java:1131)
    com.rapidminer.Process.run(Process.java:1121)
    com.rapidminer.gui.ProcessThread.run(ProcessThread.java:65)

    The same error occurred when using Date attributes. When I convert the date attribute to nominal, the problem is solved. I'm just getting started with the "Execute R" operator, and in this process I used it simply to output the ExampleSet to the RapidMiner results.

    My process is as follows:

     

    <?xml version="1.0" encoding="UTF-8"?><process version="9.0.002">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="9.0.002" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" breakpoints="after" class="subprocess" compatibility="9.0.002" expanded="true" height="82" name="Create ExampleSet" width="90" x="45" y="34">
    <process expanded="true">
    <operator activated="true" class="generate_data" compatibility="9.0.002" expanded="true" height="68" name="Generate Data (2)" width="90" x="45" y="34">
    <parameter key="number_examples" value="10"/>
    <parameter key="number_of_attributes" value="1"/>
    <parameter key="attributes_lower_bound" value="1.0"/>
    </operator>
    <operator activated="true" class="real_to_integer" compatibility="9.0.002" expanded="true" height="82" name="Real to Integer" width="90" x="179" y="34"/>
    <operator activated="true" class="generate_attributes" compatibility="9.0.002" expanded="true" height="82" name="Generate Attributes" width="90" x="313" y="34">
    <list key="function_descriptions">
    <parameter key="date" value="date_add(date_now(), att1, DATE_UNIT_DAY)"/>
    </list>
    </operator>
    <connect from_op="Generate Data (2)" from_port="output" to_op="Real to Integer" to_port="example set input"/>
    <connect from_op="Real to Integer" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
    <connect from_op="Generate Attributes" from_port="example set output" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="false" class="date_to_nominal" compatibility="9.0.002" expanded="true" height="82" name="Date to Nominal" width="90" x="246" y="85">
    <parameter key="attribute_name" value="date"/>
    <parameter key="date_format" value="dd/MM/yyyy"/>
    </operator>
    <operator activated="true" class="r_scripting:execute_r" compatibility="8.1.000" expanded="true" height="82" name="Execute R" width="90" x="447" y="34">
    <parameter key="script" value="rm_main = function(data)&#10;{&#10; print('Hello, world!')&#10; return(list(data))&#10;}&#10;"/>
    </operator>
    <connect from_op="Create ExampleSet" from_port="out 1" to_op="Execute R" to_port="input 1"/>
    <connect from_op="Execute R" from_port="output 1" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>

    The reason I'm using R is that I want to perform STL (Seasonal and Trend decomposition using Loess) on a time series and I didn't find a relevant operator in RapidMiner.
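
    For context, a rough sketch of what such a script could look like using stl() from R's stats package. The column names `date` and `value` and the weekly period of 7 are assumptions for illustration only, and the date is assumed to arrive as yyyy-MM-dd text:

```r
# Sketch only: 'date' and 'value' are hypothetical column names, and a weekly
# season (frequency = 7) on daily data is an assumption.
rm_main = function(data)
{
	# Sort by date so the series is in time order.
	data <- data[order(as.Date(as.character(data$date))), ]

	# stl() needs a ts object covering more than two full periods.
	series <- ts(data$value, frequency = 7)
	fit <- stl(series, s.window = "periodic")

	# Append the three components as new attributes.
	data$seasonal  <- as.numeric(fit$time.series[, "seasonal"])
	data$trend     <- as.numeric(fit$time.series[, "trend"])
	data$remainder <- as.numeric(fit$time.series[, "remainder"])
	return(list(data))
}
```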

     

    Thanks,

    John

  • tftemme Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, RMResearcher, Member Posts: 164 RM Research

    Hi @imarkou,

     

    Just a small teaser concerning the STL Decomposition. With the next release of RapidMiner Studio we will add an operator capable of performing STL.

     

    Best regards,
    Fabian

  • imarkou Member Posts: 6 Contributor I

    Hi @tftemme,

     

    Great to hear that! It will be interesting to give it a try when it's released!

     

    Regarding the problem, as @Thomas_Ott and @zeno_mas mentioned, converting date into polynominal is a solution to the problem. Even when converting date time to polynominal, R recognises the data as POSIXct which is what I wanted for analysing time series data.

     

    However, I was wondering if the exception in my process is because I'm trying to pass data that is not supported by the Execute R operator or due to a bug.

     

    Best regards,

    John

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    cc'ing our resident R expert @yyhuang :)

     

     

  • yyhuang Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 364 RM Data Scientist

    Hi @imarkou,

     

    Thanks for the followup.

     

    As you said, R recognises the data as POSIXct. The special classes for date and time in R are C-based, while the date class in RapidMiner is Java-based.

    See also the notes on issues that arise when converting dates between different systems:

    https://www.rdocumentation.org/packages/base/versions/3.5.1/topics/as.Date

    We suggest you use the as.character() function to convert dates to character strings.

    Page 8 of this R News issue gives a detailed explanation of the development of the date classes in R.

    Example process:

    <?xml version="1.0" encoding="UTF-8"?><process version="9.0.002">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="9.0.002" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="subprocess" compatibility="9.0.002" expanded="true" height="82" name="Create ExampleSet" width="90" x="45" y="34">
    <process expanded="true">
    <operator activated="true" class="generate_data" compatibility="9.0.002" expanded="true" height="68" name="Generate Data (2)" width="90" x="45" y="34">
    <parameter key="number_examples" value="10"/>
    <parameter key="number_of_attributes" value="1"/>
    <parameter key="attributes_lower_bound" value="1.0"/>
    </operator>
    <operator activated="true" class="real_to_integer" compatibility="9.0.002" expanded="true" height="82" name="Real to Integer" width="90" x="179" y="34"/>
    <operator activated="true" class="generate_attributes" compatibility="9.0.002" expanded="true" height="82" name="Generate Attributes" width="90" x="313" y="34">
    <list key="function_descriptions">
    <parameter key="date" value="date_add(date_now(), att1, DATE_UNIT_DAY)"/>
    </list>
    </operator>
    <connect from_op="Generate Data (2)" from_port="output" to_op="Real to Integer" to_port="example set input"/>
    <connect from_op="Real to Integer" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
    <connect from_op="Generate Attributes" from_port="example set output" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="date_to_nominal" compatibility="9.0.002" expanded="true" height="82" name="Date to Nominal" width="90" x="246" y="34">
    <parameter key="attribute_name" value="date"/>
    <parameter key="date_format" value="yyyy-MM-dd"/>
    </operator>
    <operator activated="true" class="r_scripting:execute_r" compatibility="8.1.000" expanded="true" height="82" name="Execute R" width="90" x="447" y="34">
    <parameter key="script" value="rm_main = function(data)&#10;{&#10;&#9;print(data)&#10; return(list(as.data.frame(data)))&#10;}&#10;"/>
    </operator>
    <operator activated="true" class="r_scripting:execute_r" compatibility="8.1.000" expanded="true" height="82" name="Execute R (2)" width="90" x="380" y="187">
    <parameter key="script" value="# rm_main is a mandatory function, &#10;# the number of arguments has to be the number of input ports (can be none)&#10;rm_main = function()&#10;{&#10; &#9;dat &lt;- data.frame(myts = sample(10, 24, replace = T), Date = seq(as.Date(&quot;2008-09-11&quot;), as.Date(&quot;2008-09-11&quot;) + 23, by = 1))&#10; &#9;dat$Date &lt;-as.character(dat$Date)&#10; &#9;return(list(dat))&#10;}&#10;"/>
    </operator>
    <connect from_op="Create ExampleSet" from_port="out 1" to_op="Date to Nominal" to_port="example set input"/>
    <connect from_op="Date to Nominal" from_port="example set output" to_op="Execute R" to_port="input 1"/>
    <connect from_op="Execute R" from_port="output 1" to_port="result 1"/>
    <connect from_op="Execute R (2)" from_port="output 1" to_port="result 2"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    </process>
    </operator>
    </process>
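
    For readability, here is the script from the "Execute R (2)" operator in the process above with the XML escaping removed (and `T` written out as `TRUE`). It illustrates the suggested approach: the dates are built as R Date objects and converted with as.character() before being returned:

```r
# rm_main is a mandatory function,
# the number of arguments has to be the number of input ports (can be none)
rm_main = function()
{
	# 24 random values paired with 24 consecutive days starting on 2008-09-11
	dat <- data.frame(myts = sample(10, 24, replace = TRUE),
	                  Date = seq(as.Date("2008-09-11"), as.Date("2008-09-11") + 23, by = 1))
	# Convert the Date column to character before handing it back to RapidMiner
	dat$Date <- as.character(dat$Date)
	return(list(dat))
}
```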

     

    YY

  • imarkou Member Posts: 6 Contributor I

    Thanks a lot for the detailed explanation @yyhuang!
