ARIMA & R

Maerkli Member Posts: 84 Guru
edited December 2018 in Help

Hello,

I am trying to replicate an ARIMA example found at www.rapidminer.com. Here is the process XML:

<?xml version="1.0" encoding="UTF-8"?><process version="7.4.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.4.000" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" class="quantx1:yahoo_historical_data_extractor" compatibility="1.0.006" expanded="true" height="82" name="Yahoo Historical Stock Data" width="90" x="45" y="120">
        <parameter key="I agree to abide by Yahoo's Terms &amp; Conditions on financial data usage" value="true"/>
        <parameter key="Quick Stock Ticker Data" value="true"/>
        <parameter key="Stock Ticker" value="S&amp;P"/>
        <parameter key="select_fields" value="VOLUME|OPEN|DAY_LOW|DAY_HIGH|CLOSE|ADJUSTED_CLOSE"/>
        <parameter key="date_format" value="yyyy-MM-dd"/>
        <parameter key="date_start" value="2013-01-01"/>
        <parameter key="date_end" value="2015-06-03"/>
        <parameter key="data_frequency" value="DAILY"/>
        <parameter key="Cache Data in Memory" value="false"/>
      </operator>
      <operator activated="true" class="rename" compatibility="7.4.000" expanded="true" height="82" name="Rename" width="90" x="179" y="120">
        <parameter key="old_name" value="S&amp;P_ADJUSTED_CLOSE"/>
        <parameter key="new_name" value="AClose"/>
        <list key="rename_additional_attributes">
          <parameter key="S&amp;P_CLOSE" value="Close"/>
          <parameter key="S&amp;P_DAY_HIGH" value="High"/>
          <parameter key="S&amp;P_DAY_LOW" value="Low"/>
          <parameter key="S&amp;P_OPEN" value="Open"/>
          <parameter key="S&amp;P_VOLUME" value="Volume"/>
        </list>
      </operator>
      <operator activated="true" class="multiply" compatibility="7.4.000" expanded="true" height="124" name="Multiply" width="90" x="313" y="120"/>
      <operator activated="true" class="r_scripting:execute_r" compatibility="7.2.000" expanded="true" height="82" name="Forecasting" width="90" x="715" y="435">
        <parameter key="script" value="### Fit ARIMA(3,1,3) in R and forecast the next 5 closing prices&#10;rm_main = function(data)&#10;{&#10;    &#9;library(forecast)&#10;    &#9;sp &lt;- data&#10;&#9;sp$Date &lt;- as.Date(sp$Date)&#10;&#9;arima &lt;- arima(ts(sp$Close), order=c(3,1,3))&#10;&#9;print(arima)&#10;&#9;arimaforecast &lt;- forecast(arima, h=5)&#10;&#9;print(arimaforecast)&#10;    &#9;return(as.data.frame(arimaforecast))&#10;}&#10;"/>
      </operator>
      <operator activated="true" class="optimize_parameters_grid" compatibility="7.4.000" expanded="true" height="103" name="Optimize Parameters (Grid)" width="90" x="514" y="300">
        <list key="parameters">
          <parameter key="Set p.value" value="[0;3;3;linear]"/>
          <parameter key="Set d.value" value="[0.0;2;2;linear]"/>
          <parameter key="Set q.value" value="[0.0;4;4;linear]"/>
        </list>
        <parameter key="error_handling" value="fail on error"/>
        <process expanded="true">
          <operator activated="true" class="set_macro" compatibility="7.4.000" expanded="true" height="76" name="Set p" width="90" x="112" y="30">
            <parameter key="macro" value="p"/>
            <parameter key="value" value="3.0"/>
          </operator>
          <operator activated="true" class="set_macro" compatibility="7.4.000" expanded="true" height="76" name="Set d" width="90" x="112" y="120">
            <parameter key="macro" value="d"/>
            <parameter key="value" value="2.0"/>
          </operator>
          <operator activated="true" class="set_macro" compatibility="7.4.000" expanded="true" height="76" name="Set q" width="90" x="112" y="210">
            <parameter key="macro" value="q"/>
            <parameter key="value" value="4.0"/>
          </operator>
          <operator activated="true" class="r_scripting:execute_r" compatibility="7.2.000" expanded="true" height="112" name="ARIMA" width="90" x="447" y="75">
            <parameter key="script" value="### Call this R scripts to get AIC from ARIMA models&#10;rm_main = function(data)&#10;{&#10;    &#9;sp &lt;- data&#10;&#9;sp$Date &lt;- as.Date(sp$Date)&#10;&#9;arima &lt;- arima(sp$Close, order=c(%{p},%{d},%{q}))&#10;&#9;#print(arima$aic)&#10;    &#9;return(as.data.table(arima$aic))&#10;}&#10;"/>
            <description align="center" color="transparent" colored="false" width="126">Fit ARIMA model in R with different (p,d,q)</description>
          </operator>
          <operator activated="true" class="extract_performance" compatibility="7.4.000" expanded="true" height="76" name="Performance" width="90" x="581" y="75">
            <parameter key="performance_type" value="data_value"/>
            <parameter key="statistics" value="average"/>
            <parameter key="attribute_name" value="V1"/>
            <parameter key="example_index" value="1"/>
            <parameter key="optimization_direction" value="minimize"/>
          </operator>
          <operator activated="true" class="log" compatibility="7.4.000" expanded="true" height="76" name="Log" width="90" x="715" y="75">
            <list key="log">
              <parameter key="aic" value="operator.Performance.value.performance"/>
              <parameter key="p" value="operator.Set p.parameter.value"/>
              <parameter key="d" value="operator.Set d.parameter.value"/>
              <parameter key="q" value="operator.Set q.parameter.value"/>
            </list>
            <parameter key="sorting_type" value="none"/>
            <parameter key="sorting_k" value="100"/>
            <parameter key="persistent" value="false"/>
          </operator>
          <connect from_port="input 1" to_op="Set p" to_port="through 1"/>
          <connect from_op="Set p" from_port="through 1" to_op="ARIMA" to_port="input 1"/>
          <connect from_op="Set d" from_port="through 1" to_op="ARIMA" to_port="input 2"/>
          <connect from_op="Set q" from_port="through 1" to_op="ARIMA" to_port="input 3"/>
          <connect from_op="ARIMA" from_port="output 1" to_op="Performance" to_port="example set"/>
          <connect from_op="Performance" from_port="performance" to_op="Log" to_port="through 1"/>
          <connect from_op="Log" from_port="through 1" to_port="performance"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_performance" spacing="36"/>
          <portSpacing port="sink_result 1" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Yahoo Historical Stock Data" from_port="example set" to_op="Rename" to_port="example set input"/>
      <connect from_op="Rename" from_port="example set output" to_op="Multiply" to_port="input"/>
      <connect from_op="Multiply" from_port="output 1" to_port="result 1"/>
      <connect from_op="Multiply" from_port="output 2" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
      <connect from_op="Multiply" from_port="output 3" to_op="Forecasting" to_port="input 1"/>
      <connect from_op="Forecasting" from_port="output 1" to_port="result 3"/>
      <connect from_op="Optimize Parameters (Grid)" from_port="performance" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="90"/>
      <portSpacing port="sink_result 2" spacing="162"/>
      <portSpacing port="sink_result 3" spacing="126"/>
      <portSpacing port="sink_result 4" spacing="36"/>
      <description align="center" color="yellow" colored="false" height="62" resized="true" width="816" x="305" y="18">Look at Economic Time Series Data (automatically pulled) from public sites and integrate with ARIMA in R extension</description>
      <description align="center" color="yellow" colored="false" height="133" resized="true" width="635" x="490" y="83">Charts for data. Identify any unusual observations for all attributes: day low, high, open, close, adjusted close, volume</description>
      <description align="center" color="yellow" colored="false" height="177" resized="true" width="626" x="500" y="228">Find the optimized parameter for ARIMA (iterative, and TAKES TIME!! about 1 min)&lt;br&gt;Use R extension for ARIMA models&lt;br&gt;for this demo data, we have ARIMA(3,1,3) as the best fit&lt;br/&gt;To choose the best fit model: check Log result, rank by AIC&lt;br/&gt;and find the values of p, d, q corresponding to min AIC</description>
      <description align="center" color="yellow" colored="false" height="116" resized="true" width="415" x="713" y="414">Apply ARIMA(3,1,3) for forecasting&lt;br&gt;predict the next 5 days close price&lt;br&gt;</description>
    </process>
  </operator>
</process>

 

Question 1: At the Rename operator, the message "Attribute not found" pops up. I tried modifying the parameters, but the issue remained, so I finally removed this operator.

Question 2: With the Rename operator removed, the message "Rscript not found" shows up at the Forecasting (Execute R) operator.

R version 3.5.0 (2018-04-23) is installed.

Can anyone help?

Regards,

Maerkli

 


Best Answers

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Solution Accepted

    Hello @Maerkli,

    To answer Question 1:

    It seems that the Yahoo Historical Stock Data operator is down: when I set a "breakpoint after" on this operator, the resulting example set is empty, so the Rename operator cannot find the attributes it is supposed to rename and raises the error you described.

    Regards,

    Lionel

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Solution Accepted

    Hi,

    Indeed, this extension is dead. My memory is failing me: we talked about that in this thread.

    @Maerkli, in that thread you can find a link to a presentation of the Alpha Vantage API (an alternative to the Yahoo API).

    This API could be useful if you are looking for financial data.

    Regards,

    Lionel

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
    Solution Accepted

    @lionelderkrikor In search engines we trust. My memory is long gone. 

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Solution Accepted

    You can also check out Quandl if you are willing to pay for a premium subscription. They have very complete financial data available through friendly APIs that work well with RapidMiner.

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • Maerkli Member Posts: 84 Guru
    Solution Accepted

    Hello RapidMiner Community,

    Many thanks for your cooperation.

    Maerkli

Answers

  • Maerkli Member Posts: 84 Guru

    It is true, Lionel. My data file is empty. Thanks for the support.

    Maerkli

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @Maerkli and @lionelderkrikor, there's been a lot of discussion in the community (go search for it) about the old Fin/Econ extension. 1) It's dead; no one is updating it anymore. 2) Yahoo changed its internals to make it harder to extract stock data.

    So you're left with using something else, or with manually downloading the stock prices as a CSV file and loading it with a Read CSV operator.
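
As a rough illustration of the manual CSV route, here is a minimal Python sketch (it could be adapted for a scripting operator, or used as a pre-check outside RapidMiner). It assumes a Yahoo-style header (`Date, Open, High, Low, Close, Adj Close, Volume`), which may not match your download. `COLUMN_MAP` and `load_prices` are hypothetical names, and the sample rows are made up. The function renames the columns to the names used in the process above and fails early on an empty file, instead of letting the problem surface later as "Attribute not found".

```python
import csv
import io

# Assumed mapping from a Yahoo-style CSV header to the attribute names
# that the Rename operator in the original process produces.
COLUMN_MAP = {
    "Date": "Date",
    "Open": "Open",
    "High": "High",
    "Low": "Low",
    "Close": "Close",
    "Adj Close": "AClose",
    "Volume": "Volume",
}


def load_prices(handle):
    """Read daily price rows from an open CSV file and rename the columns.

    Raises ValueError if the file has no data rows or lacks a mapped column.
    """
    rows = list(csv.DictReader(handle))
    if not rows:
        raise ValueError("price file contains no data rows")
    missing = [name for name in COLUMN_MAP if name not in rows[0]]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    return [{new: row[old] for old, new in COLUMN_MAP.items()} for row in rows]


# Tiny in-memory sample standing in for a downloaded file (made-up numbers).
SAMPLE = io.StringIO(
    "Date,Open,High,Low,Close,Adj Close,Volume\n"
    "2015-06-01,10.0,10.5,9.8,10.2,10.2,1000\n"
    "2015-06-02,10.2,10.9,10.1,10.8,10.8,1500\n"
)
prices = load_prices(SAMPLE)
print(prices[0]["AClose"], len(prices))
```

With a real download, you would pass `open("prices.csv", newline="")` instead of the in-memory sample; the file name is only a placeholder.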

  • websiteguy Member Posts: 24 Maven

    http://investexcel.net/multiple-stock-quote-downloader-for-excel/

    This still works, and it's great: you can download stock data for, say, 100 AIM tickers, and each one gets written to its own CSV.

    You can merge the files and find, say, all stocks priced under X with average volume Y over N days, to quickly whittle the list down to low-cost stocks with growing interest.

    I picked ELan oil and gas with this method at 0.27; it touched 1.37 a couple of days ago.

    I have not incorporated this into RapidMiner yet. It would be great to load data for hundreds of tickers and perform some filtering analysis on them ...
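
The "under X with average volume Y over N days" filter could look roughly like the Python sketch below. This is only an illustration under stated assumptions: `screen` is a hypothetical helper, the tickers `AAA`, `BBB` and `CCC` and their numbers are invented, and each ticker's rows are assumed to be already parsed into numbers and sorted by date.

```python
from statistics import mean


def screen(tickers, max_price, min_avg_volume, days):
    """Return tickers whose latest close is below max_price and whose mean
    volume over the last `days` rows is at least min_avg_volume.

    `tickers` maps a symbol to its rows in ascending date order; each row is
    a dict with numeric "Close" and "Volume" values.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    selected = []
    for symbol, rows in tickers.items():
        recent = rows[-days:]
        if len(recent) < days:
            continue  # not enough history to compute the average
        last_close = recent[-1]["Close"]
        avg_volume = mean(row["Volume"] for row in recent)
        if last_close < max_price and avg_volume >= min_avg_volume:
            selected.append(symbol)
    return sorted(selected)


# Invented sample data: one cheap and active stock, one expensive, one illiquid.
SAMPLE = {
    "AAA": [{"Close": 0.30, "Volume": 5000},
            {"Close": 0.28, "Volume": 7000},
            {"Close": 0.27, "Volume": 9000}],
    "BBB": [{"Close": 5.10, "Volume": 20000},
            {"Close": 5.00, "Volume": 21000},
            {"Close": 5.20, "Volume": 22000}],
    "CCC": [{"Close": 0.50, "Volume": 100},
            {"Close": 0.52, "Volume": 200},
            {"Close": 0.51, "Volume": 300}],
}
picks = screen(SAMPLE, max_price=1.0, min_avg_volume=1000, days=3)
print(picks)
```

Tickers with fewer than N rows are skipped rather than averaged over a shorter window, so a newly listed stock does not pass the filter on one busy day.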

  • websiteguy Member Posts: 24 Maven

    Hi, just a quick update: it seems you sometimes have to click the "get stock quote" button twice to get the cookie.

    London Stock Exchange AIM list tickers

    You can also get cryptocurrencies this way using the spreadsheet, for example:

    XRP-USD
    BTC-USD
    ETH-USD

    LSE.csv 200.3K