Predict Values

chelanz Member Posts: 1 Learner III
edited November 2018 in Help
This is a repost. The post below isn't mine, but we have the same dilemma and no one replied to it. I hope this post can be answered. Thank you! All efforts will be much appreciated. Cheers, Chel

Hi,

I am doing an academic project on stock prediction. While trying to figure out how SVMs work, I came across RapidMiner. I have been using it for the last 2 hours and cannot figure out how to predict values for future dates (horizon > 1). I increased the horizon, but then it shows one future value for every row of the input data (if the horizon is 5, each input row gets one value, which is supposed to be the predicted value for the 5th day after that input). Is there any way to display the future values in proper sequence, e.g. day 1 - predicted value 1, day 2 - predicted value 2, etc.?
Also, is there any way to improve the prediction accuracy?
Also, can I somehow incorporate just this prediction module into the Java code for my GUI, or should I call RapidMiner explicitly from my Java program? (I just want to use the SVM prediction module, not all the features of RapidMiner.)
It would be great if you could help me out.
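
To make the horizon behaviour described above concrete, here is a minimal sketch in plain Python. It is not RapidMiner's code; the function name `window_with_horizon` is made up for illustration, and it assumes the label is taken `horizon` steps after the last value in each window. The sample numbers are the first eight values from the third column of train.csv.

```python
def window_with_horizon(series, window_size, horizon):
    """Return (window, label) pairs.

    Assumption: the label is the value `horizon` steps after the
    last value of the window, so each row gets exactly one label.
    """
    rows = []
    last_start = len(series) - window_size - horizon
    for start in range(last_start + 1):
        window = series[start:start + window_size]
        label = series[start + window_size - 1 + horizon]
        rows.append((window, label))
    return rows


closes = [564.08, 562.48, 562.76, 562.3, 562.17, 562.08, 561.658, 561.52]
rows = window_with_horizon(closes, window_size=1, horizon=5)
for window, label in rows:
    print(window, "->", label)
```

With `window_size=1` and `horizon=5`, every row is paired with a single value five steps later, which is why a model trained on such rows outputs one prediction per input row rather than a day 1, day 2, ... sequence.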

Here is the XML of my test process:


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input>
      <location/>
    </input>
    <output>
      <location/>
      <location/>
      <location/>
      <location/>
    </output>
    <macros/>
  </context>
  <operator activated="true" class="process" expanded="true" name="Process">
    <process expanded="true" height="423" width="763">
      <operator activated="true" class="read_csv" expanded="true" height="60" name="Read CSV" width="90" x="45" y="30">
        <parameter key="file_name" value="C:\Users\Rj\Downloads\train.csv"/>
      </operator>
      <operator activated="true" class="set_role" expanded="true" height="76" name="Set Role" width="90" x="179" y="30">
        <parameter key="name" value="1"/>
        <parameter key="target_role" value="id"/>
      </operator>
      <operator activated="true" class="series:windowing" expanded="true" height="76" name="Windowing" width="90" x="313" y="30">
        <parameter key="horizon" value="5"/>
        <parameter key="window_size" value="1"/>
        <parameter key="create_label" value="true"/>
        <parameter key="label_attribute" value="564.08"/>
      </operator>
      <operator activated="true" class="series:sliding_window_validation" expanded="true" height="112" name="Validation" width="90" x="447" y="30">
        <parameter key="training_window_width" value="5"/>
        <parameter key="training_window_step_size" value="1"/>
        <parameter key="test_window_width" value="5"/>
        <process expanded="true">
          <operator activated="true" class="nominal_to_numerical" expanded="true" height="94" name="Nominal to Numerical" width="90" x="45" y="255"/>
          <operator activated="true" class="support_vector_machine" expanded="true" height="112" name="SVM" width="90" x="112" y="75">
            <parameter key="kernel_degree" value="5.0"/>
            <parameter key="C" value="1.0"/>
          </operator>
          <connect from_port="training" to_op="Nominal to Numerical" to_port="example set input"/>
          <connect from_op="Nominal to Numerical" from_port="example set output" to_op="SVM" to_port="training set"/>
          <connect from_op="SVM" from_port="model" to_port="model"/>
          <portSpacing port="source_training" spacing="0"/>
          <portSpacing port="sink_model" spacing="0"/>
          <portSpacing port="sink_through 1" spacing="0"/>
        </process>
        <process expanded="true">
          <operator activated="true" class="apply_model" expanded="true" height="76" name="Apply Model" width="90" x="66" y="30">
            <list key="application_parameters"/>
          </operator>
          <operator activated="true" class="series:forecasting_performance" expanded="true" height="76" name="Performance" width="90" x="195" y="25">
            <parameter key="horizon" value="1"/>
          </operator>
          <connect from_port="model" to_op="Apply Model" to_port="model"/>
          <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
          <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
          <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
          <portSpacing port="source_model" spacing="0"/>
          <portSpacing port="source_test set" spacing="0"/>
          <portSpacing port="source_through 1" spacing="0"/>
          <portSpacing port="sink_averagable 1" spacing="0"/>
          <portSpacing port="sink_averagable 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="read_csv" expanded="true" height="60" name="Read CSV (2)" width="90" x="45" y="255">
        <parameter key="file_name" value="C:\Users\Rj\Downloads\test.csv"/>
      </operator>
      <operator activated="true" class="set_role" expanded="true" height="76" name="Set Role (2)" width="90" x="179" y="255">
        <parameter key="name" value="1"/>
        <parameter key="target_role" value="id"/>
      </operator>
      <operator activated="true" class="series:windowing" expanded="true" height="76" name="Windowing (2)" width="90" x="313" y="255">
        <parameter key="window_size" value="1"/>
        <parameter key="label_attribute" value="562.21"/>
      </operator>
      <operator activated="true" class="apply_model" expanded="true" height="76" name="Apply Model (2)" width="90" x="492" y="261">
        <list key="application_parameters"/>
      </operator>
      <connect from_op="Read CSV" from_port="output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Windowing" to_port="example set input"/>
      <connect from_op="Windowing" from_port="example set output" to_op="Validation" to_port="training"/>
      <connect from_op="Validation" from_port="model" to_op="Apply Model (2)" to_port="model"/>
      <connect from_op="Validation" from_port="training" to_port="result 1"/>
      <connect from_op="Validation" from_port="averagable 1" to_port="result 2"/>
      <connect from_op="Read CSV (2)" from_port="output" to_op="Set Role (2)" to_port="example set input"/>
      <connect from_op="Set Role (2)" from_port="example set output" to_op="Windowing (2)" to_port="example set input"/>
      <connect from_op="Windowing (2)" from_port="example set output" to_op="Apply Model (2)" to_port="unlabelled data"/>
      <connect from_op="Apply Model (2)" from_port="labelled data" to_port="result 3"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
      <portSpacing port="sink_result 4" spacing="0"/>
    </process>
  </operator>
</process>


The CSV files I used had the following data:
train.csv
1  GOOG  564.08  564.78  561.01  565.18
2  GOOG  562.48  564.78  561.01  565.18
3  GOOG  562.76  559.46  558.71  564.66
4  GOOG  562.3  559.46  558.71  564.66
5  GOOG  562.17  559.46  558.71  564.66
6  GOOG  562.08  559.46  558.71  564.66
7  GOOG  561.658  559.46  558.71  564.66
8  GOOG  561.52  559.46  558.71  564.66
9  GOOG  560.548  559.46  558.71  564.66
10  GOOG  560.19  559.46  556.5  564.66
11  GOOG  562.77  563.75  562.4  564.22
12  GOOG  564.95  563.75  562.21  565.85
13  GOOG  566.87  563.75  562.21  568
14  GOOG  571.01  563.75  562.21  571.22
15  GOOG  571.89  563.75  562.21  571.909
16  GOOG  570.8115  563.75  562.21  572
17  GOOG  567.34  563.75  562.21  572
18  GOOG  569.2  563.75  562.21  572
19  GOOG  570.73  563.75  562.21  572
20  GOOG  570.13  563.75  562.21  572
21  GOOG  572.16  563.75  562.21  572.2

test.csv
1  GOOG  575.22  563.75  562.21  575.25
2  GOOG  575.16  563.75  562.21  578.5

I want to predict future values (for the next 10 days) using input from test.csv. Is there any way to predict all 10 values (with as high accuracy as possible) and display them too?
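
One common way to get a day-by-day sequence is the "direct" strategy: train one model per horizon (1 to 10) and apply all of them to the latest known value. The sketch below is plain Python, not RapidMiner, and only illustrates that idea. A hand-written least-squares line stands in for the SVM, and the names `fit_line` and `direct_forecast` are hypothetical. The series is the third column of train.csv, and the starting value is the last row of test.csv.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b (stand-in for the SVM)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # All x values identical: fall back to predicting the mean.
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def direct_forecast(series, last_value, max_horizon):
    """Fit one model per horizon h and return [(h, prediction), ...]."""
    forecasts = []
    for h in range(1, max_horizon + 1):
        xs = series[:-h]   # value at day t
        ys = series[h:]    # value at day t + h
        a, b = fit_line(xs, ys)
        forecasts.append((h, a * last_value + b))
    return forecasts


train = [564.08, 562.48, 562.76, 562.3, 562.17, 562.08, 561.658,
         561.52, 560.548, 560.19, 562.77, 564.95, 566.87, 571.01,
         571.89, 570.8115, 567.34, 569.2, 570.73, 570.13, 572.16]

forecasts = direct_forecast(train, last_value=575.16, max_horizon=10)
for day, value in forecasts:
    print(f"day {day} - predicted value {value:.2f}")
```

With only 21 training rows, the models for the longer horizons are fitted on very few pairs, so the later values in the sequence should be treated as rough.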