
How to perform univariate time series prediction using a Support Vector Machine

mallsunita13 Member Posts: 4 Contributor I
edited November 2018 in Help

Hi,

 

I am new to RapidMiner. I am working on a project in which I use the Support Vector Machine operator to perform univariate time series prediction. My dataset covers nine non-consecutive weeks from February 2014 to October 2014. The single variable is "Total bytes", which is the size of emails. I am using the Cross-Validation operator, with eight weeks of data for training and the ninth week's data for testing.

Can anyone please help me solve this problem?

 

Thank you

 

 

Sunita 

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    Hi Sunita and welcome!

     

    Your question is a bit unclear. Are you trying to use the prior weeks' values for total bytes to predict the last week's value?  If so, you can use the Time Series operators (available in the free extension), in particular the "predict series" operator, although you may need to do a little bit of ETL to get your data into the correct format.

    If you are trying to predict "total bytes" as a function of other attributes, then you can use any of the other approaches in predictive modeling, as long as you are using attribute values from before the point in time that you are trying to predict.
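    To make the "values from before the point in time" idea concrete, here is a minimal Python sketch (not RapidMiner code) of turning a single series into training rows built only from earlier values. The function name `make_windows` and the sample numbers are hypothetical and used for illustration only.

```python
def make_windows(series, window_size):
    """Turn a univariate series into (features, label) pairs.

    Each pair holds `window_size` consecutive past values as features and the
    value that immediately follows them as the label, so a label never depends
    on information from its own time step or later.
    """
    rows = []
    for end in range(window_size, len(series)):
        features = series[end - window_size:end]  # strictly earlier values
        label = series[end]                       # the value to predict
        rows.append((features, label))
    return rows


# Hypothetical email sizes in bytes, ordered by time (illustrative values only).
total_bytes = [120, 135, 128, 150, 160, 155, 170]

for features, label in make_windows(total_bytes, window_size=3):
    print(features, "->", label)
```

    Any regression learner can then be trained on these rows. A series shorter than or equal to the window size produces no rows at all.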

     

    Recall that in either case you should set the role of "total bytes" to "label" using the "set role" operator so RapidMiner knows that is the attribute that you are trying to predict.

     

     

     

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • mallsunita13 Member Posts: 4 Contributor I

    Hello,

     

    Thank you for your response. I watched Thomas Ott's video on time series prediction and built a model. There is only one variable in my dataset, "Total bytes" (size of emails). So I retrieve the eight weeks of data, apply the Set Role operator, and set "Total bytes" as "Label".

    When I choose the windowing operator and set "Series representation" to "encode-series-by-attribute" with a window size of "10", I get the error message "The parameter window-size specifies a window size, but the value 10 exceeds the number of attributes".

    I applied the same steps to the ninth week's data and connected both (the eight weeks of data and the ninth week's data) to the "cross-validation" operator and "apply model".

    My question is 

    • Can I use the "encode-series-by-attribute" option instead of "encode-series-by-example", and how do I solve this error message?

    If I set "Series representation" to "encode-series-by-example" and the window size to "10" and then run the process, I get an error message in the cross-validation subprocess (Apply Model): "the input ExampleSet does not match the training ExampleSet. Missing Attribute:'Total weeks-9=Week1'."

     

    Actually, I had nine Excel files of email sizes. Each file is named after its week (Week1, Week2, ..., Week9) and contains only one attribute, "Total bytes". I combined all the files and added a second column, Week, to indicate which week each email size belongs to (week1, ..., week9). I want to use eight weeks of data to predict the ninth week.

    Please guide me.

     

    Thanks

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    @mallsunita13  I am still not completely clear on the structure of your data, but I've attached a very simple process that produces a time series forecast using only a couple of operators.  This data structure has only two columns (attributes), a date and a value.  After setting the role and sorting the data, the predict series operator is used and you can see how the forecasts change by defining different window values. There isn't any need for any of the encode series transformations here.  Assuming your data is similar to the example file shown here you should be able to adapt this process for your own purposes.  Otherwise @Thomas_Ott may be able to answer questions about his original video or related series questions better than I can.
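    For readers without the attached process, the general shape of "two columns, sort by date, forecast with a window" can be sketched in plain Python. This is only an illustration: the window mean used here is a stand-in and is not claimed to be what the predict series operator computes internally, and the function name and sample data are hypothetical.

```python
from datetime import date


def window_mean_forecast(records, window):
    """Forecast the next value as the mean of the last `window` values.

    `records` is a list of (date, value) pairs in any order. They are sorted
    by date first, because a forecast only makes sense on ordered data.
    """
    if window < 1 or window > len(records):
        raise ValueError("window must be between 1 and the number of records")
    ordered = sorted(records, key=lambda record: record[0])
    recent = [value for _, value in ordered[-window:]]
    return sum(recent) / window


# Hypothetical (date, value) rows, deliberately out of order.
records = [
    (date(2014, 2, 3), 100.0),
    (date(2014, 2, 1), 80.0),
    (date(2014, 2, 2), 90.0),
    (date(2014, 2, 4), 130.0),
]

print(window_mean_forecast(records, window=2))  # uses the two latest values
print(window_mean_forecast(records, window=4))  # uses all four values
```

    Changing `window` changes the forecast, which is the same kind of sensitivity to the window value described above.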

     

     

    Brian T.
  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Hi @mallsunita13 and @Telcontar120,

     

    So a couple of things. The "encode series by" (attributes/examples) setting is a pretty critical one, but it's simple to understand: all you need to know is how your time series is represented in your data. If your time series data is entered line by line (by examples), then you encode by examples. If your time series data is one long row (a single example) with hundreds of columns (attributes), then you want to encode by attributes.
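    The difference between the two representations can be sketched in plain Python (not RapidMiner code). The function names are hypothetical, and the error text only imitates the message quoted earlier in the thread. Under the assumption that a one-column table is read as rows with a single attribute each, the sketch also suggests why a window size of 10 would be rejected in the by-attribute case.

```python
def windows_by_examples(column, window_size):
    """Series stored as one value per row: slide the window down the rows."""
    return [column[i:i + window_size]
            for i in range(len(column) - window_size + 1)]


def windows_by_attributes(row, window_size):
    """Series stored as one row with many columns: slide across the columns."""
    if window_size > len(row):
        raise ValueError(
            f"window size {window_size} exceeds the number of attributes ({len(row)})"
        )
    return [row[i:i + window_size]
            for i in range(len(row) - window_size + 1)]


# Hypothetical email sizes, one value per row (the by-examples layout).
total_bytes = [120, 135, 128, 150, 160]
print(windows_by_examples(total_bytes, 3))

# Read by attributes, each row of that table has only one column.
single_row = total_bytes[:1]
try:
    windows_by_attributes(single_row, 3)
except ValueError as err:
    print(err)
```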

     

    Below are some other posts where I explain the windowing operator and SVM. 

     

    http://community.rapidminer.com/t5/RapidMiner-Studio/Time-Series-using-Windowing-operator-in-RapidMiner/m-p/31791

    http://community.rapidminer.com/t5/RapidMiner-Studio/Financial-Time-Series-Prediction/m-p/33456

     

    Good luck!

     

     

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    Thanks for the explanation @Thomas_Ott.

     

    Evidently, the predict series operator doesn't require this series transformation beforehand, since the example process I provided produces a forecast even though I had not applied "encode series by examples". Is that because it performs that transformation automatically when there is only a single attribute in the dataset?

     

     

    Brian T.
  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    I rarely use the Predict Series operator. Instead I just default to the Sliding Window Xval operator.
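    As a rough illustration of the idea behind sliding window validation, the Python sketch below generates train/test index ranges that always move forward in time. The parameter names (`train_size`, `test_size`, `step`) are hypothetical and are not taken from the RapidMiner operator.

```python
def sliding_window_splits(n_examples, train_size, test_size, step):
    """Yield (train_indices, test_indices) pairs that move forward in time.

    Every test window starts right after its training window, so a model is
    always evaluated on data that comes later than the data it was trained on.
    """
    start = 0
    while start + train_size + test_size <= n_examples:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        yield train, test
        start += step


for train, test in sliding_window_splits(n_examples=10, train_size=4,
                                         test_size=2, step=2):
    print("train", train, "test", test)
```

    Unlike a shuffled cross-validation, no fold here trains on observations that occur after its test window.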

  • mallsunita13 Member Posts: 4 Contributor I

    Dear Mr. Brian T and Thomas Ott,

     

    Thank you very much for your help. This information is very helpful for completing my assignment.

     

     

    Thanks

     

    Sunita

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    Glad it was helpful!  You may also be interested in using the "fit prediction" operator once you have converted your series to an exampleset.  This allows you to specify whatever modeling algorithm is suitable in the inner operator, such as linear regression or more complex functions like SVM or neural nets as well.
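    The "plug any learner into the inner operator" idea can be sketched in plain Python as a function that accepts a learner as an argument. This is not the fit prediction operator itself. All names are hypothetical, and the two simple learners stand in for heavier models such as an SVM, which would be passed in the same way.

```python
def mean_learner(X, y):
    """Baseline: always predict the average training label."""
    average = sum(y) / len(y)
    return lambda features: average


def last_value_linear_learner(X, y):
    """Least-squares line fitted on the most recent value of each window."""
    xs = [features[-1] for features in X]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(y) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        slope = 0.0  # constant input: fall back to predicting the mean label
    else:
        slope = sum((x - mean_x) * (t - mean_y) for x, t in zip(xs, y)) / var_x
    intercept = mean_y - slope * mean_x
    return lambda features: intercept + slope * features[-1]


def fit_and_forecast(series, window_size, learner):
    """Window the series, train `learner` on it, and predict the next value."""
    if len(series) <= window_size:
        raise ValueError("series must be longer than the window size")
    n_rows = len(series) - window_size
    X = [series[i:i + window_size] for i in range(n_rows)]
    y = [series[i + window_size] for i in range(n_rows)]
    model = learner(X, y)
    return model(series[-window_size:])


# Hypothetical, perfectly linear series (illustrative values only).
series = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
print(fit_and_forecast(series, 2, mean_learner))
print(fit_and_forecast(series, 2, last_value_linear_learner))
```

    Swapping the learner changes the forecast without touching the windowing step, which is the design point being made above.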

    Brian T.