Stock market price forecast

nazaninnazanin Member Posts: 6 Newbie
edited June 2019 in Help
I just joined the community.
I am a beginner with RapidMiner.
I have stock data for an index.
I want to predict the closing price of that index for the next few days,
but I do not know which operators to use.
Do I need to normalize the data?
Please help me.
Yours sincerely



  • hughesfleming68hughesfleming68 Member Posts: 323 Unicorn
    edited April 2019
    It depends, but in most cases yes, you will need to normalize the data. There is a normalize operator and a normalize series operator in RapidMiner. You should also consider detrending with a moving average, using the formula close - ma(close). First-order differencing is another way. Detrending time series data is more complicated than it first appears. You will have to experiment to see what works best for your model.
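
    To make the two detrending options concrete, here is a minimal sketch in plain Python (outside RapidMiner). The function names, the window length of 3 and the price list are made up for illustration; this is not how the RapidMiner operators are implemented.

```python
from statistics import mean, pstdev


def moving_average(values, window):
    """Trailing simple moving average; None until the window is full."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(mean(values[i + 1 - window:i + 1]))
    return out


def detrend_with_ma(close, window):
    """close - ma(close), skipping the warm-up period of the average."""
    ma = moving_average(close, window)
    return [c - m for c, m in zip(close, ma) if m is not None]


def first_difference(close):
    """First-order differencing: today's close minus yesterday's close."""
    return [b - a for a, b in zip(close, close[1:])]


def z_score(values):
    """Normalize to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Made-up closing prices, for illustration only.
close = [100.0, 101.0, 103.0, 102.0, 105.0, 107.0, 106.0, 110.0]
detrended = detrend_with_ma(close, window=3)
differenced = first_difference(close)
normalized = z_score(detrended)
```

    One caveat: in a real process the mean and standard deviation used for normalizing should be computed on the training period only, otherwise information from the future leaks into the model.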

    Forecasting stock prices is a bit of a black art with a generally poor win rate. If you are starting out, I would use a data set that is much more regular. The problem is noise, which is random.
  • nazaninnazanin Member Posts: 6 Newbie
    Thank you so much
    I did not understand this sentence and the formula it mentions: "You should also consider detrending using a moving average using the formula close - ma(close)."
    Can you explain more? Is there a typical process for normalizing and forecasting stock prices? Thanks again <3
  • hughesfleming68hughesfleming68 Member Posts: 323 Unicorn
    edited April 2019
    There is nothing typical about forecasting stock prices... but if you must try, then taking a look at Thomas Ott's YouTube videos on time series forecasting will get you going. I suggest those because he goes into detail about windowing and validation in a way that is understandable for people new to RapidMiner.

    I think you should have a quick think about what it is that you are trying to predict. If it is direction, then you can look at the problem from a classification perspective. If it is an actual value, then it is a regression problem. A value may be an actual price or the percentage change. Take a simple one-step-ahead prediction: what does that mean exactly? It could be a prediction from the next day's open to the next day's close, from the open to the following day's open, from today's close to tomorrow's close, or to next week's or next month's close. This may sound subtle, but it will have a dramatic effect on your prediction, so choosing what to predict is fundamental.
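
    As a rough illustration of how different these targets are, here is a small plain-Python sketch that builds three of them from the same prices. The function names and the numbers are invented for the example.

```python
def direction_labels(close, horizon=1):
    """Classification target: 'up' if the close is higher `horizon` days later."""
    return ["up" if close[i + horizon] > close[i] else "down"
            for i in range(len(close) - horizon)]


def pct_change_targets(close, horizon=1):
    """Regression target: percentage change of the close over `horizon` days."""
    return [(close[i + horizon] - close[i]) / close[i] * 100.0
            for i in range(len(close) - horizon)]


def open_to_close_targets(opens, closes):
    """Regression target: move from the next day's open to the next day's close."""
    return [closes[i + 1] - opens[i + 1] for i in range(len(closes) - 1)]


# Made-up daily data, for illustration only.
opens = [100.0, 102.0, 101.0, 104.0]
closes = [101.0, 103.0, 100.0, 106.0]

labels = direction_labels(closes)                   # close-to-close direction
two_day = pct_change_targets(closes, horizon=2)     # two-day percentage change
intraday = open_to_close_targets(opens, closes)     # next open to next close
```

    The same four days of data give three different label columns, so the learner is being asked three different questions.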

    You can generate attributes in RapidMiner. These are features that provide "information" to your learning operators. You can set up your process as a univariate or multivariate problem; multivariate simply means using several features to guide your prediction. Creating a moving average or adding a volatility measurement would be adding features to your process.
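
    A sketch of what such generated features could look like, written in plain Python rather than as a RapidMiner process. The feature names, the window of 3 days, and the choice of "standard deviation of recent returns" as the volatility measure are assumptions for the example.

```python
from statistics import mean, stdev


def returns(close):
    """Simple daily returns; returns(close)[i] is the move from day i to day i + 1."""
    return [(b - a) / a for a, b in zip(close, close[1:])]


def build_features(close, window):
    """One feature row per day t, using only data up to and including day t.

    Needs window >= 2 so the standard deviation is defined.
    """
    rets = returns(close)
    rows = []
    for t in range(window, len(close)):
        recent_close = close[t - window + 1:t + 1]   # last `window` closes
        recent_rets = rets[t - window:t]             # last `window` returns
        ma = mean(recent_close)
        rows.append({
            "close": close[t],
            "ma": ma,
            "close_minus_ma": close[t] - ma,
            "volatility": stdev(recent_rets),
        })
    return rows


# Made-up closing prices, for illustration only.
close = [100.0, 102.0, 101.0, 105.0, 104.0]
rows = build_features(close, window=3)
```

    Using only the close would be the univariate setup; feeding all four columns to a learner is the multivariate one.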

    Lastly, you need to think about how much data to use. Does using daily data going back to 1990 help you to predict tomorrow's price? Probably not. In finance, more data might not be better if the data is no longer relevant. Even more problematic, high accuracy, even if you do find a way to achieve it, does not always equal high profitability. It could be that your high accuracy is still due to chance. There are many traps to fall into.
  • nazaninnazanin Member Posts: 6 Newbie
    edited April 2019
    Hello my friend
    Thank you very much for your help <3
    it was kind of you
    Sorry to take up your time again.
    My data is for one specific index, and I have data for the last 6 months.
    I want to predict its opening or closing price for the next week, as well as for the next month.
    My problem is that I do not know which attribute I should work on and which operators I need. I know classification and clustering in
    RapidMiner, but I do not know things like regression and time series.
    And I do not know how to predict the price.
    I have no experience in this field, and I am quite a beginner in this.
    I'm completely confused.
    I also did not find an example in RapidMiner.
    May I ask you to give me an example? (I know I am asking a lot from the community, but there is really no one else to help me. Sorry again.)
    thanks again
    best regards

  • hughesfleming68hughesfleming68 Member Posts: 323 Unicorn
    edited April 2019
    Take a look at the videos on time series. Unfortunately, there are no shortcuts. You have to go through and do the tutorials and experiment on your own. If you get stuck, post your process and someone will help you. Those videos explain the basics. There are other time series examples you can study in the samples section but you need to know the terminology. Everyone finds starting challenging.

    Are you @student_compute? You write the same way and have similar problems. Here is a whole thread full of people trying to help you on this already.
  • nazaninnazanin Member Posts: 6 Newbie
    thanks again
    Yes, Sure
    I will do my best
    I'm trying to create a process, and I will share it with the community to fix its possible bugs.
    best regards

  • nazaninnazanin Member Posts: 6 Newbie
    I spent all of yesterday studying time series.
    This is what I learned, if I understand correctly:
    A time series has four types of components, including:
        Irregular remainder
    And that
    the ARIMA model is used to forecast time series,
    and is written ARIMA(p, d, q).
    There are two criteria, BIC and AIC, for evaluating the model, and they should be as small as possible.
    The mean and the variance of the time series must be constant.
    Is this true?

    Now my questions are:
    In RapidMiner,
    for stock data:
    1. How should I identify the type of time series in the chart of a stock index?
    2. How should I identify the appropriate parameter values in ARIMA(p, d, q)?
    3. How do I check that the mean and variance of the time series are constant?

    Please guide me.

  • hughesfleming68hughesfleming68 Member Posts: 323 Unicorn
    edited April 2019
    Hello nazanin. It is good that you are studying ARIMA, and there is a lot of information on the net. Have you decided what you are trying to forecast? Consider this: you have six months of data. That is roughly 121 trading days, taking out the weekends. Is ARIMA the right tool for the job? How much seasonality do you have in 121 days? Why have you chosen ARIMA over a Multi-layer Perceptron, for example? You need to understand your data before you start and ask yourself these questions.

    Back to ARIMA. You should study the example in the samples section of RapidMiner, specifically "Example Analysis of Lake Huron", and see for yourself what effect adjusting p, d, q has on your forecast. You should also prepare a process with windowing. Look at "Create Model for Gas Prices". These two examples will cover the basics.
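
    For readers who have not met windowing before, the idea can be sketched in a few lines of plain Python. This is only an illustration of the concept, not the RapidMiner operator; the function names, the window of 3 and the 70/30 split are assumptions.

```python
def make_windows(series, window, horizon=1):
    """Turn a series into (inputs, label) pairs.

    Each example uses `window` consecutive past values as inputs and the
    value `horizon` steps after the window as the label.
    """
    examples = []
    for start in range(len(series) - window - horizon + 1):
        inputs = series[start:start + window]
        label = series[start + window + horizon - 1]
        examples.append((inputs, label))
    return examples


def chronological_split(examples, train_fraction=0.7):
    """Split without shuffling so the test examples come after the training ones."""
    cut = int(len(examples) * train_fraction)
    return examples[:cut], examples[cut:]


# A toy series 1..10 so the windows are easy to read.
series = list(range(1, 11))
examples = make_windows(series, window=3, horizon=1)
train, test = chronological_split(examples)
```

    The split is chronological on purpose: shuffling before splitting would let the learner train on days that come after the ones it is tested on.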

    As for the other three questions.....that is your homework Nazanin. No short cuts.  :)

    Here is a tutorial for R that you might want to look at. It goes through the steps quite clearly. The steps are the same regardless of what software you use.

    Keep in mind that the real world is much more complex. As a learning exercise this is fine. Just don't expect to be able to forecast with any kind of accuracy with this approach.
  • sgenzersgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager
    I think I am going to steal this @hughesfleming68

    As for the other three questions.....that is your homework. No short cuts.  
  • nazaninnazanin Member Posts: 6 Newbie
    Thank you for your answer <3
    I will do my best
    But given what you have described,
    what is your recommendation for predicting stock prices?
    What is your opinion? (Because you know all the available algorithms, I am asking you which algorithm is most appropriate for me to look up and study.)

  • hughesfleming68hughesfleming68 Member Posts: 323 Unicorn
    edited April 2019
    Hi Nazanin,

    I think you should do the ARIMA example and also the windowing example and compare the predictions. With the windowing example you can see what Gradient Boosted Trees does and then try different learning operators. Try different methods of detrending. The best learner for the job is very data dependent, so there is no single best learner. You need to discover what works best for your data. It also depends on what you want to predict. You also need to think about your class balance, that is, whether you have more up days than down days. You need to think about serial correlation, that is, whether or not you have many up days or down days in a row.
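
    Both checks are quick to compute. Here is a hedged plain-Python sketch; the function names and the prices are made up, and lag-1 autocorrelation of the daily changes is just one simple way to look at serial correlation.

```python
from statistics import mean


def up_day_fraction(close):
    """Share of days on which the close was higher than the day before."""
    changes = [b - a for a, b in zip(close, close[1:])]
    ups = sum(1 for c in changes if c > 0)
    return ups / len(changes)


def lag1_autocorrelation(values):
    """Correlation between each value and the one before it.

    Positive: moves tend to continue. Negative: moves tend to reverse.
    """
    mu = mean(values)
    num = sum((a - mu) * (b - mu) for a, b in zip(values, values[1:]))
    den = sum((v - mu) ** 2 for v in values)
    return num / den if den else 0.0


# Made-up closing prices, for illustration only.
close = [100.0, 101.0, 102.0, 101.0, 103.0, 104.0, 103.0, 105.0]
changes = [b - a for a, b in zip(close, close[1:])]

balance = up_day_fraction(close)
serial = lag1_autocorrelation(changes)
```

    If the up-day fraction is far from 0.5, a classifier can look accurate simply by always predicting the more common class.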

    Here is the primary problem with stock prices: they have a low signal-to-noise ratio. If you are trying to predict the next day's closing price, it is very common that your learner will decide that the best forecast of tomorrow's price is today's price. So if today was up then tomorrow will be up, and vice versa. In the end you are not predicting anything. It becomes an exercise in overfitting. So you think: I am not going to overfit. What you end up with is a complicated way of doing something simple like a moving average.

    More sophisticated models will try to extract the temporal/cyclical information in the time series and make a projection to determine the directional bias. You would also need to learn how to make multi-day predictions, which adds another layer of complication. Stock prices are a poor choice of data set for beginners.

    Try your best to duplicate the examples and look at the videos but don't be disappointed if your accuracy is near random. That would be the expected result. Use a default model as your baseline and compare your results.
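
    To show what comparing against a baseline can look like, here is a minimal plain-Python sketch of two naive baselines: "tomorrow's close equals today's close" for regression, and "always predict the more common direction" for classification. The function names and the prices are invented for the example.

```python
def persistence_forecast(close):
    """Forecast for each day from the second one on: the previous day's close."""
    return close[:-1]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def majority_direction_accuracy(close):
    """Accuracy of always predicting the more common direction (up or not up)."""
    changes = [b - a for a, b in zip(close, close[1:])]
    ups = sum(1 for c in changes if c > 0)
    return max(ups, len(changes) - ups) / len(changes)


# Made-up closing prices, for illustration only.
close = [100.0, 101.0, 103.0, 102.0, 105.0]

actual = close[1:]
baseline_mae = mean_absolute_error(actual, persistence_forecast(close))
baseline_accuracy = majority_direction_accuracy(close)
```

    A model is only worth keeping if it beats both of these numbers on data it has not seen during training.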