I'm looking at the p and q values in the Optimisation Grid of your APPL ARIMA process, but I don't see any p or q setting equal to 0 that would cause the "0 is not allowed" warnings. The minimums are set to start from 1, not 0. What am I meant to change, and to what values, please?
Cheers,
Anyway, I changed 1 to 0 for p and q, but it made no difference: I still got the same warning.
Yes, I confirm that RM raises an error if p = 0 AND q = 0. (I used the "Automized ARIMA on US - Consumption data" process from the time series templates.)
There is an easy workaround:
In the Optimize Parameters operator, set the parameter error handling = ignore error
hope this helps,
Regards,
Lionel
PS : if this solution does not fix your error, please share your process and your data.
EDIT :
for your use case, I think you can only run your parameter search with these values:
p = 1,2,3,4,5
d = 0,1,2
q = 0,1,2,3,4,5
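To make the idea concrete outside RapidMiner, here is a minimal Python sketch of a grid search over the ranges above that skips failing combinations, which is roughly what "ignore error" does. The function `fit_and_score` is a hypothetical stand-in, not a RapidMiner or library call: it only mimics the p = 0 and q = 0 error and returns a dummy number instead of a real model error.

```python
from itertools import product

# Search ranges suggested in the post above.
P_VALUES = [1, 2, 3, 4, 5]
D_VALUES = [0, 1, 2]
Q_VALUES = [0, 1, 2, 3, 4, 5]


def fit_and_score(p, d, q):
    """Hypothetical stand-in for fitting ARIMA(p, d, q) and scoring it.

    It raises for p = 0 and q = 0 to mimic the error described in this thread.
    """
    if p == 0 and q == 0:
        raise ValueError("p = 0 and q = 0 is not allowed")
    return p + d + q  # dummy score, NOT a real forecast error


def grid_search(p_values, d_values, q_values):
    """Try every (p, d, q) combination and skip the ones that raise."""
    results = {}
    skipped = []
    for p, d, q in product(p_values, d_values, q_values):
        try:
            results[(p, d, q)] = fit_and_score(p, d, q)
        except ValueError:
            # Equivalent in spirit to "error handling = ignore error".
            skipped.append((p, d, q))
    return results, skipped


results, skipped = grid_search(P_VALUES, D_VALUES, Q_VALUES)
print(len(results), "combinations evaluated,", len(skipped), "skipped")
```

Because p starts at 1 in these ranges, the invalid p = 0, q = 0 combination never occurs, so nothing is skipped; the try/except only matters if the p range is extended down to 0.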
Thanks very much. I set "error handling" to "ignore" and this eliminated the p,d,q warning, but the process stops and my MacBook shows the coloured wheel at 84% every time I run it (it's a large Dow Jones data file with 5000 rows plus many technical indicators). I don't get any error message, and the only way to resolve the issue is to force quit RM.
Thanks. In this particular run I am using daily data from 2000 to 2020 (which includes the 2003, 2007 and 2020 crashes). I have a 20-day window size and a 5-day horizon.
Is it the optimisation process that is causing the freeze? ARIMA works with this data set when I'm not optimising.
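A rough back-of-the-envelope count may help explain why the optimisation is so much heavier than a single ARIMA run. The sketch below assumes the optimisation wraps a sliding-window validation with a step size of 1 and uses the 5 x 3 x 6 grid suggested earlier in the thread; the actual process settings are not shown here and may differ. `count_windows` is a hypothetical helper written for this illustration.

```python
def count_windows(n_rows, window_size, horizon, step=1):
    """Number of (training window, horizon) pairs that fit into n_rows."""
    usable = n_rows - window_size - horizon
    if usable < 0:
        return 0
    return usable // step + 1


N_ROWS = 5000      # rows mentioned in the thread
WINDOW_SIZE = 20   # 20-day window
HORIZON = 5        # 5-day horizon

n_windows = count_windows(N_ROWS, WINDOW_SIZE, HORIZON)

# Assumed grid: p in 1..5, d in 0..2, q in 0..5 (from the earlier reply).
n_combinations = 5 * 3 * 6

total_fits = n_windows * n_combinations
print(n_windows, "windows x", n_combinations, "combinations =", total_fits, "ARIMA fits")
```

Under these assumptions the process would fit several hundred thousand ARIMA models, so a larger step size, a smaller grid, or fewer rows would reduce the work roughly in proportion.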
In my opinion it is too much data. There is very little signal in price data alone. It is better to think about the problem along the lines of what is driving price now. Certainly nothing that happened in 2008.
Perhaps it is better to not think about predicting price but just positive or negative bias over the period you are interested in. You might want to look at this as a classification problem.
I think prediction has a place in algorithmic trading, but only over very short time frames. Certainly not days, and not all the time. The signal-to-noise ratio is very variable, and a lot of the time your prediction will be random. You will have to be ready to be wrong a lot.
I also think that ARIMA is the wrong tool for this particular job. Price action is too irregular.
I take your point about the amount of data. I've seen people discuss this issue and read research papers where a couple of years of data (and more) was used. I've not heard of anyone using as little as a few days.
Re: ARIMA, Fabian's video "Elaborate Your Time Series" gave me the impression that it was possible to get good predictions, which encouraged me: https://www.youtube.com/watch?v=Hvdh8ItfiGA&ab_channel=RapidMiner%2CInc.
I'm also working with a Random Forest process, but I just can't get it to predict any point beyond my data set using the Apply Model operator that Martin had suggested along with the Lag operator.
I'd be interested to know which algorithms you found worked best for this type of task, and whether you aimed at predictions based on classification or on actual price targeting.
Many thanks,
I think you need to review which machine learning operators can extrapolate. Random Forest won't be able to do that. I would also be cautious with academic papers. I have looked at many that were flawed in one way or another, or whose results were not reproducible. In the end, you are going to have to test everything yourself. Keep in mind that even if you solve your prediction problems, you still need to transfer your prediction to the real world. Your prediction may look good after validation but still lose money when you try to implement it. There are two parts to this problem: the prediction and the strategy to execute it. Don't underestimate how difficult the second part is.
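The extrapolation point can be illustrated with a toy example. The sketch below is not RapidMiner's Random Forest; it is a tiny hand-written 1-D regression tree (all function names are made up for this illustration) trained on a simple rising line. Because each leaf stores the mean of its training targets, a query far to the right of the training data gets the same answer as the last training point.

```python
def _sse(values):
    """Sum of squared errors around the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def _build(pairs, min_leaf):
    """Recursively split sorted (x, y) pairs where the SSE is lowest."""
    ys = [y for _, y in pairs]
    if len(pairs) < 2 * min_leaf:
        return {"value": sum(ys) / len(ys)}
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        cost = _sse(ys[:i]) + _sse(ys[i:])
        if best is None or cost < best[0]:
            best = (cost, i)
    if best is None:
        return {"value": sum(ys) / len(ys)}
    i = best[1]
    return {
        "threshold": (pairs[i - 1][0] + pairs[i][0]) / 2,
        "left": _build(pairs[:i], min_leaf),
        "right": _build(pairs[i:], min_leaf),
    }


def fit_tree(xs, ys, min_leaf=2):
    """Fit a tiny 1-D regression tree; leaves hold the mean target."""
    return _build(sorted(zip(xs, ys)), min_leaf)


def predict(tree, x):
    """Walk down the tree until a leaf is reached."""
    while "value" not in tree:
        tree = tree["left"] if x <= tree["threshold"] else tree["right"]
    return tree["value"]


# Toy training data: a straight rising line, y = 2x for x = 0..99.
train_x = list(range(100))
train_y = [2.0 * x for x in train_x]
tree = fit_tree(train_x, train_y)

inside = predict(tree, 99)     # last training point
outside = predict(tree, 1000)  # far beyond the training range
print("prediction at x=99:", inside)
print("prediction at x=1000:", outside, "(true value would be 2000)")
```

A random forest regressor averages many trees of this kind, so its output is likewise bounded by the target values seen in training. That is one reason a trending series is often differenced or converted to returns before a tree-based model is applied.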
I still think classification is a better approach. This can be a pure classification task, where you label your data and then predict the class, or you can take a regression and turn it into a classification problem just by averaging the slope.
Your attributes will determine whether your model has any chance of predicting anything. If you select attributes that are basically random, then your prediction will be random as well. This goes for almost all, if not all, technical indicators that are commonly found in trading software. There is no edge there, and very little from price momentum in general. If it were that easy to use a few indicators as attributes, plug in a random forest and all of a sudden make money, then everyone would do it. You will have to dig deeper to find your edge.
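One possible reading of "averaging the slope" is sketched below: for each day, take the average daily change over the next few days and label it by its sign. This is only an interpretation, not necessarily the exact method meant above; the function name, the 5-day horizon and the toy prices are all assumptions made for illustration.

```python
def label_by_average_slope(prices, horizon=5):
    """Label each day by the sign of the average daily change over the horizon.

    The average of the daily changes from day t to day t + horizon equals
    (prices[t + horizon] - prices[t]) / horizon.
    """
    labels = []
    for t in range(len(prices) - horizon):
        avg_slope = (prices[t + horizon] - prices[t]) / horizon
        if avg_slope > 0:
            labels.append("up")
        elif avg_slope < 0:
            labels.append("down")
        else:
            labels.append("flat")
    return labels


# Made-up toy prices, purely illustrative (not real market data).
toy_prices = [100, 101, 102, 101, 100, 99, 98, 99, 100, 101, 102]
labels = label_by_average_slope(toy_prices, horizon=5)
print(labels)
```

The last `horizon` days get no label because their future prices are not yet known, so those rows would be dropped before training a classifier.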
Random Forest can work well with the right data set, but my preference will always be to go directly to neural nets for this kind of job. They can range from extremely simple to extremely complex, and getting results from them is not plug and play.
You should read three books:
1. Advances in Financial Machine Learning by Marcos Lopez de Prado
2. Machine Learning for Algorithmic Trading by Stefan Jansen
3. Any decent book on probability
There are many things you can achieve with machine learning and finance. The problem is that price prediction is at the bottom of the list and might not even be necessary. Trading is a reactionary business where regimes change constantly. Look to build a process that helps you identify what is happening at any one time. You will have better results in the long run.
regards,
Alex