Cannot execute log reg calibration learning: Error while training the H2O model: Illegal argument(s)
SabaMomeniKho
Member Posts: 5 Learner III
Hello,
I'm using Auto Model in RapidMiner 9.5 on a crash dataset. The task is prediction and the "class" column is the target. I chose Decision Tree, Naive Bayes, Gradient Boosted Trees, Random Forest, SVM and Deep Learning. After running, the process only shows results for Naive Bayes and Decision Tree; the others fail with the error below:
Cannot execute log reg calibration learning: Error while training the
H2O model: Illegal argument(s) for GLM model: ERRR on field: _response:
Response cannot be constant.
As I'm new to this software and need to use it for my MSc thesis, I really need help with this problem. I have also attached my data in case you need to see it.
Thank you.
Best Answers
lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
Hi @SabaMomeniKho,
It's a known issue that the RM staff is aware of. It happens because your label has (very) small minority classes.
There are 2 workarounds:
- First, try to group your 2 smallest minority classes ("majorinjury" and "fatal") into a single class (called, for example, "other injuries"). You can do that with the Replace Rare Values operator, which is part of the Toolbox extension (to install from the Marketplace).
- If that does not work, filter these minority classes out of your dataset.
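To illustrate the idea behind the first workaround outside RapidMiner, here is a minimal Python sketch. It is only an illustration: the label counts, the `min_count` threshold and the `group_rare_classes` helper are assumptions made up for the example, not the actual behaviour of the Replace Rare Values operator.

```python
from collections import Counter

def group_rare_classes(labels, min_count, replacement="other injuries"):
    """Replace every class seen fewer than min_count times with one shared name."""
    counts = Counter(labels)
    return [lab if counts[lab] >= min_count else replacement for lab in labels]

# Hypothetical label column; the real class counts are in the attached data.
labels = ["pdo"] * 90 + ["minorinjury"] * 7 + ["majorinjury"] * 2 + ["fatal"] * 1

grouped = group_rare_classes(labels, min_count=5)
print(Counter(grouped))
```

After grouping, the two rarest classes share one label, so each remaining class has more examples to learn from.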
Hope this helps,
Regards,
Lionel
lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
@SabaMomeniKho,
OK, I understand.
Yes, good idea: you can apply these predictors separately in the design view.
With your highly imbalanced dataset, I think you can present 2 strategies:
1. No data preprocessing:
Please look at Process_1.rmp in the attached file and its results.
Since you have very few examples of your minority classes (minorinjury, majorinjury, fatal), without data preprocessing the algorithm(s) used have difficulty establishing / "capturing" the relationships between your regular attributes and these minority classes of your label. As a result, you do get a relatively good accuracy, because your algorithm(s) predict (almost) only the majority class (in your case "pdo"). The downside of this strategy is that the recall of your minority classes is extremely bad (0 or very close to 0); that is to say, the capacity of your model to correctly predict the minority classes is very poor.
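The gap between accuracy and class recall can be shown with a small Python sketch. The class counts below are invented for illustration (they are not the counts from the attached data), and the "model" is simply assumed to predict "pdo" for every row.

```python
def accuracy(y_true, y_pred):
    """Share of rows where the prediction equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def class_recall(y_true, y_pred, cls):
    """Share of the rows that truly belong to cls that were predicted as cls."""
    preds_for_cls = [p for t, p in zip(y_true, y_pred) if t == cls]
    if not preds_for_cls:
        return float("nan")  # recall is undefined when the class never occurs
    return sum(p == cls for p in preds_for_cls) / len(preds_for_cls)

# Hypothetical, highly imbalanced label column.
y_true = ["pdo"] * 95 + ["minorinjury"] * 3 + ["majorinjury"] * 1 + ["fatal"] * 1
# Assumed model that only ever predicts the majority class.
y_pred = ["pdo"] * len(y_true)

print("accuracy:", accuracy(y_true, y_pred))
print("recall pdo:", class_recall(y_true, y_pred, "pdo"))
print("recall fatal:", class_recall(y_true, y_pred, "fatal"))
```

With these made-up numbers the accuracy looks good while the recall of "fatal" is 0, which is the pattern described above.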
2. Data preprocessing:
Please look at Process_2.rmp in the attached file and its results.
If your priority is to correctly predict one of your 3 minority classes (contributing to better road safety is a noble task, congratulations!), you have to upsample the minority class you want to predict, meaning that you have to "artificially increase" the number of observations of this minority class. For that you can use the SMOTE Upsampling operator (part of the Toolbox extension, to install from the Marketplace). In the parameters of this operator, uncheck auto detect minority class and set the name of the minority class you want to predict, for example "fatal".
As a result, the class recall of the studied minority class is significantly higher than in the first strategy, meaning that your model is now able to correctly predict one of your minority classes (for example "fatal"). The downside of this strategy is that your overall accuracy will decrease.
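To give an idea of what "artificially increasing" a minority class means, here is a simplified SMOTE-style sketch in Python. It is not the implementation of the SMOTE Upsampling operator: the `smote_like_upsample` helper, the two numeric attributes and the values of `k` and `n_new` are all assumptions chosen for the example.

```python
import random

def smote_like_upsample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic rows by interpolating between a minority row
    and one of its k nearest minority neighbours (needs at least 2 rows)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Other minority rows, sorted by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment between base and neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical numeric attributes of the few "fatal" rows.
fatal_rows = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]

new_rows = smote_like_upsample(fatal_rows, n_new=6)
print(len(fatal_rows) + len(new_rows), "fatal rows after upsampling")
```

Each synthetic row lies on the line between two real minority rows, so the class gets more observations without simply duplicating existing ones.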
Next steps:
To enhance the performance of your model(s), you can introduce the concepts of:
- Parameter optimization (via the Optimize Parameters (Grid) operator)
- Feature selection (via the Automatic Feature Engineering / Apply Feature Set operators)
To help you with these concepts, you can go to the RapidMiner Academy, where there are plenty of educational videos:
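As a rough idea of what a grid-based parameter optimization does, here is a minimal Python sketch. The parameter names, the grid values and the `toy_score` function are hypothetical stand-ins; in a real process the score would come from training and validating a model (for example, the recall of the "fatal" class), not from a formula.

```python
from itertools import product

def grid_search(score, grid):
    """Try every parameter combination and return the best one with its score."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(**params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score

# Hypothetical stand-in for "train a model with these parameters and validate it".
def toy_score(max_depth, min_leaf_size):
    return -(max_depth - 6) ** 2 - (min_leaf_size - 2) ** 2

grid = {"max_depth": [2, 4, 6, 8], "min_leaf_size": [1, 2, 4]}
best_params, best_score = grid_search(toy_score, grid)
print(best_params, best_score)
```

The grid here has 4 x 3 = 12 combinations; the number of runs grows quickly as parameters are added, so it pays to keep the grid small.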
https://academy.rapidminer.com/
Don't hesitate to come back if you have other questions during your thesis...
Regards,
Lionel
PS: Out of curiosity, what does "pdo" (the majority class of your label) mean? Thank you...