"Logistic Regression: Missing cut-off and intercept settings?"
Hi all
I am posting this without an example because it holds for any application of the myKLR-based Logistic Regression operator, in my opinion.
I am aware that kernel logistic regression does not follow the usual maximum likelihood training approach of PASW, SAS, R, etc. Nevertheless, there appears to be a convention of what reviewers in many academic journals expect to see when Logistic Regression models are reported, and I am wondering how to obtain these results in RM. I am not referring to any goodness-of-fit / pseudo R-squared values, as I know these are not implemented on purpose. What I am interested in is:
- the cut-off value, which sets a threshold on confidence values to obtain class labels. I believe a range of such thresholds is used to obtain the ROC curve in the performance operator, but I do not see how to set / read the one applied to the predictions.
- the intercept, or constant in the logistic regression equation. In my opinion, the intercept is a crucial parameter which largely determines the sample selection bias of a logistic regression classifier. For some applications it has to be fixed to zero.
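To make the two quantities concrete, here is a minimal sketch in plain Python (not RapidMiner code), assuming the standard linear logistic model rather than the kernel variant used by myKLR. The function names, weights and example values are hypothetical and for illustration only.

```python
import math

def confidence(x, weights, intercept=0.0):
    """Confidence for the positive class under the standard logistic model."""
    # Linear predictor: intercept (constant) plus weighted sum of the attributes.
    z = intercept + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, weights, intercept=0.0, cutoff=0.5):
    """Turn the confidence into a class label by applying the cut-off."""
    return 1 if confidence(x, weights, intercept) >= cutoff else 0

# Hypothetical weights and example row, for illustration only.
weights = [0.8, -0.4]
x = [1.0, 2.0]

print(confidence(x, weights))                        # intercept fixed to zero
print(predict(x, weights, intercept=-1.0))           # default cut-off of 0.5
print(predict(x, weights, intercept=-1.0, cutoff=0.25))
```

With the intercept fixed to zero the linear predictor for this row is zero, so the confidence sits exactly on the default cut-off. Shifting the intercept or the cut-off changes the predicted label without touching the weights, which is why both values matter when a model is reported.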
Furthermore, many reviewers demand a forward / backward variable selection based on significance (p-values) and NOT based on overall model performance, and it would be helpful to know whether p-values (or a substitute) are obtainable in RM.
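For reference, the kind of significance-based selection meant here can be sketched as follows. This is plain Python, not a RapidMiner feature, and it assumes a maximum likelihood fit that supplies a coefficient and a standard error per variable (which a kernel logistic regression does not provide directly). It uses the two-sided Wald z-test; the variable names and estimates are hypothetical.

```python
import math

def wald_p_value(coef, std_err):
    """Two-sided p-value of the Wald z-test for H0: coefficient = 0."""
    z = coef / std_err
    # 2 * (1 - Phi(|z|)) written via the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2.0))

def backward_step(estimates, alpha=0.05):
    """Return the variable to drop next, or None if all are significant.

    estimates maps a variable name to (coefficient, standard error).
    """
    p_values = {name: wald_p_value(b, se) for name, (b, se) in estimates.items()}
    worst = max(p_values, key=p_values.get)  # least significant variable
    return worst if p_values[worst] > alpha else None

# Hypothetical estimates, for illustration only.
estimates = {"age": (0.9, 0.2), "income": (0.1, 0.3)}
print(backward_step(estimates))
```

In a full backward elimination the model would be refitted after each removal and the step repeated until it returns None; the sketch shows only the selection criterion for a single step.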
Thanks for any ideas / suggestions!