Logistic Regression
Has anyone got a clue how to run logistic regression on the Titanic dataset? I've tried this literally all day, but I don't think I'm getting the right accuracy, so I must be missing a step. In Set Role my attribute name is Sex, in Split Data my ratio is 0.1 and 0.1 for the two partitions, and I'm getting 64.53% accuracy. The same test run in Orange gave 91.7%.
Screenshot attached.
Answers
Just for clarification: you are trying to predict the attribute Sex from the Titanic data set. You split the data in a 90:10 ratio (train:test), applied logistic regression, and got an accuracy of 64.53 percent on the test data. Am I correct?
I tried something similar to what you did and got 35 percent accuracy. It depends on how your data was split. I assume that you are not changing the settings of Logistic Regression in RapidMiner; my results are with the default settings. I am not sure which settings you were using for logistic regression in Orange. Are the logistic regression settings the same in both tools?
Also, a single random (90:10) split is not recommended for comparing performance, because the train and test data vary each time you run it (my results are an example of this). You should use cross-validation with either 5 or 10 folds to test the performance of an algorithm. The settings should also be the same when you compare different software or algorithms.
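To make the point about split variance concrete, here is a minimal sketch in plain Python (outside RapidMiner, standard library only). Everything in it is an assumption for illustration: the data is synthetic rather than the Titanic set, and `make_data`, `train_logreg`, `holdout_accuracy`, and `cross_val_accuracy` are hypothetical helper names, not RapidMiner or Orange functions. It trains a small logistic regression on several random 90:10 splits and then with 10-fold cross-validation, so you can see how much a single holdout score moves with the seed.

```python
import math
import random

def make_data(n, rng):
    """Synthetic binary data (illustrative only): one informative feature, one noise feature."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x1 = rng.gauss(1.0 if label else -1.0, 1.5)  # informative feature
        x2 = rng.gauss(0.0, 1.0)                     # pure noise
        rows.append(([x1, x2], label))
    return rows

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(model, x):
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train_logreg(rows, epochs=100, lr=0.1):
    """Unregularized logistic regression fitted by batch gradient descent."""
    w, b = [0.0] * len(rows[0][0]), 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for x, y in rows:
            err = predict_proba((w, b), x) - y
            for j, xj in enumerate(x):
                gw[j] += err * xj
            gb += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, gw)]
        b -= lr * gb / len(rows)
    return w, b

def accuracy(model, rows):
    hits = sum(int(predict_proba(model, x) >= 0.5) == y for x, y in rows)
    return hits / len(rows)

def holdout_accuracy(rows, test_ratio, seed):
    """One random train/test split; the score depends on the seed."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    test, train = shuffled[:n_test], shuffled[n_test:]
    return accuracy(train_logreg(train), test)

def cross_val_accuracy(rows, k, seed):
    """k-fold cross-validation: every row is used for testing exactly once."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        scores.append(accuracy(train_logreg(train), folds[i]))
    return scores

data = make_data(300, random.Random(0))
holdouts = [holdout_accuracy(data, 0.1, seed) for seed in range(5)]
cv_scores = cross_val_accuracy(data, 10, seed=0)
cv_mean = sum(cv_scores) / len(cv_scores)

print("90:10 holdout accuracy per seed:", [round(a, 3) for a in holdouts])
print("10-fold CV mean accuracy:", round(cv_mean, 3))
```

With only 10% of the rows held out, each holdout score rests on a small test set, so it can shift noticeably between seeds. The cross-validated mean averages over all folds and is the fairer number to compare between tools, provided both use the same settings.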
Thanks
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing
I suggest that you go through all the parameters of logistic regression and understand their meaning (there are quite a few!). The Help section for the operator explains them quite well.
I have reproduced exactly the same process with the following logistic regression parameters and got 80.92% accuracy 'out of the box', see below.
Otherwise it's hard to tell without knowing your parameter settings (I also have no idea how Orange sets up logistic regression by default).
Vladimir
http://whatthefraud.wtf
Thanks for your input. Initially I didn't make any changes to the regression parameters, but having replicated your parameter settings, the accuracy is now 76.76%.
Note: I'm very new to RapidMiner and data analytics in general, so I'm not too familiar with the parameters and what they should actually be. I'm currently researching this for report purposes. Attached is a screenshot of my current parameters.