
# Logistic Regression

Has anyone got a clue how to run logistic regression on the Titanic dataset? I've tried this literally all day, but I don't think I'm getting the right accuracy, so I must be missing a step. In Set Role my attribute name is Sex, in Split Data my ratios are 0.1 and 0.1 for the two partitions, and I'm getting 64.53% accuracy. The same test run in Orange gave 91.7%.

Screenshot attached.



## Answers

Just for clarification: you are trying to predict the attribute Sex from the Titanic data set. You then split the data in a 90:10 ratio (train:test), applied logistic regression, and got an accuracy of 64.53 percent on the test data. Am I correct?

I tried something similar to what you did and got 35 percent accuracy. It depends on how your data was split. I assume that you are not changing the settings of Logistic Regression in RapidMiner; my results are with the default settings. I am not sure which settings you were using for logistic regression in Orange. Are the logistic regression settings the same in both tools?

Also, a single random (90:10) split is not recommended for comparing performance, because the train and test data vary each time you run it (my results are an example of this). Use cross validation with either 5 or 10 folds to test the performance of an algorithm. The settings should also be the same when you compare different software or algorithms.
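To illustrate the point about split variance, here is a minimal Python sketch. It is not the RapidMiner or Orange process: it uses synthetic one-feature data rather than the Titanic set, a hand-rolled logistic regression, and hypothetical helper names (`make_data`, `fit_logreg`, `holdout_accuracy`, `k_fold_accuracies`). It only shows how repeated 90:10 splits can give different accuracies while k-fold cross validation averages over all the data.

```python
import math
import random

def make_data(n=200, seed=0):
    """Synthetic binary data (an assumption, not Titanic): one noisy feature."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(1.0 if y else -1.0, 1.5)
        data.append((x, y))
    return data

def fit_logreg(train, lr=0.1, epochs=200):
    """Fit a one-feature logistic regression with batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(train)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in train:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (p - y) * x
            gb += p - y
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

def accuracy(model, test):
    """Fraction of rows where the predicted class matches the label."""
    w, b = model
    hits = sum((w * x + b > 0) == (y == 1) for x, y in test)
    return hits / len(test)

def holdout_accuracy(data, test_ratio, seed):
    """One random train/test split; the result depends on the seed."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    n_test = max(1, round(len(rows) * test_ratio))
    return accuracy(fit_logreg(rows[n_test:]), rows[:n_test])

def k_fold_accuracies(data, k, seed):
    """k-fold cross validation: every row is used for testing exactly once."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [r for j, f in enumerate(folds) if j != i for r in f]
        scores.append(accuracy(fit_logreg(train), folds[i]))
    return scores

data = make_data()
holdout = [holdout_accuracy(data, 0.1, seed) for seed in range(10)]
cv = k_fold_accuracies(data, k=10, seed=0)
print("90:10 holdout, 10 seeds: min", min(holdout), "max", max(holdout))
print("10-fold CV mean accuracy:", sum(cv) / len(cv))
```

The spread between the holdout minimum and maximum is the effect described above; the cross-validation mean is the more stable number to compare across tools, provided both use the same settings.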

Thanks

Varun

https://www.varunmandalapu.com/

Be Safe. Follow precautions and maintain social distancing.

I suggest that you go through all the parameters of logistic regression and understand their meaning (there are quite a few!). The Help section for the operator explains them quite well.

I reproduced exactly the same process with the following logistic regression parameters and got 80.92% accuracy 'out of the box', see below.

Otherwise it's hard to tell without knowing your parameter settings (I also have no idea how Orange sets up logistic regression by default).

Vladimir

http://whatthefraud.wtf

Thanks for your input. Initially I didn't make any changes to the regression parameters, but after replicating your parameter settings the accuracy is now 76.76%.

Note: I'm very new to RapidMiner and data analytics in general, so I'm not too familiar with the parameters and what they should be; I'm currently researching this for report purposes. Attached is a screenshot of my current parameters.
