RapidMiner

Looking for Linear Regression Help

Contributor

I'm currently taking a class on data mining, and for a project I'm trying to use RapidMiner to build multiple linear regression models from Steam data.

I have two Excel sheets: one listing 80 games, which serves as my training data, and another listing 20 games.

Both lists contain each game's name, price, current owners, players in the past two weeks, median play time (in seconds), and score (1 = generally positive review scores, 0 = bad scores).

 

I am trying to figure out whether I can predict if a game will be popular or be reviewed positively based on its price, number of owners, and number of players.

Or, alternatively, predict the number of players from the price, owners, and ratings.

 

I tried following other linear regression tutorials I found online, but I couldn't quite figure out whether they fit my particular case.

Any advice would be greatly appreciated.

 

I attached my work so far.

2 REPLIES
Moderator

Re: Looking for Linear Regression Help

If the outcome label is one or zero, then you are looking at a classification task. I'm not sure whether you are required to use linear regression for your schoolwork, but I would also investigate other algorithms such as decision trees, neural nets, and even SVMs.
RM Certified Expert

Re: Looking for Linear Regression Help

Given that your label variable is actually binary, I would recommend logistic regression rather than linear regression. It was developed specifically for this type of label and addresses a number of conceptual limitations of linear regression in such cases. A binary label also lets you use other classification algorithms entirely, as @Thomas_Ott describes.
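
To illustrate one of those limitations, here is a minimal Python sketch (outside RapidMiner, and only a sketch). It assumes made-up toy data, not your Steam sheets: a single hypothetical feature (owners in millions) and a 0/1 score label. It fits a straight line and a logistic curve to the same points, so you can see that the line can predict values above 1 for a 0/1 label, while the logistic output always stays between 0 and 1.

```python
import math

# Hypothetical toy data (an assumption for illustration, NOT the real Steam data):
# xs = owners in millions, ys = 1 for generally positive reviews, 0 otherwise.
xs = [0.1, 0.3, 0.5, 0.8, 1.0, 1.5, 2.0, 3.0, 5.0, 8.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]


def fit_linear(xs, ys):
    """Ordinary least squares for y = a + b*x (closed-form solution)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Logistic regression p = sigmoid(a + b*x), fitted by batch gradient
    descent on the average log loss. lr and steps are arbitrary choices."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(a + b * x) - y for x, y in zip(xs, ys)]
        grad_a = sum(errors) / n
        grad_b = sum(e * x for e, x in zip(errors, xs)) / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


lin_a, lin_b = fit_linear(xs, ys)
log_a, log_b = fit_logistic(xs, ys)

# Compare predictions inside and beyond the range of the training data.
for x in (0.0, 2.0, 10.0):
    linear_pred = lin_a + lin_b * x
    logistic_pred = sigmoid(log_a + log_b * x)
    print(f"owners={x:5.1f}M  linear={linear_pred:6.3f}  logistic={logistic_pred:6.3f}")
```

The linear prediction for the largest input exceeds 1, which has no meaning as a probability of a positive score, whereas the logistic prediction can be read directly as one. In RapidMiner you would get the same kind of model from its logistic regression operator instead of writing this by hand.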

 

Brian T., Lindon Ventures - www.lindonventures.com
Analytics Consulting by Certified RapidMiner Analysts