Looking for Linear Regression Help

radler Member Posts: 1 Contributor I
edited November 2018 in Help

I'm currently taking a data mining class, and for a project I'm trying to use RapidMiner to build multiple linear regression models on data from Steam.

I have two Excel sheets: one lists 80 games and serves as my training data, and the other lists 20 games.

Both sheets contain each game's name, price, current owners, players in the past two weeks, median play time (in seconds), and score (1 = generally positive review scores, 0 = bad scores).

I want to find out whether I can predict if a game will be popular or will review positively based on its price, number of owners, and number of players.

Or

Predict the number of players from the price, owners, and ratings.

I tried following other linear regression tutorials I've seen online, but couldn't quite figure out whether they're right for my particular case.

Any advice would be greatly appreciated.

I attached my work so far.

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
    If the outcome label is one or zero, then you are looking at a classification task. I'm not sure whether you are required to use linear regression as part of your schoolwork, but I would also investigate other algorithms such as decision trees, neural nets, and even SVMs.
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    Given that your label variable is actually binary, I would recommend logistic regression rather than linear regression. It was developed specifically for this type of label and addresses a number of conceptual limitations of linear regression in such cases (for example, a linear model can predict values below 0 or above 1). You can also use other algorithms entirely, as @Thomas_Ott describes.

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
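
For anyone who wants to see what the logistic regression suggestion above amounts to outside of RapidMiner, here is a rough Python sketch, not the poster's actual process. Everything in it is an assumption for illustration: the three feature columns (price, owners, players), the synthetic 80/20 game data, and the made-up rule that generates the 0/1 score all stand in for the real Steam sheets, and the helper names (`make_fake_games`, `fit_logistic`, and so on) are hypothetical. It fits the model with plain gradient descent so it needs nothing beyond the standard library.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def column_stats(rows):
    """Per-column mean and (population) standard deviation."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(k)]
    return means, stds


def scale(rows, means, stds):
    """Standardize rows using statistics taken from the training data."""
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def predict_proba(w, b, x):
    """Estimated probability that the label is 1 for one feature row."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def log_loss(w, b, X, y):
    """Mean negative log-likelihood of the labels under the model."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict_proba(w, b, xi), eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by full-batch gradient descent on the log loss."""
    n, k = len(X), len(X[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(k):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def make_fake_games(n, rng):
    """SYNTHETIC placeholder data, not real Steam numbers.
    The label rule is invented only so there is something to learn."""
    rows, labels = [], []
    for _ in range(n):
        price = rng.uniform(0, 60)
        owners = rng.uniform(1e4, 1e6)
        players = owners * rng.uniform(0.01, 0.2)
        rows.append([price, owners, players])
        labels.append(1 if players > 50_000 else 0)
    return rows, labels


rng = random.Random(0)
train_X, train_y = make_fake_games(80, rng)   # stands in for the 80-game sheet
test_X, test_y = make_fake_games(20, rng)     # stands in for the 20-game sheet

means, stds = column_stats(train_X)
w, b = fit_logistic(scale(train_X, means, stds), train_y)

probs = [predict_proba(w, b, x) for x in scale(test_X, means, stds)]
preds = [1 if p >= 0.5 else 0 for p in probs]
accuracy = sum(p == t for p, t in zip(preds, test_y)) / len(test_y)
print(f"weights={w}, bias={b:.3f}, accuracy on the 20 games={accuracy:.2f}")
```

The point of the sketch is that the model outputs a probability between 0 and 1 for each game, which is then thresholded into a 0/1 prediction. An ordinary linear regression on the same label has no such guarantee. In RapidMiner itself the equivalent would presumably be swapping the linear regression operator for a logistic regression one and checking performance on the 20-game sheet.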