How to convert binominal predictions into numerical?

nudelsuppe113nudelsuppe113 Member Posts: 4 Learner I
I have a Data Set with a binominal label which contains the information if a student has passed a test. I want to predict the actual points of the test by using rapidminer. After logistic regression, all I get is the binominal prediction. How can I convert this into numeric values or am I doing something wrong before?

Thx in advance

Open Spoiler:
So I have the attributes G1 and G2 both having values from 0 to 20 aka the points of the respective test. And in G3 I have just 0 and 1, is there a way to predict the points like in G1 and G2 with the help of the other attributes?


Answers

  • MartinLiebigMartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist
    why don't you just use the points as labels in first place? That feels more natural if you really need the points.

    Best,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • Telcontar120Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    I agree with Martin, the easiest solution here is to make a numerical prediction based on either G1 or G2 or their combination.  
    If you really didn't want to do that for whatever reason, you could use the binominal prediction score (which should still be a number between zero and one) and then use that in a regression to predict either G1 or G2 (or combo) to calibrate your score to points, but that is definitely the long way around and potentially introduces additional error, bias, and variance to your prediction.  

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • nudelsuppe113nudelsuppe113 Member Posts: 4 Learner I
    edited January 2021

    Thanks for the answer @mschmitz!

    I understand your advice but, I don't know what to do with the predicted points of G1 and G2, how can I predict G3?

    Spoiler: description of the set + task

    G3 has the results of the final test, 0 represents 0-10 points, 1 represents 11-20 points.
    G1 and G2 are smaller tests, having no impact on the final grade (e.g. average).

    My task is, to predict whether a student passes or fails G3 and how much points he reaches in that test. 

    So my first asumption was to predict the real number between of G3 0 and 1 and then multiply it with 20 to get the actual points. I guess that's not how it works.


    I guess I'm missing something.


  • nudelsuppe113nudelsuppe113 Member Posts: 4 Learner I
    I need more help here please
  • MartinLiebigMartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist
    Hi,
    why not take G1 and G2 as a label and predict it?
    Best,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • nudelsuppe113nudelsuppe113 Member Posts: 4 Learner I
    @mschmitz
    okay but how can I predict G3 with the help of them? I mean, do I have calculate G3 with some kind of an equation depending on the weights of G1 and G2? 
    I can't just take the predicted points of G1 or G2 as values of G3.
    Example: if a student hasn't finished G3 successfully, he must have 0-10 points. But if the predicted points of G1 or/and G2 for the same student are like 14, the value of G3 would be wrong. Do I have a mistake in reasoning here, or is there a high accurate way to calculate the points of G3?

    thx in advance
  • Telcontar120Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    You don't need to specify a particular equation in advance.  That depends on the ML algorithm you use for the prediction.  If you use a linear regression, then yes it equates to a single definite linear equation relating G1 and G2 to G3.  But if you use other approaches like neural nets or regression trees then you can have what is in effect a more complex, non singular and non linear relationship between G1 and G2 and G3.

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
Sign In or Register to comment.