Newbie question - which model?

cvhcvh Member Posts: 5 Contributor II
edited November 2018 in Help
Good day all,

I am new to both data mining and RapidMiner. I'd be grateful for thoughts on which model(s) to use to generate a prediction set where the training set has the following elements:

Element A: a positive real number with two decimal places
Element B: a positive real number with two decimal places
Element C: a positive integer

The prediction should be the result of
(A/B) * C
I am using a small dataset of just 20 rows to get a grasp on the issue before going larger.

Thank you!

Answers

  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    if you already know the target formula, just use the Generate Attributes operator :)

    If you don't, I would suggest combining Linear Regression with a Generate Function Set operator that has "use mult" and "use div" checked. If you apply the linear regression after this, the result should look similar to the formula you noted.

    Greetings,
      Sebastian
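
To illustrate the second suggestion outside RapidMiner, here is a minimal Python sketch of the idea: generate product and quotient features, then see which one a least-squares line fits best. The toy rows, the helper names `generate_functions` and `fit_line`, and the choice to fit one feature at a time are all assumptions made for brevity; this is not how the RapidMiner operators are implemented. The sketch runs the generation step twice because (A/B) * C nests two operations.

```python
from itertools import permutations

# Hypothetical toy rows (A, B, C); the label follows the thread's formula (A/B) * C.
rows = [(2.50, 1.25, 4), (9.00, 2.00, 2), (1.20, 0.40, 5),
        (6.00, 4.00, 1), (4.00, 0.50, 3)]
label = [(a / b) * c for a, b, c in rows]
columns = {name: [row[i] for row in rows] for i, name in enumerate("ABC")}

def generate_functions(cols):
    """Add the product and quotient of every ordered pair of columns."""
    out = dict(cols)
    for (n1, v1), (n2, v2) in permutations(cols.items(), 2):
        out[f"({n1}*{n2})"] = [x * y for x, y in zip(v1, v2)]
        out[f"({n1}/{n2})"] = [x / y for x, y in zip(v1, v2)]
    return out

def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept; returns (sse, slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx < 1e-12:  # skip (near-)constant columns
        return None
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((slope * xi + intercept - yi) ** 2 for xi, yi in zip(x, y))
    return sse, slope, intercept

# Two rounds of generation, because (A/B) * C nests two operations.
candidates = generate_functions(generate_functions(columns))

# Fit each candidate feature separately and keep the one with the smallest error.
fits = {}
for name, values in candidates.items():
    fit = fit_line(values, label)
    if fit is not None:
        fits[name] = fit
best_name = min(fits, key=lambda name: fits[name][0])
best_sse, best_slope, best_intercept = fits[best_name]
print(best_name, round(best_slope, 6), round(best_intercept, 6))
```

Several generated features are algebraically equal to (A/B) * C, so which name is printed depends on floating-point rounding; the fitted slope should be close to 1 and the intercept close to 0.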
  • cvhcvh Member Posts: 5 Contributor II
    Thanks for your reply, Sebastian. Please forgive my basic questions; I'm finding it a bit difficult without the 5.0 documentation in English (which I would otherwise read carefully before posting! ;) ).

    In my repository, Retrieve is my example set and Retrieve (2) is my prediction set. If I am to use the Generate Attributes operator alone, which ports should be connected?

    Thank you.

  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    please connect the output port of the Retrieve operator to the input port of the Generate Attributes operator. It will add a new attribute to this example set following the formula you entered. Which example set you connect depends on what you want to do. If you want to test the performance of this setting, use an example set where the real label is available, so that you can compare it with the prediction using a Performance operator.

    Greetings,
    Sebastian
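
As a rough picture of that comparison, the following Python sketch adds a prediction attribute from the formula and then measures how far it is from the real label. The example values, the dictionary layout, and the choice of RMSE and mean absolute error are assumptions for illustration; the thread does not say which measures the Performance operator reports.

```python
import math

# Hypothetical example set in which the real label is available.
examples = [
    {"A": 2.50, "B": 1.25, "C": 4, "label": 8.0},
    {"A": 9.00, "B": 3.00, "C": 2, "label": 6.0},
    {"A": 7.50, "B": 2.50, "C": 1, "label": 3.5},  # label deliberately off by 0.5
]

# Equivalent of generating a new attribute from the formula (A/B) * C.
for ex in examples:
    ex["prediction"] = (ex["A"] / ex["B"]) * ex["C"]

# Compare prediction and real label row by row.
errors = [ex["prediction"] - ex["label"] for ex in examples]
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
mae = sum(abs(e) for e in errors) / len(errors)
print(f"RMSE={rmse:.4f}  MAE={mae:.4f}")
```

Without a real label in the example set there is nothing to compare against, which is why the choice of example set matters here.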
  • cvhcvh Member Posts: 5 Contributor II
    Thank you Sebastian, I have now been able to move forward.

    I am now trying the Apply To Test Set template to gain a better understanding. It seems I need the Attributes Editor to tell RapidMiner that my label column is of type real, not nominal, but I cannot find the Attributes Editor in RM5.0. I saw this post, which indicated that this feature was still under development in RM5.0:

    http://rapid-i.com/rapidforum/index.php/topic,1615.0.html

    Regards,

    Clive
  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Clive,
    you could use the Parse Numbers operator instead of the Attribute Editor, which is still unavailable. It will parse a given set of attributes and create numerical ones from the nominal values if possible.

    Greetings,
      Sebastian
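
The "if possible" part can be pictured with a small Python sketch: a column of nominal (string) values becomes numerical only when every value parses as a number. The function name `parse_numbers` and the fallback of returning the column unchanged are assumptions for illustration; the thread does not describe how the operator treats values it cannot parse.

```python
def parse_numbers(column):
    """Return a list of floats if every value parses as a number,
    otherwise return the original column unchanged (assumed fallback)."""
    try:
        return [float(value) for value in column]
    except ValueError:
        return column

# Hypothetical label column that was read in as nominal strings.
label_nominal = ["8.00", "6.00", "15.00"]
# Hypothetical column that is genuinely nominal.
colour = ["red", "green", "blue"]

print(parse_numbers(label_nominal))
print(parse_numbers(colour))
```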