Attribute Generation

Lamaa Member Posts: 1 Newbie
I have a data file for evaluating restaurant performance. The business question is: which neighborhood has the most attractive restaurants (cost vs. rate)? Here, the neighborhood is the area where the restaurant operates, cost is the average cost of eating there, and rate is the rating given by customers.

So, how can I generate a new attribute that combines cost and rate to find the best neighborhood?

Thank you in advance.


  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    Hello @Lamaa

    You can define a scale before coding a cost vs. rate attribute. For example, if the average cost of eating at restaurant X is 25 dollars and its rating is 3 out of 5 stars, you can label it a "Medium expensive - Medium rated" place and code this with an IF statement in the Generate Attributes operator. Similarly, if the cost is 40 dollars and the rating is 4 out of 5 stars, you can code it as a "Costly - Well rated" place.

    You first need to come up with a scale based on ratings and average cost; then it is a matter of writing IF statements in the Generate Attributes operator to create an attribute such as CostVsRate.
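
To make the IF-statement idea concrete, here is a minimal sketch in plain Python rather than RapidMiner's expression language. The thresholds (20 and 35 dollars, 2.5 and 3.5 stars), the extra labels, and the field names `avg_cost` and `rating` are assumptions chosen only so that the two examples above come out as described; the real scale should come from your own data.

```python
# Hypothetical thresholds: replace them with a scale that fits your data.
def cost_level(avg_cost):
    """Map an average cost in dollars to a coarse cost label."""
    if avg_cost < 20:
        return "Cheap"
    elif avg_cost < 35:
        return "Medium expensive"
    return "Costly"


def rating_level(stars):
    """Map a 0-5 star rating to a coarse rating label."""
    if stars < 2.5:
        return "Poorly rated"
    elif stars < 3.5:
        return "Medium rated"
    return "Well rated"


def cost_vs_rate(avg_cost, stars):
    """Combine both labels into one CostVsRate value."""
    return f"{cost_level(avg_cost)} - {rating_level(stars)}"


# Illustrative rows only, not taken from the poster's dataset.
restaurants = [
    {"name": "X", "avg_cost": 25, "rating": 3},
    {"name": "Y", "avg_cost": 40, "rating": 4},
]
for row in restaurants:
    row["CostVsRate"] = cost_vs_rate(row["avg_cost"], row["rating"])
    print(row["name"], "->", row["CostVsRate"])
```

In RapidMiner the same nested conditions would go into a single expression in Generate Attributes; the Python version is only meant to show the logic.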

    Hope this helps. If you are looking for something else, please provide detailed information about your idea and the dataset.

    Be Safe. Follow precautions and Maintain Social Distancing

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    You can also use Discretize by User Specification if you want to create bins out of continuous numerical attributes.  The bin boundaries can be arbitrarily defined by the user in this operator.
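
As a rough illustration of binning with user-defined boundaries, the sketch below mimics the idea in plain Python. It is not the operator's implementation: the bin names, the limits, and the choice to treat each limit as an inclusive upper bound are all assumptions made for the example.

```python
from bisect import bisect_left
import math

# Hypothetical bins as (label, upper limit); the last bin catches everything else.
COST_BINS = [("cheap", 20.0), ("medium", 35.0), ("costly", math.inf)]


def discretize(value, bins):
    """Return the label of the first bin whose upper limit is >= value."""
    limits = [upper for _, upper in bins]  # must be sorted ascending
    return bins[bisect_left(limits, value)][0]


for cost in (12.5, 20.0, 25.0, 40.0):
    print(cost, "->", discretize(cost, COST_BINS))
```

The same function can be reused for ratings with a second list of bins, after which the two nominal attributes can be combined into one.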

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • M_Martin RapidMiner Certified Analyst, Member Posts: 125 Unicorn
    Hi Lamaa: In addition to the good suggestions above, it might be a good idea to speak with the business users who will ultimately use the outputs of your model. They may be able to give you some guidance on the boundaries of the classifications you create (as suggested by varunm1) or on the bins produced by discretizing your data (as suggested by Telcontar120). This could also have the benefit of making your model outputs more understandable to business users. Best wishes, Michael Martin