Performance (Classification) question

tonyboy9 Member Posts: 113 Contributor II
edited September 2020 in Help
Please note the left and right sides of my credit card data set for clustering:





Next comes the process, which has been held up for too long by Performance error messages. This is the latest one: Missing label. Input Example Set does not have a label attribute. I checked, and the Set Role operator's Parameters do have an attribute set as the label.



I ran it once more. Now I get: Non-nominal label. The label attribute (Purchases_Frequency) must be nominal for the calculation of performance criteria for classification tasks.

I really want to see this through, to prove Performance (Classification) is worth all this time. Please send me some helpful suggestions. Thanks for your time.
Tony



  

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    I would add a breakpoint right before that operator and check whether the clustering operator has modified your roles. If it has, you can simply add another Set Role just before the Performance operator to set the roles as they are needed.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • tonyboy9 Member Posts: 113 Contributor II
    Okay, Brian, I'm new to breakpoints, and I think I got lost here. I went to the Performance operator and set a breakpoint before it.



    Then I tried Shift + F7, and nothing happened. Then I clicked the Run button and got this in Results.



    Where did I go wrong? Thanks.

    Tony
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    It doesn't look to me like you have a nominal label at this point (your green column looks like it is numerical). So just add another Set Role before that Performance (Classification) operator, set the appropriate attribute's role to label, and make sure its data type is nominal.
     
    Brian T.
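To illustrate the numeric-versus-nominal distinction behind the "Non-nominal label" error, here is a minimal Python sketch (not RapidMiner code). It bins a numeric column into named categories, which is the kind of conversion a classification label would need. The function name `to_nominal`, the cut points, and the sample values are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def to_nominal(value, cut_points, names):
    """Map a numeric value to a category name using sorted cut points.

    cut_points must have len(names) - 1 entries; a value equal to a
    cut point falls into the higher bin.
    """
    if len(names) != len(cut_points) + 1:
        raise ValueError("need exactly one more name than cut points")
    return names[bisect_right(cut_points, value)]


# Hypothetical numeric values standing in for a purchase-frequency column.
purchases_frequency = [0.0, 0.17, 0.5, 0.83, 1.0]

# Assumed thresholds: below 0.33 is "low", below 0.66 is "medium", else "high".
labels = [
    to_nominal(v, [0.33, 0.66], ["low", "medium", "high"])
    for v in purchases_frequency
]
print(labels)
```

Once the values are category names rather than numbers, a predicted category can be compared with the actual one, which is what classification performance criteria rely on.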
  • tonyboy9 Member Posts: 113 Contributor II
    Brian, can we go back to the RapidMiner definition of the k-means operator I'm trying to use to divide credit card customers into groups?

    As no Label Attribute is necessary, Clustering can be used on unlabelled data and is an algorithm of unsupervised machine learning. The k-means algorithm determines a set of k clusters and assigns each Example to exactly one cluster. The clusters consist of similar Examples.

    Somehow I think I need to have a label attribute, so in Set Role I chose Purchases Frequency as label. Is this not contrary to the purposes of a k-means operator, to be labeled?

    Now to make Performance work, I need another Set Role with a different attribute which is nominal, not Purchases Frequency (green column) which is numeric.

    The only way I understand this is to go back into TurboPrep and change the Purchases Frequency attribute type to something other than 'Number.'

    Thanks for your time.  
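The k-means definition quoted above says no label attribute is needed. As a rough illustration only, here is a small Python sketch (not the RapidMiner operator) that clusters unlabelled points and then measures how tight the clusters are without referring to any label. The function names, the choice to seed centroids with the first k points, and the sample data are all assumptions made for this example.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means on tuples of numbers; no label column is involved."""
    # Assumption for reproducibility: seed centroids with the first k points.
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(x) / len(members) for x in zip(*members)
                )
    return assignment, centroids


def avg_within_cluster_distance(points, assignment, centroids):
    """Mean distance from each point to its own centroid (lower is tighter)."""
    total = sum(math.dist(p, centroids[a]) for p, a in zip(points, assignment))
    return total / len(points)


# Hypothetical customers described by two numeric attributes.
customers = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0),
             (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
assignment, centroids = kmeans(customers, k=2)
print(assignment)
print(avg_within_cluster_distance(customers, assignment, centroids))
```

Every input column here is an ordinary numeric attribute, and the quality measure only uses distances to centroids, so nothing in the sketch requires a nominal label.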




     