prediction accuracy for clustering

komeil_shaeri Member Posts: 13 Contributor II
edited September 2019 in Help

Hi everyone, 

I am wondering how I can get "prediction accuracy" for different clustering techniques in RapidMiner. The default performance measures, such as "cluster distance performance" or "cluster density performance", do not provide prediction accuracy.

 

Thanks in advance,

Best Answer

  • komeil_shaeri Member Posts: 13 Contributor II
    Solution Accepted

    I found the solution: the "Map Clustering on Labels" operator.

    Now I have managed to get accuracy, precision, and recall for k-means.

     

    Thanks
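
A minimal sketch of the idea behind mapping clusters onto labels, written in plain Python rather than as a RapidMiner process. It assumes a one-to-one mapping chosen to maximize the number of matching examples; the operator's actual internals are not described in this thread, so treat the strategy, the function names, and the toy data as assumptions.

```python
from collections import Counter
from itertools import permutations


def map_clusters_to_labels(cluster_ids, labels):
    """Return the one-to-one cluster -> label mapping with the most matches."""
    clusters = sorted(set(cluster_ids))
    classes = sorted(set(labels))
    if len(clusters) > len(classes):
        raise ValueError("more clusters than classes: no one-to-one mapping")

    # Contingency counts: how often each (cluster, label) pair occurs.
    counts = Counter(zip(cluster_ids, labels))

    best_mapping, best_hits = None, -1
    # Brute force over all assignments; only practical for a handful of clusters.
    for perm in permutations(classes, len(clusters)):
        hits = sum(counts[(c, l)] for c, l in zip(clusters, perm))
        if hits > best_hits:
            best_mapping, best_hits = dict(zip(clusters, perm)), hits
    return best_mapping


def clustering_accuracy(cluster_ids, labels):
    """Accuracy of the clustering after translating clusters into labels."""
    mapping = map_clusters_to_labels(cluster_ids, labels)
    predicted = [mapping[c] for c in cluster_ids]
    correct = sum(p == t for p, t in zip(predicted, labels))
    return correct / len(labels)


# Hypothetical toy data: six examples, two clusters, two classes.
cluster_ids = ["cluster_0", "cluster_0", "cluster_1", "cluster_1", "cluster_1", "cluster_0"]
labels = ["yes", "yes", "no", "no", "yes", "no"]

mapping = map_clusters_to_labels(cluster_ids, labels)
acc = clustering_accuracy(cluster_ids, labels)
print(mapping, acc)
```

The brute-force search grows factorially with the number of clusters. For larger problems, an assignment solver such as `scipy.optimize.linear_sum_assignment` applied to the negated contingency table would be one way to get the same mapping.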


Answers

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,053  RM Data Scientist

    Hi,

     

    How do you define accuracy when there is no label, as in usual clustering problems?

     

    ~Martin

    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • komeil_shaeri Member Posts: 13 Contributor II

    For example, my dataset has 11 regular attributes and 1 label. In classification, we get prediction accuracy from the confusion matrix. For clustering there is no such thing, but I need a prediction performance measure because many papers compare their clustering algorithms on it. For instance, the attached image shows one of these papers, which reports prediction performance for expectation maximization and k-means.

     

    Thanks

    222.PNG 33.2K
  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,053  RM Data Scientist

    Hi,

     

    In usual clustering problems you don't have a label. How do you want to assign a cluster ID to a class?

     

    ~Martin

    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • komeil_shaeri Member Posts: 13 Contributor II

    Hi, 

    This is how MATLAB defines clustering performance. MATLAB provides a confusion matrix for both classification and clustering models. Please see pages 5 and 6 in the attached PDF. I need the same thing in the RapidMiner environment.

     

    Thanks

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,053  RM Data Scientist

    Wow, good catch. That's an operator I did not come across in my last 6 years.


    ~Martin

    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • Fred12 Member Posts: 344   Unicorn

    Would it not be possible to simply use the Performance (Classification) operator after the Apply Model operator, to see in how many cases a cluster matches a certain class?
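
A rough sketch of what such a classification-style evaluation computes once each cluster has been translated into a predicted class. It is plain Python, not a RapidMiner process; the function name and the toy data are hypothetical, and the predictions are assumed to come from an earlier cluster-to-class mapping step.

```python
from collections import Counter


def precision_recall(predicted, actual):
    """Per-class (precision, recall) from predicted and actual labels."""
    pairs = Counter(zip(predicted, actual))  # confusion-matrix cells
    scores = {}
    for cls in sorted(set(predicted) | set(actual)):
        true_pos = pairs[(cls, cls)]
        predicted_as_cls = sum(n for (p, _), n in pairs.items() if p == cls)
        actually_cls = sum(n for (_, a), n in pairs.items() if a == cls)
        # Report 0.0 when a class was never predicted or never occurs.
        precision = true_pos / predicted_as_cls if predicted_as_cls else 0.0
        recall = true_pos / actually_cls if actually_cls else 0.0
        scores[cls] = (precision, recall)
    return scores


# Hypothetical data: cluster IDs already replaced by their mapped class.
predicted = ["yes", "yes", "no", "no", "no", "yes"]
actual = ["yes", "yes", "no", "no", "yes", "no"]

scores = precision_recall(predicted, actual)
for cls, (precision, recall) in scores.items():
    print(f"{cls}: precision={precision:.2f} recall={recall:.2f}")
```

Without the mapping step, cluster IDs and class names share no values, so every example would count as a mismatch.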

  • vand_boo99 Member Posts: 1 Contributor I

    I always get this message when I try to show my precision. Help, please.

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959  Community Manager

    Hello @vand_boo99, welcome to the community! I'd recommend posting your XML process here (see https://youtu.be/KkgB5QXWXJ8 and "Read Before Posting" on the right when you reply) and attaching your dataset. This way we can replicate what you're doing and help you better.

    Scott
