
Creating Charts for Performance Measurement Parameters

ozgeozyazar Member Posts: 21 Maven
edited June 2019 in Help
Hi all,

My question has two parts, and I would appreciate your advice. I have an imbalanced data set, and according to the literature, accuracy is not a reliable measure of classification performance for this kind of data.

I applied parameter optimization to a decision tree, using cross-validation with both binomial and classification performance. (I am currently using three feature selection algorithms and trying to determine their effect on classification performance.) First, I need to visualize ROC curves that show the different results for the same algorithm. For example, the first curve would show the result for a decision tree with criterion: accuracy, depth: 100, confidence: 0.050, and another would show the result for criterion: gain ratio, depth: 50, confidence: 0.200.

My second question is how I can graphically compare other measures such as kappa, NPV, and PPV.

I hope I have expressed my question clearly.

Your valuable contributions will be highly appreciated.



  • SGolbert RapidMiner Certified Analyst, Member Posts: 344 Unicorn

    I think I understand your question. The point about accuracy on imbalanced datasets is that you can have a very good accuracy value but terrible recall for the minority class. In that case, measures like the F-score may be better, but there is no universally best performance measure.
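To make the accuracy-versus-recall point concrete, here is a minimal Python sketch (outside RapidMiner) that computes several measures from one confusion matrix. The data is made up for illustration (95 negatives, 5 positives, and a classifier that finds only one positive), and the helper names `confusion_counts`, `safe_div`, and `summarize` are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def summarize(y_true, y_pred, positive=1):
    """Compute accuracy, recall, PPV, NPV, F1, and Cohen's kappa."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    n = tp + fp + fn + tn
    accuracy = safe_div(tp + tn, n)
    recall = safe_div(tp, tp + fn)
    ppv = safe_div(tp, tp + fp)  # positive predictive value (precision)
    npv = safe_div(tn, tn + fn)  # negative predictive value
    f1 = safe_div(2 * ppv * recall, ppv + recall)
    # Agreement expected by chance, from the row and column totals.
    expected = safe_div((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    kappa = safe_div(accuracy - expected, 1 - expected)
    return {"accuracy": accuracy, "recall": recall, "ppv": ppv,
            "npv": npv, "f1": f1, "kappa": kappa}


# Made-up imbalanced data: only 1 of the 5 positives is detected.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 95 + [1] + [0] * 4
metrics = summarize(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On this toy data the accuracy is high while recall, F1, and kappa are all low, which is the pattern described above.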

    During the optimization you can log all of the performance measures you use. To do so, activate the "log all criteria" option.

    You can then decide which parameter combination is best using a combination of criteria (not an easy task, since it is a multi-criteria optimization and there may be multiple "optimal" points).
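One common way to handle several "optimal" points is to keep only the parameter combinations that no other combination beats on every criterion (a Pareto front). The sketch below assumes the optimization log has been exported as a list of rows. The column names and all performance values are invented for illustration, and `dominates` and `pareto_front` are hypothetical helpers, not RapidMiner functions.

```python
def dominates(a, b, criteria):
    """True if row a is at least as good as row b on every criterion
    and strictly better on at least one (higher values are better)."""
    at_least_as_good = all(a[c] >= b[c] for c in criteria)
    strictly_better = any(a[c] > b[c] for c in criteria)
    return at_least_as_good and strictly_better


def pareto_front(rows, criteria):
    """Keep the rows that are not dominated by any other row."""
    return [
        row for row in rows
        if not any(dominates(other, row, criteria)
                   for other in rows if other is not row)
    ]


# Hypothetical exported log: parameter settings plus made-up results.
log = [
    {"criterion": "accuracy", "depth": 100, "confidence": 0.05,
     "f_measure": 0.40, "kappa": 0.30},
    {"criterion": "gain_ratio", "depth": 50, "confidence": 0.20,
     "f_measure": 0.55, "kappa": 0.28},
    {"criterion": "gini_index", "depth": 20, "confidence": 0.10,
     "f_measure": 0.35, "kappa": 0.25},
]

front = pareto_front(log, ["f_measure", "kappa"])
for row in front:
    print(row["criterion"], row["depth"], row["confidence"])
```

Here the third row is dropped because the other two beat it on both measures, while the first two remain since each wins on one measure. Choosing between them still requires a judgment about which criterion matters more.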

    Regarding the visualization, you could plot three columns at a time, but as with other multivariate problems, there is no universal visualization that represents all of the results.

    Let me know if this helps. Kind regards

  • ozgeozyazar Member Posts: 21 Maven
    Dear @SGolbert

    Many thanks for your time and your recommendation.

    Additionally, I need to create a precision-recall plot as well. I hope there is an operator that can help me create this plot.

    Özge Özyazar

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    The precision and recall plot is available directly in the output of the Performance operator.

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
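For readers who want to see what goes into a precision-recall plot, here is a minimal Python sketch that computes the points from true labels and prediction confidences. It is not how the Performance operator is implemented. The labels and scores are made up, and `precision_recall_points` is a hypothetical helper name.

```python
def precision_recall_points(y_true, scores, positive=1):
    """Return (threshold, precision, recall) tuples, one per distinct
    score, from the highest threshold to the lowest. An example is
    predicted positive when its score is >= the threshold."""
    total_pos = sum(1 for t in y_true if t == positive)
    points = []
    for thr in sorted(set(scores), reverse=True):
        predicted_pos = sum(1 for s in scores if s >= thr)
        tp = sum(1 for t, s in zip(y_true, scores)
                 if s >= thr and t == positive)
        # predicted_pos is at least 1 because thr is an observed score.
        precision = tp / predicted_pos
        recall = tp / total_pos if total_pos else 0.0
        points.append((thr, precision, recall))
    return points


# Made-up labels and confidences for the positive class.
y_true = [1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
points = precision_recall_points(y_true, scores)
for thr, precision, recall in points:
    print(f"threshold={thr:.1f} precision={precision:.2f} recall={recall:.2f}")
```

Plotting recall on the x-axis against precision on the y-axis gives the curve. Repeating this for each parameter combination allows the curves to be overlaid and compared.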