Lift Charts - Improvements in lower deciles
I know that there have been a number of discussions here on lift charts, including this one, https://community.rapidminer.com/discussion/55773/about-lift-chart, but I have to admit I am wondering why in so many of my examples (different datasets, different techniques), the Lift chart (or simple lift chart) output is showing situations where the hit rate/conversion actually goes up in deciles farther to the right. By definition, the data are sorted on confidence of the target class, descending, and you would normally see the hit rates drop with each decile, as I did with the same dataset/technique in a different tool.
Even in the example I linked to above, the hit rate actually goes up in decile 6. Admittedly, I very rarely see this in other tools, so I am wondering if there is an explanation or an intuition you can share for why it appears so often here in RM.
Above, the results are from a logistic regression.
Last but not least, is there a way to set a reference line on these charts to show the baseline % of the target? I think that would really simplify the visualization for people to understand the concept of lift.
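For anyone who wants to check the numbers outside RM, here is a minimal Python sketch of the calculation described above: sort on confidence descending, cut into equal-sized buckets, and compare each bucket's hit rate with the baseline rate of the target. The function names and the toy data are made up for illustration, and this is not meant to reflect how the Lift Chart operators are implemented.

```python
def decile_hit_rates(confidences, labels, n_bins=10):
    """Sort examples by confidence (descending), split them into n_bins
    nearly equal buckets, and return the hit rate of each bucket."""
    ranked = sorted(zip(confidences, labels), key=lambda p: p[0], reverse=True)
    n = len(ranked)
    rates = []
    for b in range(n_bins):
        start = b * n // n_bins
        end = (b + 1) * n // n_bins
        bucket = ranked[start:end]
        hits = sum(label for _, label in bucket)
        # An empty bucket (fewer examples than bins) has no defined hit rate.
        rates.append(hits / len(bucket) if bucket else float("nan"))
    return rates


def baseline_rate(labels):
    """Overall share of the target class: the value a reference line would show."""
    return sum(labels) / len(labels)


# Toy data (hypothetical): 10 examples, label 1 = target class.
confidences = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

rates = decile_hit_rates(confidences, labels, n_bins=5)
base = baseline_rate(labels)
lifts = [r / base for r in rates]

print("hit rate per bucket:", rates)
print("baseline:", base)
print("lift per bucket:", lifts)
```

In this toy example the fourth bucket has a higher hit rate than the third, which is the kind of bump I am asking about, and the baseline of 0.4 is the value I would like to see as a reference line.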
Best Answer
IngoRM Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
Hey there,
1) Simple LC vs. LC: I really do recommend using the newer Lift Chart (Simple) version; the other one is kind of unstable when the threshold values are very close together. This often leads, like in your example, to cases where you will not get the desired number of buckets. This most often happens for smaller data sets (like in your case with the 283 examples) and / or with models which produce only a limited set of discrete confidence values (for example, decision trees).
2) Reference line: this is currently not possible but may be a good idea indeed. There is some risk that the chart gets even busier, but it is definitely worth a try.
3) Change of slope: this can indeed happen, especially (like above) for smaller data sets and / or for models with a limited set of confidence values, e.g. decision trees. I know that some tools sometimes "cheat" to avoid this in their visualization, but I personally prefer to see this, TBH. And as above, it is much less likely to happen for larger data sets and for models like Naive Bayes and others which produce more fine-grained confidence values.
Hope this helps,
Ingo
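To put a rough number on the small-data-set point in 3), here is a back-of-the-envelope sketch in Python. It uses the binomial standard error of a bucket's hit rate to show how noisy a single decile is with 283 examples compared with a much larger data set. The 30% hit rate and the 100,000-example comparison are assumptions picked for illustration, not values from this thread, and `hit_rate_std_error` is just a helper name.

```python
import math


def hit_rate_std_error(true_rate, bucket_size):
    """Binomial standard error of an observed hit rate in one bucket."""
    return math.sqrt(true_rate * (1.0 - true_rate) / bucket_size)


n_examples = 283                  # data set size mentioned in the answer
bucket_size = n_examples // 10    # roughly 28 examples per decile
assumed_rate = 0.30               # hypothetical true hit rate (assumption)

se_small = hit_rate_std_error(assumed_rate, bucket_size)
se_large = hit_rate_std_error(assumed_rate, 100_000 // 10)  # hypothetical large set

print(f"std. error per decile, 283 examples:     {se_small:.3f}")
print(f"std. error per decile, 100,000 examples: {se_large:.3f}")
```

Under these assumptions, a decile with about 28 examples has a standard error of nearly 9 percentage points, so two neighbouring deciles whose true hit rates differ by only a few points can easily swap order in the chart. With 10,000 examples per decile the error drops below half a point.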