Classification accuracy (stacking)

keesloutebaan Member Posts: 2 Newbie
edited November 29 in Help
Hey there,

I am currently working on a polynominal (multi-class) classification project. The goal is to reach the highest possible accuracy.
I found that the 'deep learning' and 'gradient boosted trees' operators work really well.
Now I want to find out whether stacking can improve the performance. I tried a few combinations, but the performance drops every time.
Can someone tell me whether there are any important rules to take into account when it comes to stacking? When is it helpful, and which settings does it require?
Thanks a lot

Answers

  • keesloutebaan Member Posts: 2 Newbie
    edited November 29
    Thanks so much, that saves me a lot of time! In that case, I will try to improve my GBT by tuning its parameters. I also noticed that using 'bagging' improves the performance. Do you know which GBT parameters are the most important to tune? I started with number of trees, but there are many more.
  • BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 340 Unicorn
    Hi,
    As with any tree method, you can apply prepruning and postpruning.
    Prepruning is decided before a new split is created. The parameters maximal depth and min rows restrict which splits may be created, giving you a less complex (and maybe less overfitted) tree.
    Postpruning is decided after a split has been created. The parameter min split improvement applies a statistical test to each split result and decides whether the split was worth it. This also reduces the tree complexity.
    That said, GBT (like random forest) is meant to reduce the overfitting problem of decision trees, so it is entirely possible that your model won't become better by changing these settings. (Because it's already coping well with possibly overfitted trees.)
    For the other options, see the documentation. Their effect can be very data dependent.
    Regards,
    Balázs
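
To make the effect of the three parameters named above concrete, here is a minimal sketch of a single regression tree in plain Python. It is a toy illustration, not RapidMiner's GBT implementation: the functions `sse`, `best_split`, `build_tree` and `count_leaves` and the tiny data set are hypothetical, and "improvement" is assumed here to mean the relative reduction in squared error, which may differ from what the operator actually computes. The argument names only mirror the parameters maximal depth, min rows and min split improvement.

```python
from statistics import mean


def sse(ys):
    """Sum of squared errors of ys around their mean."""
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)


def best_split(xs, ys, min_rows):
    """Best threshold on x that leaves at least min_rows rows on each side.

    Returns (threshold, sse_after_split), or None if no split is allowed.
    """
    min_rows = max(1, min_rows)
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_rows, len(pairs) - min_rows + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot separate rows with identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = ((pairs[i - 1][0] + pairs[i][0]) / 2, total)
    return best


def build_tree(xs, ys, max_depth, min_rows, min_split_improvement, depth=0):
    """Grow a tree, stopping early according to the three parameters."""
    leaf = {"leaf": True, "value": mean(ys)}
    if depth >= max_depth:  # depth limit: do not even look for a split
        return leaf
    split = best_split(xs, ys, min_rows)  # min_rows limits candidate splits
    if split is None:
        return leaf
    threshold, sse_after = split
    parent = sse(ys)
    # Improvement check (assumed definition): keep the split only if the
    # relative error reduction reaches min_split_improvement.
    if parent == 0 or (parent - sse_after) / parent < min_split_improvement:
        return leaf
    left = [(x, y) for x, y in zip(xs, ys) if x <= threshold]
    right = [(x, y) for x, y in zip(xs, ys) if x > threshold]
    args = (max_depth, min_rows, min_split_improvement, depth + 1)
    return {
        "leaf": False,
        "threshold": threshold,
        "left": build_tree([x for x, _ in left], [y for _, y in left], *args),
        "right": build_tree([x for x, _ in right], [y for _, y in right], *args),
    }


def count_leaves(node):
    """Number of leaves, used here as a simple measure of tree complexity."""
    if node["leaf"]:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])


# Hypothetical data: two noisy plateaus around 1 and 5.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.1, 0.9, 1.0, 5.0, 5.2, 4.8, 5.0]

print("unrestricted:", count_leaves(build_tree(xs, ys, 10, 1, 0.0)))
print("max_depth=1:", count_leaves(build_tree(xs, ys, 1, 1, 0.0)))
print("min_rows=4:", count_leaves(build_tree(xs, ys, 10, 4, 0.0)))
print("min_split_improvement=0.9:", count_leaves(build_tree(xs, ys, 10, 1, 0.9)))
```

On this data, the unrestricted tree keeps splitting until it has memorized the noise, while each of the three restrictions stops after the one split that separates the two plateaus. Whether such a restriction helps inside a boosted ensemble is a separate question, for the reason given in the post above.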