What do M5, greedy and T-test mean?

sriongatc Member Posts: 1 Newbie
I am training a model with Linear Regression and would like to know what M5, greedy and T-test mean in the feature selection setting. Many thanks for considering my request. :'(


  • David_A Administrator, Moderator, Employee, RMResearcher, Member Posts: 277  RM Research
    edited November 5

    Those three are different strategies for reducing the number of features (or attributes or columns) that are considered in your model.
    In general it's a good idea to have as few influence factors as possible in your model, so that it's less susceptible to noise and errors. On the other hand, you don't want to lose potentially useful information. So choosing the right number of features is always a trade-off.

    M5, also called M5 Prime, selects the subset of attributes that improves the Akaike information criterion the most.
    T-test performs the statistical test of the same name to decide whether a feature has a significant influence on the target (the label).
    Greedy is a stepwise selection strategy where, in each round, the attribute with the lowest contribution (again based on the Akaike information criterion) is deselected. Since attributes are removed rather than added, it works as a backward elimination.
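
    To make the round-by-round deselection more concrete, here is a minimal Python sketch of an AIC-based elimination loop. It is not RapidMiner's implementation: the helper names (`solve`, `aic`, `backward_eliminate`), the synthetic data, and the rule "drop the attribute whose removal lowers the AIC most, stop when nothing improves" are all assumptions made for illustration.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def aic(X, y, cols):
    """AIC (up to an additive constant) of a least-squares fit with
    intercept on the given columns, assuming Gaussian errors."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(cols) + 1  # number of coefficients including the intercept
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(A, b)
    rss = sum((yi - sum(bj * rj for bj, rj in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    n = len(y)
    return n * math.log(rss / n) + 2 * k

def backward_eliminate(X, y):
    """Start with all columns; each round, drop the one whose removal
    lowers the AIC most. Stop when no removal improves the AIC."""
    selected = list(range(len(X[0])))
    best = aic(X, y, selected)
    while selected:
        candidates = [(aic(X, y, [c for c in selected if c != d]), d)
                      for d in selected]
        cand_aic, drop = min(candidates)
        if cand_aic >= best:
            break
        best = cand_aic
        selected.remove(drop)
    return selected, best

# Synthetic data (an assumption): columns 0 and 1 drive y, columns 2 and 3 are noise.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(4)] for _ in range(200)]
y = [3 * r[0] - 2 * r[1] + random.gauss(0, 0.5) for r in X]

selected, final_aic = backward_eliminate(X, y)
print("kept columns:", selected, "AIC:", round(final_aic, 2))
```

    The two informative columns should survive, because removing either one raises the residual error far more than the AIC penalty saves. Whether the pure-noise columns get dropped depends on the sample, which is one reason to validate the chosen strategy on independent data.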

    There's no golden rule for which selection strategy gives you the best results; that's best decided with an independent parameter optimization. But I highly recommend using some kind of feature selection for your regression model (especially if you have more than just a few attributes).

