Information gain and numerical attributes

IngoRM (Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor), Posts: 1,715, RM Founder
edited November 2018 in Help
Original message from SourceForge forum at http://sourceforge.net/forum/forum.php?thread_id=2043728&forum_id=390413

Hi,

How does RapidMiner handle numerical attributes when calculating information gain for feature selection? Is every occurring value used or does RM calculate several "bins"?


Answer by Ingo Mierswa:

Hello,

Are you referring to the InfoGainWeighting operator or to the information gain calculation inside a decision tree learner?

> Is every occurring value used or does RM calculate several "bins"?

Both are possible. If you discretize the values first with one of the discretization operators, those bins are used. If not, RM tries all possible split points.

Cheers,
Ingo
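
For reference, the quantity behind both cases can be written in its usual textbook form. The symbols below (S for the example set, A for the attribute, t for a candidate split point) are notation introduced here and are not taken from RapidMiner's documentation. The thread does not say whether the operator applies any further normalization, so treat this as a sketch of the general definition only.

```latex
% Entropy of the class labels in example set S (p_c = fraction of examples with class c)
H(S) = -\sum_{c} p_c \log_2 p_c

% Nominal or discretized attribute A: one subset S_v per value (bin) v
\mathrm{IG}(S, A) = H(S) - \sum_{v \in \mathrm{values}(A)} \frac{|S_v|}{|S|} \, H(S_v)

% Numerical attribute A with candidate split point t: two subsets
\mathrm{IG}(S, A, t) = H(S)
  - \frac{|S_{A \le t}|}{|S|} \, H(S_{A \le t})
  - \frac{|S_{A > t}|}{|S|} \, H(S_{A > t})
```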


Answer by topic starter:

Hi,

I was referring to the InfoGainWeighting operator, which is used for feature selection.
RapidMiner Wisdom 2020
February 11th and 12th 2020 in Boston, MA, USA

Answers

  • IngoRM
    Hi again,

    > I was referring to the InfoGainWeighting operator, which is used for feature selection.

    The same applies here: if the attributes are already nominal or have been discretized beforehand, the usual information gain is used. If not, the operator tries all possible split points between each pair of neighboring values, selects the split point with the highest gain, and delivers the corresponding value.

    Cheers,
    Ingo
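
The procedure described in this answer can be sketched in a few lines of Python. This is an illustration of the general idea, not RapidMiner's actual code: the function names `entropy`, `nominal_info_gain` and `best_numeric_split` are hypothetical, and placing each candidate split at the midpoint between two neighboring values is an assumption, since the thread only says the split lies "between" them.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def nominal_info_gain(values, labels):
    """Information gain of a nominal (or already discretized) attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_numeric_split(values, labels):
    """Try a split between every pair of neighboring distinct values.

    Returns (best_gain, best_threshold). The threshold is the midpoint
    between the two neighbors (an assumption made for this sketch).
    """
    pairs = sorted(zip(values, labels), key=lambda pair: pair[0])
    total = len(pairs)
    base = entropy(labels)
    best_gain, best_threshold = 0.0, None
    for i in range(1, total):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # equal values cannot be separated by a threshold
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        gain = (base
                - len(left) / total * entropy(left)
                - len(right) / total * entropy(right))
        if gain > best_gain:
            best_gain, best_threshold = gain, (low + high) / 2
    return best_gain, best_threshold
```

For the values 1.0, 2.0, 3.0, 4.0 with class labels a, a, b, b, this sketch picks the threshold 2.5 with a gain of 1 bit, because that split separates the two classes completely. This version recomputes the entropies for every candidate for readability; a running class count would avoid that on large data.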