"Brute force feature selection"

ammargh Member Posts: 27  Maven
edited June 2019 in Help
Shouldn't brute force feature selection return the best performance? The performance I got with features I selected manually was better than the performance I got with the features returned by the brute force selection operator.

Is this normal?

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869   Unicorn
    Hi,

    How big is the difference between your manual performance and the performance of the features found by brute force? Are you using a Cross Validation?

    Please keep in mind that by default the X-Validation uses random splits, so small performance differences can be produced by randomness alone. To enforce the same splits in every X-Validation, across all iterations and in your manual evaluation, set the local random seed of all X-Validation operators to the same constant. Then only the algorithms and the features are compared, and the factor "random" is eliminated.

    Best regards,
    Marius
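
The effect of a fixed seed can be illustrated outside RapidMiner with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy data, the nearest-centroid classifier, and the names `make_data`, `centroid_accuracy` and `cross_validate` are hypothetical and do not correspond to RapidMiner operators. It only shows that the same feature set scores identically when the splits are seeded, and may score differently from run to run when they are not.

```python
import random

def make_data(n=60, seed=0):
    """Toy data (assumed): two informative features, one noise feature."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = i % 2  # alternate labels so both classes are always present
        x = [rng.gauss(label, 1.0), rng.gauss(-label, 1.0), rng.gauss(0.0, 1.0)]
        rows.append((x, label))
    return rows

def centroid_accuracy(train, test, features):
    """Accuracy of a nearest-centroid classifier using only `features`."""
    centroids = {}
    for label in (0, 1):
        pts = [x for x, y in train if y == label]
        centroids[label] = [sum(p[f] for p in pts) / len(pts) for f in features]
    correct = 0
    for x, y in test:
        dist = {l: sum((x[f] - c[i]) ** 2 for i, f in enumerate(features))
                for l, c in centroids.items()}
        correct += (min(dist, key=dist.get) == y)
    return correct / len(test)

def cross_validate(data, features, k=5, seed=None):
    """k-fold CV; the seed alone decides which rows land in which fold."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)  # seed=None means a fresh random split
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [data[i] for i in idx if i not in held_out]
        test = [data[i] for i in fold]
        scores.append(centroid_accuracy(train, test, features))
    return sum(scores) / k

data = make_data()
fixed_a = cross_validate(data, [0, 1], seed=1992)
fixed_b = cross_validate(data, [0, 1], seed=1992)
unseeded = [cross_validate(data, [0, 1]) for _ in range(5)]

print("same seed, same score:", fixed_a == fixed_b)
print("unseeded runs:", unseeded)
```

With the seed fixed, any remaining difference between two feature sets comes from the features themselves and not from how the rows happened to be split.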
  • ammargh Member Posts: 27  Maven
    I see your point.
    I will follow your advice.
    Thank you.

  • mafern76 Member Posts: 45 Contributor II
    It should be impossible for a manual selection to do better than a brute force selection when both are evaluated in the same way, because the latter simply tries all possible combinations.

    I agree with Marius: your results are probably affected by randomness, and I might add that high variance in your model algorithm may also contribute.
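
The "tries all possible combinations" argument can be sketched in Python as follows. This is an illustration under stated assumptions, not how RapidMiner implements its operator: the toy data, the nearest-centroid classifier, and the names `make_folds`, `cv_accuracy` and `brute_force` are all hypothetical. The key point is that every subset, including the manual one, is scored on the same fixed folds, so the brute force result cannot be worse than the manual pick.

```python
import random
from itertools import combinations

def make_data(n=60, seed=0):
    """Toy data (assumed): two informative features, two noise features."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = i % 2  # alternate labels so both classes are always present
        x = [rng.gauss(label, 1.0), rng.gauss(-label, 1.0),
             rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        rows.append((x, label))
    return rows

def make_folds(n, k=5, seed=1992):
    """Build the folds once, from a constant seed, and reuse them everywhere."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cv_accuracy(data, features, folds):
    """Mean accuracy of a nearest-centroid classifier over the given folds."""
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [data[i] for i in range(len(data)) if i not in held_out]
        centroids = {}
        for label in (0, 1):
            pts = [x for x, y in train if y == label]
            centroids[label] = [sum(p[f] for p in pts) / len(pts) for f in features]
        correct = 0
        for i in fold:
            x, y = data[i]
            dist = {l: sum((x[f] - c[j]) ** 2 for j, f in enumerate(features))
                    for l, c in centroids.items()}
            correct += (min(dist, key=dist.get) == y)
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)

def brute_force(data, n_features, folds):
    """Score every non-empty feature subset and keep the best one."""
    best_subset, best_score = None, -1.0
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            score = cv_accuracy(data, subset, folds)
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score

data = make_data()
folds = make_folds(len(data))
best_subset, best_score = brute_force(data, 4, folds)
manual_score = cv_accuracy(data, (0, 2), folds)  # a hand-picked subset

print("brute force:", best_subset, best_score)
print("manual pick:", manual_score)
print("brute force at least as good:", best_score >= manual_score)
```

If the manual subset were scored on different random folds instead, the last comparison would no longer be guaranteed, which matches the situation described in the question.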