Letting an operator in RM know the coherences of variables?

Fred12 Member Posts: 344 Unicorn
edited November 2018 in Help


My dataset has several attributes that are constructed from other attributes.

E.g. I have the radius, diameter, circumference and area of something. Those values can all be calculated from the radius alone, so each of them just contributes an additional weighted attribute to the dataset (besides the base attribute, radius).

Is there any way to show an operator the correlations between some attributes, and to express those relationships as a formula or similar?

That way it could take these correlations into account and give better results, or select more relevant features.

Can RapidMiner do something that intelligent?


  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,517 RM Data Scientist

    What would be the purpose of it? Letting the learner know? The learner is not only using correlations but all dependencies. Good learners will incorporate those dependencies. That's the trick.



    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Adding to what @mschmitz said, the machine learning algorithms already take the inter-relationships between your features into account. You can, of course, build a correlation matrix with the Correlation Matrix operator (read its help text for the formula it applies) and then export the resulting feature weights. Then, using a Select by Weights operator, you can select the top 5 or 10 features and feed them into another machine learning algorithm.
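
    For readers who want to see the idea outside RapidMiner, here is a minimal plain-Python sketch of correlation-based filtering, using the radius example from the question. It is not the Correlation Matrix operator's own formula: it assumes plain Pearson correlation, and the `mass` column, the 0.95 threshold and the `drop_redundant` helper are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


radius = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
data = {
    "radius": radius,
    "diameter": [2 * r for r in radius],
    "circumference": [2 * math.pi * r for r in radius],
    "area": [math.pi * r ** 2 for r in radius],
    # Hypothetical attribute that does not depend on the radius.
    "mass": [3.1, 0.4, 2.7, 5.0, 1.2, 0.9],
}

names = list(data)
# Full pairwise correlation matrix as a nested dict.
corr = {a: {b: pearson(data[a], data[b]) for b in names} for a in names}


def drop_redundant(names, corr, threshold=0.95):
    """Keep an attribute only if it is not strongly correlated with one already kept."""
    kept = []
    for name in names:
        if all(abs(corr[name][k]) < threshold for k in kept):
            kept.append(name)
    return kept


kept = drop_redundant(names, corr)
print(kept)
```

    With this toy data only `radius` and `mass` survive. `area` is dropped as well because, over this small range, it is still almost linearly related to the radius.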


    Another suggestion is to investigate the "Weight by ..." operators. They use an algorithm to determine how heavily a feature influences your target label. There are some great ones, such as Weight by SVM, Weight by Tree, or Weight by Relief. All worth investigating.
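
    To give a feel for what such a weighting scheme computes, here is a rough plain-Python sketch of the basic two-class Relief idea: an attribute gains weight when it differs on the nearest example of the other class and loses weight when it differs on the nearest example of the same class. This is an illustration under simplifying assumptions (every row is used once as the probe, Manhattan distance, min-max scaling), not necessarily what the Weight by Relief operator does internally, and the toy data is invented.

```python
def relief_weights(rows, labels):
    """Basic two-class Relief, using every row once as the probe."""
    n_features = len(rows[0])
    # Scale each feature to [0, 1] so that differences are comparable.
    lows = [min(r[f] for r in rows) for f in range(n_features)]
    highs = [max(r[f] for r in rows) for f in range(n_features)]
    spans = [(hi - lo) or 1.0 for lo, hi in zip(lows, highs)]
    scaled = [[(r[f] - lows[f]) / spans[f] for f in range(n_features)] for r in rows]

    def distance(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))

    weights = [0.0] * n_features
    for i, probe in enumerate(scaled):
        hits = [s for j, s in enumerate(scaled) if j != i and labels[j] == labels[i]]
        misses = [s for j, s in enumerate(scaled) if labels[j] != labels[i]]
        near_hit = min(hits, key=lambda s: distance(probe, s))
        near_miss = min(misses, key=lambda s: distance(probe, s))
        for f in range(n_features):
            # Reward separation from the other class, penalize spread within the class.
            weights[f] += abs(probe[f] - near_miss[f]) - abs(probe[f] - near_hit[f])
    return [w / len(rows) for w in weights]


# Invented toy data: columns are radius, diameter (= 2 * radius) and a noise attribute.
rows = [
    [1.0, 2.0, 0.9],
    [2.0, 4.0, 0.1],
    [3.0, 6.0, 0.5],
    [4.0, 8.0, 0.2],
    [5.0, 10.0, 0.8],
    [6.0, 12.0, 0.4],
]
labels = [0, 0, 0, 1, 1, 1]  # 1 means "large", i.e. radius above 3

weights = relief_weights(rows, labels)
for name, w in zip(["radius", "diameter", "noise"], weights):
    print(f"{name}: {w:.3f}")
```

    Note that radius and diameter end up with the same weight here. A scheme like this scores each attribute against the label, so it ranks the noise attribute lower but does not by itself remove the redundancy between derived attributes.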



  • Fred12 Member Posts: 344 Unicorn

    Yeah, they are all worth investigating, but that's the problem...

    I get a different selection of attributes every time I try a different weighting or feature selection algorithm. Which of the selections should I choose? There are plenty of possibilities, and it's hard to try all of the combinations out on my learner...

    Or is that somehow possible? Is there some integrated approach to try out all weighting/selection methods on one model or several models and choose those that give the best performance (accuracy)?
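
    One way to picture such an integrated approach is a loop that scores every candidate attribute selection with the same validated learner and keeps the best one. The plain-Python sketch below is only an illustration outside RapidMiner: the data, the candidate selections and the choice of a 1-nearest-neighbour learner with leave-one-out validation are all assumptions.

```python
def loo_accuracy(rows, labels, columns):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on the given columns."""
    correct = 0
    for i, probe in enumerate(rows):
        # Nearest other row by squared Euclidean distance on the selected columns.
        best_j = min(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: sum((probe[c] - rows[j][c]) ** 2 for c in columns),
        )
        correct += labels[best_j] == labels[i]
    return correct / len(rows)


# Invented toy data: columns are radius, diameter (= 2 * radius) and a noise attribute.
rows = [
    [1.0, 2.0, 0.9],
    [1.5, 3.0, 0.1],
    [2.2, 4.4, 0.5],
    [2.6, 5.2, 0.3],
    [4.1, 8.2, 0.2],
    [5.0, 10.0, 0.8],
    [5.7, 11.4, 0.45],
    [6.3, 12.6, 0.6],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Hypothetical selections, as different weighting schemes might produce them.
candidates = {
    "radius_and_diameter": [0, 1],
    "radius_only": [0],
    "noise_only": [2],
    "all": [0, 1, 2],
}

scores = {name: loo_accuracy(rows, labels, cols) for name, cols in candidates.items()}

# Prefer higher accuracy; on a tie, prefer the selection with fewer attributes.
best = max(candidates, key=lambda name: (scores[name], -len(candidates[name])))

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
print("best:", best)
```

    The tie-break on the number of attributes is a design choice: several selections reach the same accuracy on this toy data, and the smallest one is kept. On real data the validation should be done on rows that were not used to compute the weights, otherwise the comparison is optimistic.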
