Compare attribute columns based on value ranges?

Fred12 Member Posts: 344 Unicorn
edited November 2018 in Help


I want to compare the values of two attribute columns from two different Excel files, e.g. radius1 and radius2.

I want to treat two values as equal (meaning their ID is the same) if they lie within a certain range of each other, e.g. radius1 = 1.77 and radius2 = 1.78.

As a formula: if radius1 is between 0.98*radius2 and 1.02*radius2, then the values count as equal.

Then I want to join all rows whose entries match according to the formula above.

Is it possible to identify equality based on ranges like this?


  • BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn



    If you don't have too much data, you could do a Cartesian Join, then use Generate Attributes to calculate the difference, and then use Filter Examples to keep only the examples with a small difference.
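
To make the three steps concrete outside RapidMiner, here is a minimal Python sketch of the same idea: build every pair of rows, compute whether the radii are close, and keep only the close pairs. The function names, the `id1`/`id2` columns, the sample rows, and the 2% tolerance default are assumptions for illustration; the rows are assumed to be already loaded from the two Excel files as lists of dictionaries.

```python
from itertools import product


def within_tolerance(r1, r2, tol=0.02):
    """Return True if r1 lies between (1 - tol) * r2 and (1 + tol) * r2."""
    # sorted() keeps the bounds in order even if r2 is negative
    low, high = sorted(((1 - tol) * r2, (1 + tol) * r2))
    return low <= r1 <= high


def cartesian_fuzzy_join(left, right, tol=0.02):
    """Pair every left row with every right row, keep pairs with close radii."""
    return [
        {**a, **b}  # merged row, like the output of a join
        for a, b in product(left, right)  # len(left) * len(right) pairs
        if within_tolerance(a["radius1"], b["radius2"], tol)
    ]


# Hypothetical sample data standing in for the two Excel files
left = [
    {"id1": 1, "radius1": 1.77},
    {"id1": 2, "radius1": 2.50},
    {"id1": 3, "radius1": 3.10},
]
right = [
    {"id2": "a", "radius2": 1.78},
    {"id2": "b", "radius2": 3.00},
]

matches = cartesian_fuzzy_join(left, right)
print(matches)
```

With this sample data only the pair 1.77 / 1.78 survives the filter, since 1.77 lies between 0.98*1.78 and 1.02*1.78. The number of intermediate pairs is the product of the two row counts, which is why this only suits small data.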


    If your example sets have many lines, Cartesian Join will create a huge data set. In that case, you might want to try this Generic Join approach with the built-in scripting:
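
The Generic Join process referred to above is not reproduced here. As a separate illustration of how the full Cartesian product can be avoided, the following Python sketch sorts one side by radius and uses binary search to find, for each row on the other side, only the candidates inside the allowed range. The function name `range_join`, the column names, and the sample rows are assumptions, not part of any RapidMiner process.

```python
from bisect import bisect_left, bisect_right


def range_join(left, right, tol=0.02):
    """Join rows where radius1 is within +/- tol (relative) of radius2.

    Sorts the left rows once and looks up each right row's range with
    binary search, so the full set of pairs is never materialized.
    """
    left_sorted = sorted(left, key=lambda row: row["radius1"])
    keys = [row["radius1"] for row in left_sorted]
    matches = []
    for b in right:
        # sorted() keeps the bounds in order even if radius2 is negative
        low, high = sorted(((1 - tol) * b["radius2"], (1 + tol) * b["radius2"]))
        start = bisect_left(keys, low)    # first key >= low
        stop = bisect_right(keys, high)   # first key > high
        for a in left_sorted[start:stop]:
            matches.append({**a, **b})
    return matches


# Hypothetical sample data
left = [
    {"id1": 1, "radius1": 3.10},
    {"id1": 2, "radius1": 1.77},
    {"id1": 3, "radius1": 2.50},
    {"id1": 4, "radius1": 1.80},
]
right = [
    {"id2": "a", "radius2": 1.78},
    {"id2": "b", "radius2": 3.00},
]

matches = range_join(left, right)
print(matches)
```

The result is the same set of pairs a Cartesian join followed by a filter would give, but the work grows with the number of actual matches plus a sort, not with the product of the two row counts.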





  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    If you are only interested in a casewise comparison of radius1 and radius2 values, then @BalazsBarany's method works equally well without the Cartesian join: just use Generate Attributes to calculate the difference and filter the examples that meet your threshold. But if you do want a pairwise comparison of all possible combinations of radius1 and radius2, I hope you have a small dataset! The combinations inflate pretty quickly :-)
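
As a rough sketch of the casewise variant in Python, assuming the two columns are already aligned row by row and have the same length: compute the difference per row and keep the rows under the threshold. The function name, the sample values, and the 2% relative threshold are assumptions for illustration.

```python
def casewise_matches(radius1, radius2, tol=0.02):
    """Compare two aligned columns row by row.

    Returns the indices of rows where |radius1 - radius2| is at most
    tol * |radius2|, i.e. radius1 is within +/- tol of radius2.
    """
    if len(radius1) != len(radius2):
        raise ValueError("casewise comparison needs columns of equal length")
    return [
        i
        for i, (r1, r2) in enumerate(zip(radius1, radius2))
        if abs(r1 - r2) <= tol * abs(r2)
    ]


# Hypothetical aligned columns
radius1 = [1.77, 2.50, 3.10]
radius2 = [1.78, 2.90, 3.12]

matching_rows = casewise_matches(radius1, radius2)
print(matching_rows)
```

Here each row is compared only with its counterpart, so the work grows linearly with the number of rows and no join is needed.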



    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts