[SOLVED] Standard Deviation

josh321 Member Posts: 9 Contributor II
edited November 2018 in Help
I'm aware of operators for average, min, max, etc., but I see none for standard deviation. I'm trying to filter data to include only values that are within 3 standard deviations of the mean for a given attribute. What is the best way to go about this in RapidMiner?

Thanks,
Josh

Best Answer

  • dan_agape Member Posts: 106 Maven
    Solution Accepted
    Hi Josh,

    Use Generate Attributes to make a copy of the given attribute (assume C is the new attribute), then use Normalize with the Z-transformation method to convert the values of C into z-scores, and then use Filter Examples to keep only the rows for which the value of C is between -3 and 3. Finally, you can discard the attribute C.

    Dan 
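
Outside RapidMiner, the same idea (z-transform a copy of the attribute, then keep rows whose z-score lies between -3 and 3) can be sketched in plain Python. This is only an illustration under stated assumptions: the function name `filter_within_k_sigma` is hypothetical, and the sample standard deviation (n - 1 in the denominator) is used here; the thread does not say which variant RapidMiner's Normalize operator uses.

```python
from statistics import mean, stdev


def filter_within_k_sigma(values, k=3.0):
    """Keep only the values whose z-score lies in [-k, k].

    Hypothetical helper for illustration; needs at least two values.
    Assumption: sample standard deviation (n - 1 denominator).
    """
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        # All values are identical, so none can be an outlier.
        return list(values)
    # (v - mu) / sigma is the z-score, i.e. the temporary attribute "C".
    return [v for v in values if -k <= (v - mu) / sigma <= k]


# Nineteen ordinary readings and one extreme value.
readings = [10.0] * 19 + [1000.0]
kept = filter_within_k_sigma(readings)
print(len(readings), "->", len(kept))
```

Note that with very few rows a single extreme value cannot reach a z-score of 3 at all, so the filter only starts to bite on reasonably sized data sets.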

Answers

  • fritmore Member Posts: 90 Contributor II
    Hi,
    You can use the Generate Attributes operator to implement the standard deviation formula and then use the Filter Examples operator.

    Or try the Weight by Deviation operator.
  • josh321 Member Posts: 9 Contributor II
    Hi. Thanks for the reply, but I'm not sure I understand. I've tried the weight by deviation operator, but it appears to weight entire attributes against the data set, rather than a sample vs the attribute mean. And I'm not sure how to implement a STD formula that results in a standard deviation.

    Thanks,
    Josh
  • josh321 Member Posts: 9 Contributor II
    That did the trick, thanks!
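
For reference, the "STD formula" mentioned in the answers above and the resulting filter condition can be written out as follows. This is a sketch that assumes the sample standard deviation (denominator n - 1); with the population variant the denominator would be n instead. The symbols x_i (the attribute's value in row i) and n (the number of rows) are notation introduced here, not taken from the thread.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2},
\qquad
z_i = \frac{x_i - \bar{x}}{s}

% Keep row i when its z-score is within 3 standard deviations:
-3 \le z_i \le 3
\quad\Longleftrightarrow\quad
\bar{x} - 3s \;\le\; x_i \;\le\; \bar{x} + 3s
```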