[SOLVED] Average/deviation by group

andreister Member Posts: 2 Contributor I
edited November 2018 in Help
I have a data set where one of the columns is a category. I want to calculate the mean and standard deviation of another column, separately for each category.

I.e., for input like

GroupName   Value
A           1
A           3
B           1
B           5

I want the output like

GroupName   Value   Mean   StdDev
A           1       2      1.41
A           3       2      1.41
B           1       3      2.83
B           5       3      2.83


I know how to get the group means and standard deviations via the Aggregate operator, but I don't know how to add the new columns to the original data set. What am I missing?

Thanks!

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Add an ID attribute before the Aggregate operator, then join the results of Aggregate to the original data using the Join operator and the id attribute.

    Best regards,
    Marius
  • andreister Member Posts: 2 Contributor I
    OK, great. The Join operator was the missing part.

    I set the join type to "inner" and selected my grouping attribute as the key for both the left and right inputs. Works like a charm.
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi, uncheck the "use id attribute as key" parameter of the Join operator and join by the group attributes instead.
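
For readers who want to check the expected numbers outside RapidMiner, here is a rough Python sketch of the same two steps: aggregate per group, then join the per-group statistics back onto every original row by the grouping attribute. It is only an illustration of the idea, not the RapidMiner process itself. The function name add_group_stats is made up for this example, and it assumes the sample standard deviation is wanted and that every group has at least two rows.

```python
from collections import defaultdict
from statistics import mean, stdev

# Sample data from the question.
rows = [
    {"GroupName": "A", "Value": 1},
    {"GroupName": "A", "Value": 3},
    {"GroupName": "B", "Value": 1},
    {"GroupName": "B", "Value": 5},
]


def add_group_stats(rows, key="GroupName", value="Value"):
    """Hypothetical helper: append per-group Mean and StdDev to each row."""
    # Step 1 (the "Aggregate" step): collect the values of each group.
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row[value])

    # One summary record per group. stdev() is the sample standard
    # deviation and raises an error for groups with fewer than two values.
    stats = {
        group: {"Mean": mean(values), "StdDev": stdev(values)}
        for group, values in grouped.items()
    }

    # Step 2 (the "Join" step): inner join on the grouping attribute,
    # so every original row picks up the statistics of its group.
    return [{**row, **stats[row[key]]} for row in rows]


result = add_group_stats(rows)
for row in result:
    print(row["GroupName"], row["Value"], row["Mean"], round(row["StdDev"], 2))
```

The join key here is the grouping attribute itself, which is why no separate ID column is needed: each group has exactly one row in the aggregated result, so the join cannot duplicate original rows.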