RapidMiner

[SOLVED] Average/deviation by group

Contributor

I have a data set where one of the columns is a category, and I want to calculate the mean and standard deviation of another column separately for each category.

I.e., for input like

GroupName   Value
A           1
A           3
B           1
B           5

I want output like

GroupName   Value   Mean   StdDev
A           1       2      1.41
A           3       2      1.41
B           1       3      2.83
B           5       3      2.83


I know how to get the group means and standard deviations via the Aggregate operator, but I don't know how to add the new columns to the original data set. What am I missing?

Thanks!
3 REPLIES
Super Contributor

Re: Average/deviation by group

Add an ID attribute before the Aggregate operator, then join the results of Aggregate to the original data using the Join operator and the id attribute.

Best regards,
Marius
Contributor

Re: Average/deviation by group

OK, great. The Join operator was the missing part.

I set the join type to "inner" and selected my grouping attribute as the key for both the left and right inputs; it works like a charm.
Super Contributor

Re: Average/deviation by group

Hi, uncheck the Join operator's "use id attribute as key" parameter and join by the group attribute instead.
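
For readers who want to see the logic outside the RapidMiner GUI, here is a minimal sketch of the same aggregate-then-join idea in plain Python. It is only an illustration, not the RapidMiner process itself: the variable names are made up, and it assumes the sample standard deviation (n - 1 in the denominator), which is what `statistics.stdev` computes and what the example numbers in the question suggest.

```python
from statistics import mean, stdev

# Example rows from the question: (GroupName, Value)
rows = [("A", 1), ("A", 3), ("B", 1), ("B", 5)]

# Step 1 (the "Aggregate" step): collect the values of each group,
# then compute one mean and one standard deviation per group.
values_by_group = {}
for group, value in rows:
    values_by_group.setdefault(group, []).append(value)

# Note: stdev() raises StatisticsError for a group with a single value.
stats_by_group = {
    group: (mean(values), stdev(values))  # sample std dev (assumption)
    for group, values in values_by_group.items()
}

# Step 2 (the "Join" step): inner join of the original rows with the
# per-group statistics, using the group attribute as the key.
joined = [
    (group, value, *stats_by_group[group])
    for group, value in rows
    if group in stats_by_group
]

for group, value, group_mean, group_std in joined:
    print(f"{group}  {value}  {group_mean}  {group_std:.2f}")
```

Each original row keeps its own Value and gains the Mean and StdDev of its group, which is the layout asked for in the first post.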