SOLVED
Contributor II

# Performing Math calculations on healthcare dataset

Hi

I am new to RM Studio. I am a physician and I have a dataset of diabetic and non-diabetic patients, with their ages, gender, etc. The other columns hold result values for several lab tests, such as fasting sugars.

Before I proceed to any data modelling, I need to run some basic Excel-style work on it, like the % of diabetics by gender, age and some lab tests. This is fairly easy to do in Excel, where I can run some formulas / pivot tables and get the values...

Is there any way I could do these calculations in RM Studio?

Thanks

Arsalan

2 ACCEPTED SOLUTIONS

Accepted Solutions
RMStaff
Solution
Accepted by topic author arsalan_karim
‎02-04-2017 01:07 PM

## Re: Performing Math calculations on healthcare dataset

Hi,

Sure. In your case you want to use an Aggregate operator to generate, e.g., count(diabetes) with group by Gender. You may want to combine it with Generate Attributes, which creates new columns through an Excel-like formula interface. Keep in mind that Generate Attributes evaluates formulas line by line and never mixes lines (called examples). If you need averages, standard deviations, etc. on a column basis, Aggregate is usually the right operator.
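For readers who think in code, the group-by idea behind Aggregate can be sketched in plain Python. This is only an analogy, not RapidMiner's API: the records, the column names (`gender`, `age`, `diabetic`) and the helper `percent_diabetic_by` are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical patient records; column names and values are illustrative only.
patients = [
    {"gender": "F", "age": 54, "diabetic": True},
    {"gender": "F", "age": 41, "diabetic": False},
    {"gender": "M", "age": 63, "diabetic": True},
    {"gender": "M", "age": 58, "diabetic": True},
    {"gender": "M", "age": 35, "diabetic": False},
    {"gender": "F", "age": 47, "diabetic": False},
]


def percent_diabetic_by(records, key):
    """Group records by `key` and return {group: percent of diabetics}."""
    totals = defaultdict(int)     # examples per group
    positives = defaultdict(int)  # diabetic examples per group
    for row in records:
        group = row[key]
        totals[group] += 1
        if row["diabetic"]:
            positives[group] += 1
    return {g: 100.0 * positives[g] / totals[g] for g in totals}


# Roughly what "count with group by Gender" gives you, expressed as a percentage.
by_gender = percent_diabetic_by(patients, "gender")
for group, pct in sorted(by_gender.items()):
    print(f"{group}: {pct:.1f}% diabetic")
```

The same helper works for any grouping column, which is the part that corresponds to picking the group-by attribute in the operator.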

~Martin

--------------------------------------------------------------------------
Head of Data Science Services at RapidMiner
Elite III
Solution
Accepted by topic author arsalan_karim
‎02-04-2017 01:08 PM

## Re: Performing Math calculations on healthcare dataset

As Martin said, Aggregate is useful when you want to calculate something for a "column" (an attribute in RapidMiner terms) across multiple "rows" (called examples in RapidMiner). Aggregate has built-in functions such as count (percentage), average, and other common calculations.

Generate Aggregation is another helpful operator if you want to do transformations across multiple attributes within the same example. It offers functions similar to those of Aggregate but works only within a single example.
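The difference between the two directions can be illustrated with a small Python sketch. It is an analogy only and does not use any RapidMiner API; the lab columns `fasting_1` and `fasting_2` and their values are invented for the example.

```python
from statistics import mean

# Hypothetical examples (rows) with two made-up lab-test attributes (columns).
rows = [
    {"id": 1, "fasting_1": 100.0, "fasting_2": 110.0},
    {"id": 2, "fasting_1": 130.0, "fasting_2": 150.0},
    {"id": 3, "fasting_1": 90.0, "fasting_2": 94.0},
]
lab_columns = ["fasting_1", "fasting_2"]

# Within each example (the Generate Aggregation direction):
# one new value per row, computed across several attributes.
row_means = [mean(r[c] for c in lab_columns) for r in rows]

# Across examples (the Aggregate direction):
# one value per attribute, computed over all rows.
column_means = {c: mean(r[c] for r in rows) for c in lab_columns}

print(row_means)
print(column_means)
```

The row-wise result has as many values as there are examples, while the column-wise result has one value per attribute.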

Brian T., Lindon Ventures - www.lindonventures.com
Analytics Consulting by Certified RapidMiner Analysts
4 REPLIES
Moderator

## Re: Performing Math calculations on healthcare dataset

There's also the Generate Attributes operator, which lets you perform simple and complex math operations and create new columns.

Contributor II

## Re: Performing Math calculations on healthcare dataset

Thanks. That was very helpful information.

It helped me solve the problem.

Arsalan