# Rule Induction Model Results

Hi,

I'm new to Rapid Miner and Data Mining in general. I'm using **Rapid Miner 5** and I'm having a problem interpreting the results of the **Rule Induction** Model. Below is a section of the results:

RuleModel

if marital-status = Never-married then <=50K **(1304 / 52)**

if education-num = 10.500 and sex = Female and relationship = Unmarried then <=50K (188 / 7)

if education-num = 10.500 and capital-gain = 4225 and relationship = Not-in-family then <=50K (267 / 27)

if marital-status = Married-civ-spouse and age > 27.500 then >50K **(47 / 65)**

I've highlighted 2 of them to better illustrate what I'm asking...

I'm basically just a little confused about what the numbers actually mean. I've noticed that when the result is **<=50K** the numbers are always **(high/low)** (like in the first highlighted result), and when the result is **>50K** the numbers are always **(low/high)** (second highlight).

At first I thought that maybe it could mean, for example with the first one, that people who are never married earn <=50K, and this was true for 52 of the 1304 people sampled... But that wouldn't make sense for the second one because the numbers are switched.

Any clarification on this would be greatly appreciated.

Thanks very much

Noel


## Answers

These numbers indicate the class distribution after applying this rule. So in your case the first number is the number of examples belonging to class <=50K and the second is the number belonging to >50K. We don't use a correct/wrong split here because, with more than two classes, that would not show all the information: sometimes it matters which classes get mixed up with one another.

Greetings,

Sebastian
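
To make the reading above concrete: under that interpretation, (1304 / 52) means the rule covers 1304 examples of class <=50K and 52 of class >50K, so it predicts <=50K, while (47 / 65) covers 47 of <=50K and 65 of >50K, so it predicts >50K. The Python sketch below is not Rapid Miner code; the function names and the toy rows are made up for illustration, and it assumes a rule simply predicts the majority class among the examples it covers.

```python
from collections import Counter

def class_distribution(examples, rule, label_key, classes):
    """Count, per class, the examples that satisfy the rule's condition."""
    covered = [ex for ex in examples if rule(ex)]
    counts = Counter(ex[label_key] for ex in covered)
    # Report counts in a fixed class order, like "(n_first / n_second)".
    return tuple(counts[c] for c in classes)

def predicted_class(distribution, classes):
    """Assumption: the rule predicts the class with the largest count."""
    best = max(range(len(classes)), key=lambda i: distribution[i])
    return classes[best]

classes = ("<=50K", ">50K")

# Toy rows (hypothetical, not taken from the actual data set).
examples = [
    {"marital-status": "Never-married", "income": "<=50K"},
    {"marital-status": "Never-married", "income": "<=50K"},
    {"marital-status": "Never-married", "income": ">50K"},
    {"marital-status": "Married-civ-spouse", "income": ">50K"},
]

def never_married(ex):
    return ex["marital-status"] == "Never-married"

dist = class_distribution(examples, never_married, "income", classes)
print(dist, "->", predicted_class(dist, classes))

# The two highlighted rules from the question, read the same way.
print(predicted_class((1304, 52), classes))
print(predicted_class((47, 65), classes))
```

Keeping one count per class, rather than a single correct/wrong pair, is what lets the same display work unchanged when there are three or more classes.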