Rule Induction Model Results

Panda_Pie Member Posts: 1 Contributor I
edited November 2018 in Help
Hi,

I'm new to RapidMiner and to data mining in general. I'm using RapidMiner 5 and I'm having trouble interpreting the results of the Rule Induction model. Below is a section of the results:


RuleModel
if marital-status = Never-married then <=50K  (1304 / 52)
if education-num = 10.500 and sex = Female and relationship = Unmarried then <=50K  (188 / 7)
if education-num = 10.500 and capital-gain = 4225 and relationship = Not-in-family then <=50K  (267 / 27)
if marital-status = Married-civ-spouse and age > 27.500 then >50K  (47 / 65)

I've highlighted 2 of them to better illustrate what I'm asking...
I'm a little confused about what the numbers in parentheses actually mean. I've noticed that when the result is <=50K the numbers are always (high / low), as in the first highlighted result, and when the result is >50K the numbers are always (low / high), as in the second.

At first I thought that it could mean, taking the first rule as an example, that people who have never married earn <=50K, and that this was true for 52 of the 1304 people sampled. But that wouldn't make sense for the second one, because the numbers are switched.

Any clarification on this would be greatly appreciated.

Thanks very much

Noel

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Noel,
    These numbers show the class distribution of the examples this rule applies to. In your case, the first number is the count of examples belonging to class <=50K and the second is the count belonging to >50K. We don't use a correct/wrong split here because, with more than two classes, that could not display all of the information: sometimes it matters which classes get mixed up with one another.

    Greetings,
      Sebastian
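
To make the reading above concrete, here is a minimal Python sketch that parses rule lines as pasted in the question and reports, per rule, how many examples fall in each class and what share matches the predicted label. It is not part of RapidMiner: the function name `summarize_rule`, the regular expression, and the fixed class order (<=50K first, >50K second, as described in the answer) are all assumptions made for illustration.

```python
import re

# Assumption: the counts are listed in this class order, as the answer describes.
CLASS_ORDER = ["<=50K", ">50K"]

# Hypothetical pattern for lines such as:
#   if marital-status = Never-married then <=50K  (1304 / 52)
RULE_PATTERN = re.compile(
    r"^if (?P<cond>.+) then (?P<label>\S+)\s+\((?P<counts>[\d\s/]+)\)$"
)


def summarize_rule(line, class_order=CLASS_ORDER):
    """Return the per-class counts of one rule and the share matching its label."""
    match = RULE_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"Unrecognized rule line: {line!r}")

    counts = [int(part) for part in match.group("counts").split("/")]
    if len(counts) != len(class_order):
        raise ValueError("Number of counts does not match the number of classes")

    distribution = dict(zip(class_order, counts))
    label = match.group("label")
    covered = sum(counts)            # all examples the rule applies to
    correct = distribution[label]    # examples whose class equals the predicted label

    return {
        "condition": match.group("cond"),
        "label": label,
        "distribution": distribution,
        "covered": covered,
        "correct": correct,
        "share_correct": correct / covered if covered else 0.0,
    }


rules = [
    "if marital-status = Never-married then <=50K  (1304 / 52)",
    "if marital-status = Married-civ-spouse and age > 27.500 then >50K  (47 / 65)",
]

for rule in rules:
    summary = summarize_rule(rule)
    print(
        f"{summary['label']}: {summary['correct']} of {summary['covered']} "
        f"covered examples ({summary['share_correct']:.1%})"
    )
```

Read this way, the first rule applies to 1356 examples, of which 1304 are <=50K, while the last rule applies to 112 examples, of which 65 are >50K. That is why the larger number switches sides: the position of each count is fixed by class, not by whether the prediction was right.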