
# "How To Interpret the Results of Create Association Rules"

MartinLiebig
## Question

The Create Association Rules operator computes various statistical measures for the rules it generates. What do they tell me?

## Answer

The most important criteria are already documented in the operator's help:

**confidence**: The confidence of a rule is defined as conf(X implies Y) = supp(X ∪ Y) / supp(X). Be careful when reading the expression: here supp(X ∪ Y) means "support for occurrences of transactions where X and Y both appear", not "support for occurrences of transactions where either X or Y appears". Confidence ranges from 0 to 1. Confidence is an estimate of Pr(Y | X), the probability of observing Y given X. The support supp(X) of an itemset X is defined as the proportion of transactions in the data set which contain the itemset.

**lift**: The lift of a rule is defined as lift(X implies Y) = supp(X ∪ Y) / (supp(X) × supp(Y)), that is, the ratio of the observed support to the support expected if X and Y were independent. Lift can also be written as lift(X implies Y) = conf(X implies Y) / supp(Y). Lift measures how far X and Y are from independence. It ranges from 0 to positive infinity. Values close to 1 imply that X and Y are independent, so the rule is not interesting.

**conviction**: Conviction is sensitive to rule direction, i.e. conv(X implies Y) is not the same as conv(Y implies X). Conviction is loosely inspired by the logical definition of implication and attempts to measure the degree of implication of a rule. Conviction is defined as conv(X implies Y) = (1 - supp(Y)) / (1 - conf(X implies Y)).
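To make the definitions concrete, here is a minimal Python sketch that computes support, confidence, lift and conviction by hand. It is not how the operator is implemented; the toy transactions and the function names are made up for illustration, and exact fractions are used only to keep the arithmetic easy to follow.

```python
from fractions import Fraction

# Hypothetical toy data, invented for this example (not from RapidMiner).
TRANSACTIONS = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"bread"},
    {"butter"},
]


def support(itemset, transactions=TRANSACTIONS):
    """Proportion of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))


def confidence(x, y, transactions=TRANSACTIONS):
    """conf(X implies Y) = supp(X u Y) / supp(X).

    supp(X u Y) counts transactions where X and Y BOTH appear.
    Raises ZeroDivisionError if X never occurs in the data.
    """
    both = set(x) | set(y)
    return support(both, transactions) / support(x, transactions)


def lift(x, y, transactions=TRANSACTIONS):
    """lift(X implies Y) = conf(X implies Y) / supp(Y)."""
    return confidence(x, y, transactions) / support(y, transactions)


def conviction(x, y, transactions=TRANSACTIONS):
    """conv(X implies Y) = (1 - supp(Y)) / (1 - conf(X implies Y))."""
    conf = confidence(x, y, transactions)
    if conf == 1:
        # The rule never fails in the data, so the ratio is unbounded.
        return float("inf")
    return (1 - support(y, transactions)) / (1 - conf)


if __name__ == "__main__":
    x, y = {"bread"}, {"butter"}
    print("conf(X implies Y):", confidence(x, y))
    print("lift(X implies Y):", lift(x, y))
    print("conv(X implies Y):", conviction(x, y))
    print("conv(Y implies X):", conviction(y, x))
```

With this toy data, lift is the same in both directions (5/6, slightly below 1), while conviction differs (4/5 for bread implies butter, 3/5 for butter implies bread), which is the direction sensitivity described above.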

There is a great paper available on http://www4.di.uminho.pt which explains all of these measures in depth. The metric called PS (for Piatetsky-Shapiro) is called leverage in that document.
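For reference, leverage is usually defined as the difference between the observed support and the support expected under independence: supp(X ∪ Y) - supp(X) × supp(Y). The post above does not spell out the formula, so treat the sketch below as an assumption about what the PS column contains; the function name and the support values are hypothetical.

```python
from fractions import Fraction


def leverage(supp_xy, supp_x, supp_y):
    """Leverage (PS): observed joint support minus the joint support
    expected if X and Y were independent. Assumed standard definition."""
    return supp_xy - supp_x * supp_y


if __name__ == "__main__":
    # Hypothetical supports: X in 4/5 of transactions, Y in 3/5, both in 2/5.
    print(leverage(Fraction(2, 5), Fraction(4, 5), Fraction(3, 5)))
    # Exactly independent itemsets give a leverage of 0.
    print(leverage(Fraction(1, 4), Fraction(1, 2), Fraction(1, 2)))
```

A value of 0 corresponds to independence (the same situation as lift = 1), positive values mean X and Y occur together more often than expected, and negative values mean less often.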

- Sr. Director Data Solutions, Altair RapidMiner -

Dortmund, Germany
