Help on How to Use FP-Growth
Can someone help me use the FP-Growth operator? I am new to RapidMiner and am trying to use it for some data mining work.
Here is the toy problem I used:
Transaction  Beef   Boots  Cheese  Chicken  Clothes  Milk
1            TRUE   FALSE  FALSE   TRUE     FALSE    TRUE
2            TRUE   FALSE  TRUE    FALSE    FALSE    FALSE
3            FALSE  TRUE   TRUE    FALSE    FALSE    FALSE
4            TRUE   FALSE  TRUE    TRUE     FALSE    FALSE
5            TRUE   FALSE  TRUE    TRUE     TRUE     TRUE
6            FALSE  FALSE  FALSE   TRUE     TRUE     TRUE
7            FALSE  FALSE  FALSE   TRUE     TRUE     TRUE
With the minimum support set to 0.3 (so an itemset must appear in at least 3 of the 7 transactions), I can easily find the following frequent itemsets by hand:
Itemset                 Count  Support
Beef                    4      0.57
Cheese                  4      0.57
Chicken                 5      0.71
Clothes                 3      0.43
Milk                    4      0.57
Beef, Cheese            3      0.43
Beef, Chicken           3      0.43
Chicken, Clothes        3      0.43
Chicken, Milk           4      0.57
Clothes, Milk           3      0.43
Chicken, Clothes, Milk  3      0.43
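To double-check the hand calculation outside RapidMiner, here is a minimal Python sketch. It is a brute-force count, not FP-Growth itself, and the names `transactions` and `frequent_itemsets` are just ones I made up for illustration. It assumes that TRUE means the item is present in the transaction.

```python
from itertools import combinations

# Toy data from the table above: the items marked TRUE in each transaction.
transactions = [
    {"Beef", "Chicken", "Milk"},
    {"Beef", "Cheese"},
    {"Boots", "Cheese"},
    {"Beef", "Cheese", "Chicken"},
    {"Beef", "Cheese", "Chicken", "Clothes", "Milk"},
    {"Chicken", "Clothes", "Milk"},
    {"Chicken", "Clothes", "Milk"},
]


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration (not FP-Growth): count every candidate itemset."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            # Number of transactions that contain every item of the candidate.
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count / n >= min_support:
                result[candidate] = count
                found = True
        if not found:
            # No frequent itemset of this size, so no larger one can be frequent.
            break
    return result


itemsets = frequent_itemsets(transactions, 0.3)
for itemset, count in itemsets.items():
    print(", ".join(itemset), count, round(count / len(transactions), 2))
```

This only works for a toy problem of this size, since it enumerates every combination, but it is enough to confirm the counts in the table.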
However, FP-Growth outputs:
Size  Support  Item1    Item2
1     0.571    Cheese
1     0.429    Milk
1     0.429    Clothes
1     0.429    Beef
2     0.429    Cheese   Milk
Both the support values and the itemsets differ from my hand calculation.
I only used two operators: one to retrieve the data from the repository (I checked its output and the data looks good), and FP-Growth with "Find min number of itemsets" unchecked and "min support" set to 0.3.
Are there other parameters I need to set? I would really appreciate your help!