Is association analysis (e.g. FP-Growth) also suited to doing calculations and discovering relationships on numerical (real) data columns, or only on categorical variables?
If it is, I'd like to know how to do so.
Take a look at the tutorial process for FP-Growth. It is not a perfect tutorial for explaining FP-Growth, but it happens to use the Iris data, which has only continuous numerical attributes. The tutorial uses Discretize by Frequency and then converts nominal to binominal before applying FP-Growth.
Keep in mind that all attributes of the input example set for FP-Growth are required to be binominal.
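To make that preprocessing concrete outside RapidMiner, here is a minimal Python sketch of the same idea: cut each numerical column into ranges holding roughly equal numbers of rows, then turn every range into its own true/false column. The function names, the `range1`/`range2` labels, and the tiny sample rows are my own assumptions for illustration; this is not the RapidMiner operators themselves, and their exact binning rules may differ.

```python
from bisect import bisect_right


def frequency_cut_points(values, n_bins):
    """Return n_bins - 1 cut points so each range holds roughly equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[k * n // n_bins] for k in range(1, n_bins)]


def to_binary_items(rows, n_bins=2):
    """Turn rows of numeric attributes into rows of true/false items.

    rows: list of dicts mapping attribute name -> numeric value.
    Each attribute becomes n_bins binary columns, exactly one of which
    is True per row (hypothetical naming: "<attribute>=range<k>").
    """
    attributes = sorted(rows[0])
    cuts = {
        a: frequency_cut_points([r[a] for r in rows], n_bins)
        for a in attributes
    }
    binary_rows = []
    for r in rows:
        flags = {}
        for a in attributes:
            # Index of the range this value falls into (ties stay together).
            bin_index = bisect_right(cuts[a], r[a])
            for k in range(n_bins):
                flags[f"{a}=range{k + 1}"] = (k == bin_index)
        binary_rows.append(flags)
    return binary_rows


# Made-up sample rows in the style of the Iris attributes.
rows = [
    {"petal_length": 1.4, "petal_width": 0.2},
    {"petal_length": 4.7, "petal_width": 1.4},
    {"petal_length": 5.1, "petal_width": 1.9},
    {"petal_length": 1.3, "petal_width": 0.3},
]
binary = to_binary_items(rows, n_bins=2)
```

Note that you are not limited to two ranges per attribute: with `n_bins=3` each numerical column simply becomes three binary columns.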
What is your use case or purpose for applying FP-Growth to real/continuous data columns?
OK, but I want to run FP-Growth with respect to my class values (1, 3 or 4): identify item sets that reach a certain support within a given label class.
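One way to read "support with respect to a label class" is to split the rows by class and measure support inside each group separately. The sketch below does that with plain brute-force counting of small item sets, not with FP-Growth itself, so it only illustrates the idea on small data. The function name, the `min_support` and `max_size` parameters, and the sample items are assumptions of mine.

```python
from collections import Counter, defaultdict
from itertools import combinations


def support_by_class(transactions, labels, min_support=0.5, max_size=2):
    """Return, per class label, the item sets whose support within that
    class is at least min_support.

    transactions: list of sets of items that are "true" for each row.
    labels: class value for each row, in the same order.
    """
    grouped = defaultdict(list)
    for items, label in zip(transactions, labels):
        grouped[label].append(frozenset(items))

    result = {}
    for label, group in grouped.items():
        counts = Counter()
        for items in group:
            # Brute force: count every item set up to max_size items.
            for size in range(1, max_size + 1):
                for combo in combinations(sorted(items), size):
                    counts[combo] += 1
        result[label] = {
            combo: count / len(group)
            for combo, count in counts.items()
            if count / len(group) >= min_support
        }
    return result


# Made-up discretized rows with class values 1 and 3.
transactions = [
    {"a=low", "b=high"},
    {"a=low", "b=low"},
    {"a=high", "b=high"},
    {"a=high", "b=high"},
]
labels = [1, 1, 3, 3]
frequent = support_by_class(transactions, labels, min_support=0.6)
```

In a real process the equivalent would presumably be to filter the data by class first and run FP-Growth once per subset.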
But I think it's probably not well suited to problems with 20+ numerical parameters. You would have to discretize them and then split them into binominal values, bigger or smaller than half of the interval, which probably doesn't make much sense.
FP-Growth is an algorithm initially designed for market basket analysis: either you buy a product or you don't. It is also used in other use cases, e.g. web page visits, but it is always "either you did it or you didn't". In fact, it also ignores whether you took an item once or twice. That is simply how the algorithm works and nothing RM-specific. That's why it can only run on binary data and not on numerical data.
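As a small illustration of the "bought it or not" view, the sketch below collapses purchase quantities into presence flags and then computes support, i.e. the share of baskets containing every item of a set. The basket data and the helper names are invented for this example.

```python
def to_presence(basket_counts):
    """Collapse quantities to bought / not bought; buying twice equals once."""
    return {item: quantity > 0 for item, quantity in basket_counts.items()}


def support(baskets, itemset):
    """Fraction of baskets in which every item of itemset is present."""
    hits = sum(
        all(basket.get(item, False) for item in itemset)
        for basket in baskets
    )
    return hits / len(baskets)


# Made-up baskets with quantities per product.
raw_baskets = [
    {"milk": 2, "bread": 1},
    {"milk": 1, "bread": 0},
    {"milk": 0, "bread": 3},
    {"milk": 5, "bread": 1},
]
baskets = [to_presence(b) for b in raw_baskets]

milk_support = support(baskets, {"milk"})
milk_and_bread_support = support(baskets, {"milk", "bread"})
```

The basket with five cartons of milk counts exactly as much as the one with a single carton, which is the information loss described above.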