
Association Analysis for numerical real data?

Fred12 Member Posts: 344 Unicorn
edited February 2020 in Help

Hi,

Is association analysis (e.g. FP-Growth) also suited to discovering relationships in numerical (real-valued) data columns, or only in categorical variables?

If it is, I'd like to know how to do so.

Answers

  • yyhuang Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 364 RM Data Scientist

    Hi Fred,

     

    Take a look at the tutorial process for FP-Growth. It is not a perfect tutorial for explaining FP-Growth, but it happens to use the Iris data, which has only continuous numerical attributes. The tutorial applies Discretize by Frequency and then converts nominal to binominal before applying FP-Growth.

    Keep in mind that all attributes of the input example set for FP-Growth are required to be binominal.
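
    Outside RapidMiner, the same preprocessing idea can be sketched in plain Python. This is only an illustration under stated assumptions: the column names and values are made up (they are not the Iris data), the helper names are hypothetical, the binning is a simple rank-based equal-frequency split that may treat ties differently from Discretize by Frequency, and the itemset search is brute-force support counting standing in for FP-Growth rather than an implementation of it.

```python
from itertools import combinations

def discretize_by_frequency(values, n_bins):
    """Assign each value a 0-based bin index so bins hold roughly equal counts.

    Rank-based: tied values can end up in different bins.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = min(rank * n_bins // len(values), n_bins - 1)
    return bins

def to_transactions(columns, n_bins=2):
    """Turn numeric columns into one set of 'name=binK' items per row.

    Each item is a yes/no fact about the row, i.e. a binominal attribute.
    """
    n_rows = len(next(iter(columns.values())))
    rows = [set() for _ in range(n_rows)]
    for name, values in columns.items():
        for row, b in zip(rows, discretize_by_frequency(values, n_bins)):
            row.add(f"{name}=bin{b}")
    return rows

def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force support counting (a stand-in for FP-Growth, not FP-Growth)."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            hits = sum(1 for t in transactions if set(candidate) <= t)
            support = hits / len(transactions)
            if support >= min_support:
                result[candidate] = support
    return result

# Made-up numeric columns, for illustration only.
columns = {
    "petal_length": [1.4, 1.3, 4.7, 4.5, 6.0, 5.1],
    "petal_width": [0.2, 0.3, 1.4, 1.5, 2.5, 1.9],
}
transactions = to_transactions(columns, n_bins=2)
itemsets = frequent_itemsets(transactions, min_support=0.3)
for itemset, support in itemsets.items():
    print(itemset, round(support, 3))
```

    The point of the sketch is the shape of the data: after binning, every row is just a set of items that are either present or absent, which is the only kind of input a frequent-itemset miner works on.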

    What is your use case for applying FP-Growth to real/continuous data columns?

  • Fred12 Member Posts: 344 Unicorn

    OK, but I want to apply FP-Growth with respect to my class values (1, 3 or 4): identify item sets that reach a certain support with regard to a given label class.

     

    But I think it's probably not well suited to problems with 20+ numerical parameters. You would have to discretize them and then split them into binominal values above or below half of the interval, which probably doesn't make much sense.
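
    One way to read "item sets with a certain support in regard to a label class" is as rules of the form itemset -> class, scored by support and confidence. The following is a minimal sketch of that reading, not a RapidMiner process: the function name, the item names, and the toy data are all hypothetical, the class values 1, 3 and 4 are taken from the post above, and the counting is brute force rather than FP-Growth.

```python
from collections import Counter
from itertools import combinations

def class_rules(transactions, labels, min_support, min_confidence, max_size=2):
    """Return (itemset, label, support, confidence) for rules itemset -> label.

    support    = rows containing the itemset AND having the label / all rows
    confidence = rows containing the itemset AND having the label
                 / rows containing the itemset
    """
    n = len(transactions)
    itemset_counts = Counter()
    joint_counts = Counter()
    for items, label in zip(transactions, labels):
        for size in range(1, max_size + 1):
            for itemset in combinations(sorted(items), size):
                itemset_counts[itemset] += 1
                joint_counts[itemset, label] += 1
    rules = []
    for (itemset, label), joint in joint_counts.items():
        support = joint / n
        confidence = joint / itemset_counts[itemset]
        if support >= min_support and confidence >= min_confidence:
            rules.append((itemset, label, support, confidence))
    # Highest confidence first, then highest support, then itemset name.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0]))

# Toy rows, already discretized into items; labels are the class values.
transactions = [
    {"a=low", "b=low"},
    {"a=low", "b=low"},
    {"a=low", "b=high"},
    {"a=high", "b=high"},
    {"a=high", "b=high"},
    {"a=high", "b=low"},
]
labels = [1, 1, 3, 4, 4, 3]

rules = class_rules(transactions, labels, min_support=0.3, min_confidence=0.9)
for itemset, label, support, confidence in rules:
    print(itemset, "->", label, round(support, 3), round(confidence, 3))
```

    With 20+ discretized attributes the number of candidate item sets grows quickly, so `max_size` and the support threshold would need to be kept tight in a sketch like this.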

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,529 RM Data Scientist

    Fred,

     

    FP-Growth is an algorithm initially designed for market basket analysis: either you buy a product or you don't. It is also used in other use cases, e.g. web page visits, but it's always an "either you did it or not" setting. In fact, it also ignores whether you took an item twice. That's simply how the algorithm works and is nothing RM-specific. That's why it can only run on binary data and not on numerical data.
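
    The "did it or not" view can be illustrated with a tiny sketch. The basket contents and the helper name are invented for illustration; the only point is that quantities disappear once a basket is reduced to the set of items it contains.

```python
def to_binary_basket(quantities):
    """Keep only which items occur in the basket; quantities are discarded."""
    return {item for item, qty in quantities.items() if qty > 0}

# Two hypothetical baskets with different quantities of the same products.
basket_a = {"milk": 1, "bread": 2}
basket_b = {"milk": 3, "bread": 1, "eggs": 0}

binary_a = to_binary_basket(basket_a)
binary_b = to_binary_basket(basket_b)
print(binary_a == binary_b)
```

    Both baskets reduce to the same item set, so an itemset miner cannot tell them apart.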

     

    ~Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany