[SOLVED] Transactional2Basket Problem
Hello there friends!
I'm learning about Market Basket Analysis, and I came across the Transactional2Basket Preprocessing sample process. The first thing I noticed when I ran it (with the default data set) was the following line:
WARNING: FPGrowth: Removed 1 non-binominal attribute, frequent item set mining is only supported for the positive values of binominal attributes.
I ignored it for the time being, assuming the data is correct because it came with the software.
To make things clear, the default data (Market-Data) looks like this:
Row TID ITEM
1 1.0 1.0
2 1.0 2.0
3 1.0 3.0
4 2.0 1.0
5 3.0 4.0
6 3.0 5.0
7 3.0 6.0
Where TID is Transaction ID, and ITEM is the Item ID
Now my problem is the following:
According to the results, Items 2 and 3 have a support of 0.667, and Item 1 has only 0.333.
Now correct me if I am wrong, but looking at the data, Item 1 clearly has the higher support (it appears in 2 of the 3 transactions), whereas Items 2 and 3 each have a support of only 0.333.
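As a quick sanity check of those figures, here is a minimal Python sketch that groups the Market-Data rows by TID and counts, for each item, the fraction of transactions containing it. The function name `item_support` is made up for illustration; this is plain Python, not anything RapidMiner runs.

```python
from collections import defaultdict

# (TID, ITEM) rows copied from the Market-Data sample shown above
rows = [(1, 1), (1, 2), (1, 3), (2, 1), (3, 4), (3, 5), (3, 6)]

def item_support(rows):
    """Return {item: fraction of transactions that contain the item}."""
    baskets = defaultdict(set)
    for tid, item in rows:
        baskets[tid].add(item)          # one basket (set of items) per transaction
    n = len(baskets)                    # number of distinct transactions
    items = sorted({item for _, item in rows})
    return {item: sum(item in b for b in baskets.values()) / n for item in items}

support = item_support(rows)
for item, s in support.items():
    print(f"Item {item}: support = {s:.3f}")
```

With three transactions in total, Item 1 is in two of them and every other item is in exactly one.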
Thank you for your time. Any input would be appreciated.
~Dr. Chen
Answers
Thanks for the hint; I've fixed it in the current SVN version. The example process isn't entirely correct: the FPGrowth operator is missing a value for the 'positive value' parameter.
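To illustrate why the 'positive value' setting matters, here is a small Python sketch (not the RapidMiner process itself). It pivots the Market-Data rows into one true/false column per item and computes support twice: once counting `True` as the positive value and once counting `False`. That a missing parameter makes the wrong value count as positive is only an assumption about the mechanism, and the helper name `support_for` is hypothetical, but the second run does reproduce the numbers reported in the question.

```python
# (TID, ITEM) rows from the Market-Data sample in the question
rows = [(1, 1), (1, 2), (1, 3), (2, 1), (3, 4), (3, 5), (3, 6)]

tids = sorted({tid for tid, _ in rows})
items = sorted({item for _, item in rows})
pairs = set(rows)

# Basket table: one row per transaction, one boolean column per item
basket = {tid: {item: (tid, item) in pairs for item in items} for tid in tids}

def support_for(positive_value):
    """Share of transactions whose flag for each item equals positive_value."""
    return {
        item: sum(basket[tid][item] == positive_value for tid in tids) / len(tids)
        for item in items
    }

intended = support_for(True)    # "item is present" counts as positive
inverted = support_for(False)   # assumption: "item is absent" counted as positive

for item in (1, 2, 3):
    print(f"Item {item}: intended = {intended[item]:.3f}, inverted = {inverted[item]:.3f}")
```

In the inverted run Item 1 gets 0.333 and Items 2 and 3 get 0.667, which matches the result that prompted the question.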
Here is the correct process:
Best,
Nils
Now the answer makes a lot more sense.
~Dr. Chen