Market Basket Analysis
I have a very simple Excel file with 2 columns:
Invoice #
Item #
About 12k entries
I want to know what the most commonly purchased products are..
If Product A is always sold with Product B.. we can make a package deal.
Thoughts ?
Best Answer
Spcalan14 Member Posts: 16 Learner I
Ah ok.. thank you very much for your help!
My first time posting and you saved the day!
Answers
I have opened the Market Basket Analysis template, imported data, and I get the Association rules output..
Largest support is 0.026 for Product 12 and Product 15..
29 sets... Support = 0.047 ( highest )...
So does this mean Product 12 and Product 15 are most commonly purchased together and there are 29 sets to confirm ?
Can't be... Item 15 was only purchased 1x.. and Product 12 was purchased 32 times...
Yes, you can take a look at the process template called "Market Basket Analysis", which includes the following 2 operators:
- FP-Growth
- Create Association Rules
Hope this helps,
Regards,
Lionel
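For anyone who wants to sanity-check the numbers outside RapidMiner, here is a minimal Python sketch of the first step: turning the two-column sheet (one row per invoice/item) into one basket per invoice and counting how many invoices contain each item. The rows below are made-up placeholders, not the poster's data, and the function names are hypothetical. This only illustrates the data shape and is not what the FP-Growth operator does internally.

```python
from collections import Counter, defaultdict

# Hypothetical rows standing in for the two-column sheet: (invoice number, item number).
rows = [
    ("INV-1", "A"), ("INV-1", "B"),
    ("INV-2", "A"), ("INV-2", "B"), ("INV-2", "C"),
    ("INV-3", "A"),
    ("INV-4", "C"),
]

def build_baskets(rows):
    """Group line items into one set of distinct items per invoice."""
    baskets = defaultdict(set)
    for invoice, item in rows:
        baskets[invoice].add(item)
    return dict(baskets)

def item_frequency(baskets):
    """Count how many invoices contain each item."""
    counts = Counter()
    for items in baskets.values():
        counts.update(items)
    return counts

baskets = build_baskets(rows)
counts = item_frequency(baskets)
print(len(baskets), "invoices")
print(counts.most_common(3))
```

Note that the number of baskets (invoices) is usually much smaller than the number of rows, because every invoice with several items spans several rows.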
But how do I interpret the results ?
of the 12k data points... only make up 1084...
I don't want to predict.. I just want to know what is my 2 most commonly purchased items on the same invoice
https://academy.rapidminer.com/learn/article/cross-selling-do-you-want-fries-with-that
https://academy.rapidminer.com/learn/video/text-association-rules
I would assume that these would be in the mix..
The "support" of a rule X → Y is the proportion of all transactions T that contain both X and Y.
So to find "the 2 most commonly purchased items on the same invoice" you have to find the association with the highest value of "support" (for that, you can sort the results of the Create Association Rules operator).
Regards,
Lionel
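The support definition above can be checked by hand with a short Python sketch. It counts, for every pair of items, the number of invoices containing both, and divides by the total number of invoices. The rows are made-up placeholders and `pair_support` is a hypothetical helper name; this is a brute-force illustration, not the algorithm RapidMiner uses.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical (invoice number, item number) rows, not real data.
rows = [
    ("INV-1", "A"), ("INV-1", "B"),
    ("INV-2", "A"), ("INV-2", "B"), ("INV-2", "C"),
    ("INV-3", "A"),
    ("INV-4", "C"),
]

def pair_support(rows):
    """Return {(item_x, item_y): (count, support)} for every pair seen on one invoice."""
    baskets = defaultdict(set)
    for invoice, item in rows:
        baskets[invoice].add(item)

    pair_counts = Counter()
    for items in baskets.values():
        # sorted() makes ("A", "B") and ("B", "A") the same key
        for pair in combinations(sorted(items), 2):
            pair_counts[pair] += 1

    n_invoices = len(baskets)
    return {pair: (count, count / n_invoices) for pair, count in pair_counts.items()}

result = pair_support(rows)
best_pair = max(result, key=lambda pair: result[pair][1])
print(best_pair, result[best_pair])
```

Since support is just the pair count divided by the number of invoices, multiplying a reported support by the number of invoices gives back the number of invoices that contain that pair.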
My descriptions are "CC-TT", and "CC-TTG".. not numbers..
How can I see the actual Product Description ( instead of Product 1, 2, 3 ) ?
1. Go to the results of "Association Rules" generated by the operator Create Association Rules.
2. Sort the table by descending order of "support" by clicking on the name of the column "Support"
3. The first row (Premises and Conclusion) indicates the "2 most commonly purchased items on the same invoice"
Regards,
Lionel
Product 12 and Product 15 ?
But what is Product 12 and Product 15 ?
I need my product names...
My screenshots come from the RapidMiner template, which uses fictitious example data, not your own data...
As said, run the process with your own data, go to the Association Rules results, and you will see the "2 most commonly purchased items on the same invoice" for your own data....
If you are lost after this explanation, please share your data...
Regards,
Lionel
See below...
This makes MUCH more sense ( considering this is my data )..
LGL-TEE and CL-TEE makes much more sense....
But since I have 4 groups with the same Support (0.017).. they are even..
What does the first column represent ?
If I go by the first column..
then STP-Tee and CP-TEE have a value of (59)...
Does that mean there were 59 instances of that specific bundle ?
Yes, Support is the key indicator.
I must admit that I don't know what the first column represents...
Regards,
Lionel
After reflection, the first column is a kind of "Id": the number of the association rule...
By playing with the "Min. Criterion Value", you will see that more or fewer association rules are shown:
Regards,
Lionel
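To illustrate why moving a minimum threshold changes how many rules are listed, here is a small Python sketch. It assumes the criterion is confidence (the thread does not say which criterion is selected), where confidence(X → Y) is the share of invoices containing X that also contain Y. The baskets, candidate rules, and function names are hypothetical.

```python
# Hypothetical baskets: one set of items per invoice.
baskets = [{"A", "B"}, {"A", "B", "C"}, {"A"}, {"C"}]

def confidence(baskets, premise, conclusion):
    """Share of baskets containing the premise that also contain the conclusion."""
    premise, conclusion = set(premise), set(conclusion)
    with_premise = [b for b in baskets if premise <= b]
    if not with_premise:
        return 0.0
    both = [b for b in with_premise if conclusion <= b]
    return len(both) / len(with_premise)

def rules_above(baskets, candidate_rules, min_confidence):
    """Keep only the rules whose confidence reaches the threshold."""
    return [
        (premise, conclusion)
        for premise, conclusion in candidate_rules
        if confidence(baskets, premise, conclusion) >= min_confidence
    ]

# Candidate rules as (premise, conclusion) tuples, chosen for illustration.
candidates = [(("A",), ("B",)), (("B",), ("A",)), (("C",), ("A",)), (("A",), ("C",))]

for threshold in (0.8, 0.5, 0.3):
    kept = rules_above(baskets, candidates, threshold)
    print(threshold, len(kept), kept)
```

Lowering the threshold keeps more rules and raising it keeps fewer. The support of each rule is unaffected by this filter.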