"Market-Basket-Wrong-Approach?"
SunnyLotusFlowe
Member Posts: 37 Contributor II
Hello everyone,
I want to use a concept hierarchy for a data set like this:
C-ID | Product
001 | apple
002 | Banana
001 | Banana
003 | Chair
_____________________
I want to map all apple and Banana values to Fruit and insert them as new transaction examples:
C-ID | Product
001 | apple
002 | Banana
001 | Banana
003 | Chair
001 | Fruit
002 | Fruit
001 | Fruit
Is this approach wrong? Will the support values of the rules stay the same?
After pivoting this I get the following:
C-ID | apple | Banana | Fruit | Chair
001 | 1 | 1 | 2 | 0
002 | 0 | 1 | 1 | 0
003 | 0 | 0 | 0 | 1
Is it a problem that the third column has a '2'? If I now apply Numerical to Binominal, I think I will lose information,
because the '2' will be mapped to true.
My two questions are:
1. If I mine the table above, will I get something that is wrong because of the value '2'?
2. Is my approach for the concept hierarchy the right one?
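To make the '2' concrete, here is a minimal sketch of the pivot-and-binarize step in plain Python. It is not RapidMiner's implementation; the helper name support is hypothetical, and it assumes the usual definition of support as the fraction of customers (baskets) that contain an item, regardless of how often it occurs.

```python
from collections import Counter, defaultdict
from fractions import Fraction

# Rows from the post: (customer id, product), including the added "Fruit" rows.
rows = [
    ("001", "apple"), ("002", "Banana"), ("001", "Banana"), ("003", "Chair"),
    ("001", "Fruit"), ("002", "Fruit"), ("001", "Fruit"),
]

# Pivot: count how often each product occurs per customer.
counts = defaultdict(Counter)
for cid, product in rows:
    counts[cid][product] += 1

# Binarize: any count above zero means "present", so the 2 collapses to True.
baskets = {cid: {p for p, n in c.items() if n > 0} for cid, c in counts.items()}

def support(itemset):
    """Fraction of baskets that contain every item in itemset (hypothetical helper)."""
    hits = sum(1 for items in baskets.values() if set(itemset) <= items)
    return Fraction(hits, len(baskets))

print(counts["001"]["Fruit"])  # 2
print(support({"Fruit"}))      # 2/3
```

Under that assumption the binarization only drops the quantity: customer 001 still counts once towards the support of Fruit, whether the pivoted cell holds 1 or 2.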
greetings
Lotus
Answers
Do I 'deform' the data by adding further customer IDs to it?
I think I change the support of the rules that I extract later...
Are there any standard solutions?
But if I understand you correctly, there is no simple trick possible with the available operators...?
All I want to add is: if you are clear about the idea of what you want to achieve (in data analysis) with RapidMiner I am giving you my word that you can achieve it with the available operators. I would place a bet here: tell me your analysis problem and I am betting that I will be able to create the process for this.
Cheers,
Ingo
What do you expect from us? To read all the papers and provide you with condensed knowledge about this topic for every single question you have? Sorry, but I really like Hadock's answers here: you are writing some kind of report (that you have to do?) and you are not willing to dive deep enough into the theoretical details or into the software you want to report about? Instead you rely entirely on the answers of the community, which might be correct. Or wrong. Maybe.
________________________________________________________________________
Well, I try my best to understand the concepts, but if I don't know what to do I ask here in the forum. If some nice person answers me, I try their idea, and if it works for my purpose I write it down. This is, by the way, one of the last questions.
I do think about it on my own, but I get confused from time to time and therefore ask.
___________________________________________________________________________________
In this topic I just wanted to see whether I have the right idea for setting up a concept hierarchy in market basket analysis. But I got confused about whether my idea destroys the context of the underlying data. I can extract some association rules with my 'transformation' (= adding generalized items), but I'm not 100% sure that values like the support are correct.
greetings
Lotus
_____________________________________________________________________
But I built a workflow (to answer one of my questions):
I used one data set like
cardid,Product
001,apple
001,banana
002,banana
and the other
cardid,Product
001,apple
001,banana
002,banana
001,fruit
002,fruit
to check values such as support and confidence.
In both results there were the following association rules:
apple -> banana
banana -> apple
Both rules got the same values in both results. I mean that with the first data set the rule 'apple -> banana' got the same values as with the other data set, and vice versa.
Therefore I conclude that inserting generalized transaction examples won't have an effect on the values of association rules that do not contain the newly inserted items...
Again, sorry for the trouble...
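The check described above can be reproduced outside RapidMiner with a small sketch. It assumes that all rows with the same cardid form one basket; the helper names baskets_from, support and confidence are hypothetical and not RapidMiner operators.

```python
from collections import defaultdict
from fractions import Fraction

def baskets_from(rows):
    """Group (cardid, product) rows into one item set per cardid."""
    grouped = defaultdict(set)
    for cardid, product in rows:
        grouped[cardid].add(product)
    return list(grouped.values())

def support(baskets, itemset):
    """Fraction of baskets containing every item of itemset."""
    return Fraction(sum(1 for b in baskets if itemset <= b), len(baskets))

def confidence(baskets, premise, conclusion):
    """support(premise and conclusion) divided by support(premise)."""
    return support(baskets, premise | conclusion) / support(baskets, premise)

plain = [("001", "apple"), ("001", "banana"), ("002", "banana")]
extended = plain + [("001", "fruit"), ("002", "fruit")]

for name, rows in (("plain", plain), ("extended", extended)):
    b = baskets_from(rows)
    print(name,
          support(b, {"apple", "banana"}),          # support of both rules
          confidence(b, {"apple"}, {"banana"}),     # apple -> banana
          confidence(b, {"banana"}, {"apple"}))     # banana -> apple
```

With baskets grouped by cardid, both data sets give support 1/2, confidence 1 for apple -> banana and confidence 1/2 for banana -> apple, because the fruit rows add items to existing baskets without adding new baskets.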
greetings
Lotus
no need to be sorry. I think the bottom line here is that the way you are asking combined with the amount of questions will not really motivate people.
Back to the topic: I would expect that adding those artificial transactions would actually change the support values. Look at your example: if every row is counted as a transaction, the support of "apple" is 1/3 for the first data set and 1/5 for the second. So the question arises why you don't analyse both problems separately: first based on the items and then based on their groups. In this case, however, you would not get rules mixing items and groups. If this is necessary, other actions, like adding those artificial transactions, might be the best you can do.
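Whether the support changes depends on what is counted as a transaction. The sketch below compares two readings of the same example data; the function names row_support and basket_support are hypothetical, and which reading applies in a given process depends on how the data is grouped before mining.

```python
from fractions import Fraction

plain = [("001", "apple"), ("001", "banana"), ("002", "banana")]
extended = plain + [("001", "fruit"), ("002", "fruit")]

def row_support(rows, item):
    """Support when every row is treated as its own transaction."""
    return Fraction(sum(1 for _, p in rows if p == item), len(rows))

def basket_support(rows, item):
    """Support when all rows sharing a cardid form one transaction."""
    all_ids = {cid for cid, _ in rows}
    ids_with_item = {cid for cid, p in rows if p == item}
    return Fraction(len(ids_with_item), len(all_ids))

print(row_support(plain, "apple"), row_support(extended, "apple"))        # 1/3 1/5
print(basket_support(plain, "apple"), basket_support(extended, "apple"))  # 1/2 1/2
```

Counted per row, the added fruit rows enlarge the denominator and the support of apple drops from 1/3 to 1/5. Counted per cardid, the number of baskets stays at two and the support of apple stays at 1/2.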
Cheers,
Ingo