Market Basket Analysis - Operators to change data layout

ankranock Member Posts: 2 Contributor I
edited November 2018 in Help

Hi,

 

I am new to RapidMiner and am trying to build a market basket analysis model using the FP-Growth operator and the Create Association Rules operator.

I am reading a CSV file into RapidMiner. It looks like Figure 1 below, with attributes going out to att32 and 9,835 observations. Each row represents a transaction.

Figure 1

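| Row | att1             | att2       | att3           | att4                     | att5             |
|-----|------------------|------------|----------------|--------------------------|------------------|
| 1   | tropical fruit   | yogurt     | coffee         | ?                        | ?                |
| 2   | whole milk       | ?          | ?              | ?                        | ?                |
| 3   | pip fruit        | yogurt     | cream cheese   | meat spreads             | ?                |
| 4   | other vegetables | whole milk | condensed milk | long life bakery product | ?                |
| 5   | whole milk       | butter     | yogurt         | rice                     | abrasive cleaner |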

 

 

I believe the FP-Growth operator expects an example set like Figure 2 below. Each id column corresponds to an item. The full table expands to 167 items and 43,367 rows.

 

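Figure 2

| Row | id_1.0 | id_10.0 | id_11.0 | id_12.0 | id_13.0 | id_14.0 | Tran_ID |
|-----|--------|---------|---------|---------|---------|---------|---------|
| 1   | true   | false   | false   | false   | false   | false   | 1       |
| 2   | false  | false   | true    | false   | false   | false   | 1       |
| 3   | false  | false   | false   | false   | true    | false   | 2       |
| 4   | false  | true    | false   | false   | false   | false   | 2       |
| 5   | false  | false   | false   | false   | false   | true    | 2       |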

 

Are there operators in RapidMiner that can transform the data from the layout in Figure 1 into something like Figure 2 that the FP-Growth operator will accept? Doing it with item names instead of item numbers would be even better. So far I have had to transform the data layout outside of RapidMiner to make it work.

 

Thanks for any help or guidance you can provide.

Answers

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,316  RM Data Scientist

    Mh,

     

    Good question! I would try De-Pivot on att.* to get a SQL-style long table, and then use Pivot and Numerical to Binominal to get the layout you want. I'm not sure it really works, but I think it does.

     

    ~Martin

    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • ankranock Member Posts: 2 Contributor I

    Martin,

     

    I will play with the operators you mentioned and see if I can get them to work. Thanks for the feedback.

     

    --Alan--
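
For anyone who wants to see the reshaping logic outside of RapidMiner, here is a minimal plain-Python sketch of the idea suggested above: de-pivot the att columns into (transaction, item) pairs, then pivot into one true/false column per item name. It is not a RapidMiner process and does not call any RapidMiner API. The function names `de_pivot` and `pivot_to_boolean` are made up for illustration, the sample rows are the five shown in Figure 1, and "?" is assumed to mark a missing value. It also assumes the target is one row per transaction, which differs from Figure 2, where a transaction appears to span several rows.

```python
# Sample rows copied from Figure 1; "?" is assumed to mean "missing".
MISSING = "?"
baskets = [
    ["tropical fruit", "yogurt", "coffee", "?", "?"],
    ["whole milk", "?", "?", "?", "?"],
    ["pip fruit", "yogurt", "cream cheese", "meat spreads", "?"],
    ["other vegetables", "whole milk", "condensed milk",
     "long life bakery product", "?"],
    ["whole milk", "butter", "yogurt", "rice", "abrasive cleaner"],
]


def de_pivot(rows):
    """Long format: one (transaction id, item) pair per non-missing cell."""
    return [
        (tran_id, item)
        for tran_id, row in enumerate(rows, start=1)
        for item in row
        if item != MISSING
    ]


def pivot_to_boolean(pairs):
    """Wide format: one row per transaction, one true/false column per item."""
    items = sorted({item for _, item in pairs})
    bought = {}
    for tran_id, item in pairs:
        bought.setdefault(tran_id, set()).add(item)
    table = [
        {"Tran_ID": tran_id, **{item: item in bought[tran_id] for item in items}}
        for tran_id in sorted(bought)
    ]
    return items, table


long_table = de_pivot(baskets)
items, wide_table = pivot_to_boolean(long_table)

# Show which items are flagged true for the first transaction.
print([name for name in items if wide_table[0][name]])
```

Because the columns are keyed by item name rather than a numeric id, this also covers the wish to keep item names in the final table.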
