
Aggregate binomial columns

Lewisham Member Posts: 11 Contributor II
edited September 2019 in Help
Hello all,
I have a dataset that looks like:

User | Item
-------------
1 | Cheese
1 | Bread
2 | Milk

I'd like to mine the frequent item sets from this data. The first thing I did was feed this to "Nominal to Binomial", which seems to work as expected, e.g.:

User | Cheese | Bread | Milk
------------------------------------------------------------
1 | true | false | false
1 | false | true | false
2 | false | false | true

What I now need to do is aggregate by user ID to generate:

User | Cheese | Bread | Milk
------------------------------------------------------------
1 | true | true | false
2 | false | false | true
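
For illustration outside RapidMiner, the aggregation described here amounts to a logical OR over each user's rows. A minimal pandas sketch, assuming the one-hot table above is available as a DataFrame (the variable names `one_hot` and `baskets` are made up for this example):

```python
import pandas as pd

# One row per (user, item) after one-hot encoding, as in the second table above.
one_hot = pd.DataFrame(
    {
        "User": [1, 1, 2],
        "Cheese": [True, False, False],
        "Bread": [False, True, False],
        "Milk": [False, False, True],
    }
)

# Logical OR within each user: a flag is True if any of that user's rows is True.
baskets = one_hot.groupby("User").any()
print(baskets)
```

This only shows the intended result; it is not how the Aggregate operator behaves.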

I thought I could do this with the Aggregate operator, but that operator seems completely blind to the binomial columns; I can't find any way of selecting them.

What should I be doing here?

Thank you! :)

Answers

  • Rene Member Posts: 24 Contributor II
    The "Pivot" operator (group='User', index='Item') does this.
    (I think there's an FP-Growth example somewhere in the RapidMiner tutorials which shows this.)

    Greets,
    René
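
The pivot idea can be sketched in pandas as a rough analogue, starting from the original long User | Item table rather than the one-hot one. This is not the RapidMiner Pivot operator itself; `transactions` and `pivoted` are hypothetical names for this example:

```python
import pandas as pd

# The original long-format table: one row per (user, item) pair.
transactions = pd.DataFrame(
    {"User": [1, 1, 2], "Item": ["Cheese", "Bread", "Milk"]}
)

# crosstab counts how often each (User, Item) pair occurs;
# comparing with 0 turns the counts into true/false flags.
pivoted = pd.crosstab(transactions["User"], transactions["Item"]) > 0
print(pivoted)
```

Note that the item columns come out in alphabetical order (Bread, Cheese, Milk), not in order of first appearance.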
  • Lewisham Member Posts: 11 Contributor II
    Hi Rene,
    Thanks for the reply. This doesn't seem to do what I want; again, the binomial columns appear invisible to the Pivot operator.

    The FPGrowth example has the data already in a collapsed format, so it can be passed directly to the FPGrowth operator. I can't seem to do this in RapidMiner.

    However, I managed to hack around it by using GROUP_CONCAT in the SQL query, then using the Split operator to break up the column into new attributes, then feeding the result to Nominal to Binomial and on to the FPGrowth operator!

    Shame my data didn't seem to have any sets  :D
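
A rough pandas analogue of this workaround, for readers who want to see the shape of the data at each step. It assumes item names contain no commas, and it collapses the Split and Nominal to Binomial steps into a single call, so it is only a sketch of the idea and not a reproduction of the RapidMiner process:

```python
import pandas as pd

# The original long-format table: one row per (user, item) pair.
transactions = pd.DataFrame(
    {"User": [1, 1, 2], "Item": ["Cheese", "Bread", "Milk"]}
)

# Analogue of GROUP_CONCAT: one comma-separated string of items per user.
concatenated = transactions.groupby("User")["Item"].agg(",".join)

# Split each string on commas and build one flag column per distinct item.
flags = concatenated.str.get_dummies(sep=",").astype(bool)
print(flags)
```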
  • Rene Member Posts: 24 Contributor II
    "the binomial columns appear invisible to the Pivot operator"
    I meant pivoting your first table (User | Item | Amount=1). This would have resulted in your desired example set.
    Starting from your 2nd table, I'd first think of logically OR'ing the observations for each user ID. I'm sure there are some nifty matrix operations for achieving that. But, as a beginner, I have to pass ... ;)

     

  • caceter Member Posts: 2 Contributor I

    I've encountered the same problem. Are there any solutions? The Aggregate operator is not working for binominal values.
