Questions about customer clustering/segmentation

Franzi Member Posts: 2 Contributor I
edited November 2018 in Help

Hello,

 

I’m new to RapidMiner. I did all the tutorials, but when I try my own cases, it’s a bit difficult to find the right operators and parameters.

I want to cluster my customers (CustomerID) into three groups based on their transactions.

The transaction attributes are:

Date of transaction (datatype: date)

Value of transaction (datatype: integer)

Number of transactions (datatype: integer)

 

I would like to give customers with the following features a higher rating (weight):

  • more than one transaction
  • a transaction value higher than average
  • recent transactions (i.e. transactions in the last month)

Is there any way to create a process in RapidMiner that reflects my requirements?

Which operator would be best for this use case?

Thanks in advance for your help, and sorry for my poor English!

Franzi

Best Answer

  • yyhuang Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 364 RM Data Scientist
    Solution Accepted

    The 'Generate Attributes' operator is your friend here. To achieve your goal 'to give the customers with following features a higher rate (weight)', you can create several indicator attributes. For instance, to tag the customers who have more than one transaction:

    attribute name          function expression
    AnyTransaction          if([Number of transactions] > 1, 1, 0)

    You can refer to the tutorial process for Generate Attributes and get inspired by the example function expressions.
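For anyone who wants to see the indicator idea outside RapidMiner, here is a rough Python sketch. It is not a RapidMiner process; the sample rows, the key names (`last_transaction`, `value`, `n_transactions`), the 30-day window, and the `score` column are all assumptions made for illustration. It builds one 0/1 indicator per requirement from the question and sums them into a simple weight.

```python
from datetime import date, timedelta
from statistics import mean

# Hypothetical sample data; the keys loosely mirror the attributes in the question.
customers = [
    {"CustomerID": 1, "last_transaction": date(2018, 10, 28), "value": 120, "n_transactions": 3},
    {"CustomerID": 2, "last_transaction": date(2018, 6, 2), "value": 40, "n_transactions": 1},
    {"CustomerID": 3, "last_transaction": date(2018, 10, 15), "value": 90, "n_transactions": 1},
]

def add_indicators(rows, today, recent_days=30):
    """Return copies of the rows with three 0/1 indicators and their sum as a score."""
    avg_value = mean(r["value"] for r in rows)
    cutoff = today - timedelta(days=recent_days)  # assumed meaning of "last month"
    out = []
    for r in rows:
        flags = {
            "AnyTransaction": int(r["n_transactions"] > 1),
            "AboveAverageValue": int(r["value"] > avg_value),
            "RecentTransaction": int(r["last_transaction"] >= cutoff),
        }
        # Equal weights are an assumption; multiply a flag to emphasize it.
        out.append({**r, **flags, "score": sum(flags.values())})
    return out

scored = add_indicators(customers, today=date(2018, 11, 1))
for row in scored:
    print(row["CustomerID"], row["score"])
```

With these made-up rows, customer 1 meets all three conditions, customer 2 meets none, and customer 3 meets two. Each flag corresponds to one expression you would enter in Generate Attributes.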

    [Screenshot: GA.PNG]

    Happy RapidMining!

     

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist

    Dear Franzi,

     

    My key question for you is: do you want to classify/cluster by your own rules, or by computer-generated rules based on statistical reasoning?

    In RapidMiner we have many operators that group customers together by their attributes. They find the grouping rules that are best according to some statistical measure. Most likely the resulting groups will be similar to the ones you had in mind, but not necessarily.

    The operators for this would be: k-Means, k-Medoids, DBSCAN or maybe Agglomerative Clustering. Please be aware that all of those operators use a distance measure and thus need normalized data; otherwise the attribute with the largest numeric range dominates the distance. You can normalize your data with the Normalize operator.
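To illustrate why normalization comes before a distance-based clusterer, here is a minimal Python sketch using only the standard library. It is not what the RapidMiner operators do internally; the data, the z-transformation choice, and the deterministic seeding with the first k points are assumptions made to keep the example small and repeatable.

```python
from statistics import mean, pstdev

def z_normalize(rows):
    """Scale each column to mean 0 and standard deviation 1 (z-transformation)."""
    cols = list(zip(*rows))
    stats = [(mean(c), pstdev(c) or 1.0) for c in cols]  # constant column: divide by 1
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, iterations=100):
    """Plain k-means; the first k points seed the centroids (demo assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centroids.append([mean(col) for col in zip(*members)] if members else centroids[c])
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return labels

# Hypothetical customers: [transaction value, number of transactions]
data = [[500, 12], [20, 1], [150, 4], [520, 11], [25, 1], [160, 5]]
labels = k_means(z_normalize(data), k=3)
print(labels)
```

On this toy data the high-value, frequent customers (rows 0 and 3), the low-value one-off customers (rows 1 and 4), and the mid-range customers (rows 2 and 5) end up in three separate clusters. Without `z_normalize`, the transaction value (hundreds) would outweigh the transaction count (single digits) in every distance computation.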

     

    Best,

    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • Franzi Member Posts: 2 Contributor I

    Thank you very much! The "Generate Attributes" operator helped me out.
