How to predict response rate or responses in RapidMiner

User111113 Member Posts: 24 Maven
edited November 2019 in Help
Hi All,

I'm fairly new to RapidMiner and am looking for a way to predict response rate based on historical data from the past 2 years.
I have customer id and categories and, of course, quantity mailed and responses.

for example

id  category  state  year  month  QtyMailed  Responses Received  Response Rate
1   a         OH     2018  oct    5000       200                 4%
1   b         CA     2018  Nov    10000      130
1   c         PA     2018  dec    35000      512
2
2

and so on. I would like to predict responses or response rate for, say, the upcoming month.
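
For reference, the response rate in the example rows is responses divided by quantity mailed. A minimal Python sketch of that arithmetic, using only the three complete rows from the example table (the variable and function names are illustrative and have nothing to do with RapidMiner):

```python
# Rows copied from the example table: (category, qty_mailed, responses).
rows = [
    ("a", 5000, 200),
    ("b", 10000, 130),
    ("c", 35000, 512),
]


def response_rate(qty_mailed, responses):
    """Return responses as a fraction of pieces mailed."""
    if qty_mailed <= 0:
        raise ValueError("qty_mailed must be positive")
    return responses / qty_mailed


rates = {category: response_rate(qty, resp) for category, qty, resp in rows}

for category, rate in rates.items():
    # Prints 4.00%, 1.30% and 1.46% for categories a, b and c.
    print(f"category {category}: {rate:.2%}")
```

Predicting the rate rather than the raw response count keeps rows with very different mailing sizes on a comparable scale.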

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    You can try some simple ML algorithms such as Decision Tree or Naive Bayes and see how they perform. But if you only have monthly data, you don't actually have much data to train the model, so don't be surprised if the fit is not that great. The Cross Validation operator tutorial provides some guidance on how to set up this process.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
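
To make the cross-validation idea concrete outside RapidMiner, here is a minimal Python sketch. It is not the Cross Validation operator itself: the data is invented, the "model" is just the mean response rate per category (a stand-in for Decision Tree or Naive Bayes), and every name in it is hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical history: (category, response_rate). Values are invented.
data = [
    ("a", 0.040), ("a", 0.038), ("a", 0.043), ("a", 0.041),
    ("b", 0.013), ("b", 0.015), ("b", 0.012), ("b", 0.016),
    ("c", 0.015), ("c", 0.014), ("c", 0.017), ("c", 0.013),
]


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices and deal them into k disjoint test folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_category_means(rows):
    """'Train' by averaging the response rate per category."""
    sums, counts = defaultdict(float), defaultdict(int)
    for category, rate in rows:
        sums[category] += rate
        counts[category] += 1
    overall = sum(rate for _, rate in rows) / len(rows)
    means = {c: sums[c] / counts[c] for c in sums}
    return means, overall


def cross_validate(rows, k=4):
    """Return the mean absolute error of each fold."""
    fold_errors = []
    for test_idx in k_fold_indices(len(rows), k):
        held_out = set(test_idx)
        train = [r for i, r in enumerate(rows) if i not in held_out]
        test = [rows[i] for i in test_idx]
        means, overall = fit_category_means(train)
        # Fall back to the overall mean for categories unseen in training.
        errors = [abs(means.get(c, overall) - rate) for c, rate in test]
        fold_errors.append(sum(errors) / len(errors))
    return fold_errors


fold_errors = cross_validate(data, k=4)
print("MAE per fold:", [round(e, 4) for e in fold_errors])
print("Mean MAE:", round(sum(fold_errors) / len(fold_errors), 4))
```

With only a couple of years of monthly rows, each fold holds very few examples, so the per-fold error can swing a lot from fold to fold.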
  • User111113 Member Posts: 24 Maven
    @Telcontar120
    Thank you for your response.

    I tried a few things and looked at some examples. I get a lot of errors and am prompted to auto-fix them, but I don't understand how or why it does that. The process ran only once, and it took year as the prediction value when it should be either responses or response rate. I am stuck and not sure how to move forward.
  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Hi @User111113,

    So that we can understand what's going on, could you share:

     - your process ( via File --> Export Process)
     - your data

    Regards,

    Lionel
  • User111113 Member Posts: 24 Maven
    This is my data file.
  • User111113 Member Posts: 24 Maven
    @lionelderkrikor Here I am attaching the process.
  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Hi @User111113,

    The error means that the attributes in your training set and the attributes in your test set are not strictly the same.
    It is caused by the Nominal to Numerical operator in the training part of your Cross Validation operator, which creates attribute(s) in the training set but not in your test set.
    The solution is to move the Nominal to Numerical operator outside the Cross Validation operator.

    The working process is in the attached file.

    Regards,

    Lionel
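
The mismatch described above can be reproduced in a few lines of Python. This is only a sketch of the general idea, with made-up rows and a hand-written dummy-coding helper (`one_hot` is hypothetical and is not RapidMiner's Nominal to Numerical operator): encoding only the training subset gives it columns the test subset lacks, while encoding the full data once before splitting keeps both aligned.

```python
def one_hot(rows, attribute):
    """Dummy-code one nominal attribute using the values present in `rows`."""
    values = sorted({row[attribute] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != attribute}
        for value in values:
            new_row[f"{attribute}={value}"] = 1 if row[attribute] == value else 0
        encoded.append(new_row)
    return encoded


# Made-up rows loosely modelled on the thread's example data.
rows = [
    {"state": "OH", "qty_mailed": 5000},
    {"state": "CA", "qty_mailed": 10000},
    {"state": "OH", "qty_mailed": 4000},
    {"state": "PA", "qty_mailed": 35000},
]
train, test = rows[:3], rows[3:]

# Encoding inside the training part only: the test rows stay nominal.
train_cols = set(one_hot(train, "state")[0])
test_cols = set(test[0])
print("inside :", sorted(train_cols), "vs", sorted(test_cols))

# Encoding before the split: every row gets the same columns.
encoded = one_hot(rows, "state")
train_cols_fixed = set(encoded[0])
test_cols_fixed = set(encoded[3])
print("outside: same columns?", train_cols_fixed == test_cols_fixed)
```

In the first case a model trained on `state=OH` and `state=CA` has no matching columns to read from the test rows, which is the kind of attribute mismatch the error reports.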
  • User111113 Member Posts: 24 Maven
    edited December 2019
    @lionelderkrikor

    Thank you for your response. I used Decision Tree and it looks like it's working fine. I would like to know one more thing: which attributes are these models' predictions based on? In my case I want the model to make predictions based on category and state, or maybe category, state and total mailed.

    Can I set it up myself so that it looks only at those 2 or 3 columns to predict the responses?
  • User111113 Member Posts: 24 Maven
    @lionelderkrikor

    I have a few more questions, I guess...

    When I try AutoModel, it sometimes shows the "Back" and "Next" buttons and sometimes doesn't. In the screenshot below I cannot go back or forward, although at other times the buttons do show up. Do you know how to resolve this?


  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Hi @User111113,

    Strange!

    Try to select the attribute you want to predict (the label).

    Regards,


    Lionel
  • User111113 Member Posts: 24 Maven
    @lionelderkrikor
    @Telcontar120

    Thank you for your help.

    I have more parameters that I want to add to my data to predict responses, but I wanted to see if there is a better way. I have indexes such as 0, 1, 2, 3; let's say responses are higher for index 0. My data will now look like this:


    id  category  index  state  year  month  QtyMailed  Responses Received
    1   a         0      OH     2018  oct    3000       150
    1   a         1      OH     2018  oct    1000       40
    1   a         2      OH     2018  oct    1000       10
    1   b                CA     2018  Nov    10000      130
    1   c                PA     2018  dec    35000      512
    2
    2


    My question: I know the important factors that change the responses are index, state and month of the year, but how much does each one affect them, perhaps as a percentage? Can we find that out? Is it also possible to feed in data by county or zip code and see whether that makes any difference, because people may have responded from only 3 zip codes and not from the other 2?

    I have a lot on my mind; I hope I am not confusing anyone.

    When I tried AutoModel, it told me to deselect the QtyMailed column. I know that if I do that it's not going to work: I looked at the predicted responses and they were not up to the mark at all, the predictions were practically all the same. So I never deselect that column.
  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Hi @User111113,

    I have difficulty understanding what your question is...
    Can you explain more explicitly what you get and what you want to obtain?
    In the meantime, you can indeed run AutoModel on your dataset. If you have doubts about one or more columns (attributes),
    first select them and enable Automatic Feature Selection before running AutoModel. If, in the end, these attributes are not relevant,
    they will be removed from the final feature set.
    Concerning the "weights", for several models you can access the weight of each regular attribute by clicking
    on Weights for a given model.

    Hope this helps,

    Regards,

    Lionel
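
As a rough, tool-independent way to think about "how much does each attribute matter", the Python sketch below computes the share of variance in response rate that is explained by grouping on one attribute at a time (between-group sum of squares divided by total sum of squares). This is an assumption-laden illustration: the rows are invented, `explained_share` is a hypothetical helper, and it is not how AutoModel computes the weights it shows.

```python
from collections import defaultdict

# Invented rows; "rate" stands for responses / quantity mailed.
rows = [
    {"state": "OH", "month": "oct", "index": 0, "rate": 0.050},
    {"state": "OH", "month": "oct", "index": 1, "rate": 0.040},
    {"state": "OH", "month": "nov", "index": 2, "rate": 0.010},
    {"state": "CA", "month": "nov", "index": 0, "rate": 0.048},
    {"state": "CA", "month": "dec", "index": 1, "rate": 0.035},
    {"state": "CA", "month": "dec", "index": 2, "rate": 0.012},
]


def explained_share(rows, attribute, target="rate"):
    """Fraction of the target's variance explained by grouping on `attribute`."""
    overall = sum(r[target] for r in rows) / len(rows)
    total_ss = sum((r[target] - overall) ** 2 for r in rows)
    if total_ss == 0:
        return 0.0
    groups = defaultdict(list)
    for r in rows:
        groups[r[attribute]].append(r[target])
    between_ss = sum(
        len(vals) * (sum(vals) / len(vals) - overall) ** 2
        for vals in groups.values()
    )
    return between_ss / total_ss


shares = {a: explained_share(rows, a) for a in ("index", "state", "month")}
for attribute, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{attribute}: {share:.0%} of variance in response rate")
```

Each share is computed for one attribute in isolation, so the percentages need not add up to 100% when attributes overlap in what they explain.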

  • User111113 Member Posts: 24 Maven
    @lionelderkrikor

    I did more research, modified my data set and generated new models. My questions are:

    How can I reduce the error rate and get better performance?

    Do I need to validate my models? If yes, how can we do it after deploying models using AutoModel?

    What do you think about grouping the models?