Which variables should I use or create for a prediction model?

cdaponte Member Posts: 29 Maven
Hi, I'm working on a model to predict whether a debtor is going to pay or not. Do you have any ideas or suggestions? For example, should I create some attributes, or use a specific model operator?

Here is my data set.


Answers

  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    Hello @cdaponte

    Did you try RapidMiner Auto Model? It suggests models and shows which attributes are useful for making predictions, so you can also see which attributes matter most.

    Give it a try.
    Regards,
    Varun
    https://www.varunmandalapu.com/

    Be Safe. Follow precautions and Maintain Social Distancing
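To make "which attributes are important" a little more concrete, here is a minimal sketch of one common way to score attribute relevance: information gain against a binary target. It is not a description of what Auto Model does internally, and the records and column names (`has_guarantor`, `region`, `paid`) are invented for illustration, not taken from the posted data set.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy after splitting rows on one nominal attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder


def rank_attributes(rows, target):
    """Return (attribute, gain) pairs, most informative first."""
    attributes = [name for name in rows[0] if name != target]
    gains = [(name, information_gain(rows, name, target)) for name in attributes]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy records; the column names are invented for illustration.
loans = [
    {"has_guarantor": "yes", "region": "north", "paid": "yes"},
    {"has_guarantor": "yes", "region": "south", "paid": "yes"},
    {"has_guarantor": "no", "region": "north", "paid": "no"},
    {"has_guarantor": "no", "region": "south", "paid": "no"},
]

for name, gain in rank_attributes(loans, target="paid"):
    print(f"{name}: {gain:.3f}")
```

In this toy data `has_guarantor` separates payers from non-payers perfectly while `region` carries no information, so the first ranks above the second. Numeric attributes would need to be binned before being scored this way.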

  • cdaponte Member Posts: 29 Maven
    Thanks! Yes, I already tried it, but wow, it's very complex :D
  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    Do you mean that the operator connections in the model are very complex, or that it is hard to understand what is happening inside the process?
    Regards,
    Varun

  • cdaponte Member Posts: 29 Maven
    It's difficult for me to understand what is happening inside the process.
  • M_Martin RapidMiner Certified Analyst, Member Posts: 125 Unicorn
    Hi: After following varunm1's suggestions above, it might also be very helpful to share a summary of your RapidMiner process outputs with people in your organization who are familiar with the business issues, i.e. people who have been around a while and have a feel for the factors that may determine whether a given loan is at risk of default. Discussing model outputs with experts is also a way of building trust in the models, so that the organization will be more likely to deploy them and integrate predictive model deliverables into its other data flows. Best wishes, Michael Martin
  • kypexin Moderator, RapidMiner Certified Analyst, Member Posts: 291 Unicorn
    Hi @cdaponte

    Regarding your dataset: as I understand it, it contains some historical data about Santander bank borrowers, right?
    Which column represents the target variable (the one you are trying to predict, i.e. whether the debtor has paid back)? Or should it perhaps be derived from other attributes? It's not easy to tell, as the column names are in Spanish, which I don't know.
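If the target does have to be derived, a rule along these lines is one possibility. This is only a sketch: the column names `days_overdue` and `outstanding_balance` and the 90-day threshold are assumptions made up for illustration, since the real column names and the bank's definition of non-payment are not known here.

```python
def derive_paid_label(record, max_days_overdue=90):
    """Label a loan 'not_paid' if it is overdue past the threshold with money still owed.

    Both field names and the default threshold are hypothetical.
    """
    defaulted = (
        record["days_overdue"] > max_days_overdue
        and record["outstanding_balance"] > 0
    )
    return "not_paid" if defaulted else "paid"


# Hypothetical records, not taken from the posted data set.
records = [
    {"days_overdue": 0, "outstanding_balance": 1500.0},
    {"days_overdue": 120, "outstanding_balance": 800.0},
    {"days_overdue": 200, "outstanding_balance": 0.0},
]

labels = [derive_paid_label(r) for r in records]
print(labels)
```

Whatever rule is chosen, the attributes used to build the label should then be left out of the model inputs, otherwise the model simply relearns the rule.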