Attributes in credit risk model

tim_merci Member Posts: 1 Contributor I
edited November 2018 in Help

Hi all,

 

I'm currently building my own credit risk model in RM and I have an issue after one of the steps. Without going too much into detail about the model itself, here are the steps leading to my issue:

 

1) Bin numeric attributes

2) Obtain number of defaults and non-defaults in each bin

3) Make certain calculations 
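
For readers who want to check the numbers outside RapidMiner, steps 1 to 3 can be sketched in plain Python. This is only a sketch under assumptions, not the process from the post: the function name `woe_table`, the equal-frequency binning, and the formulas WOE = ln(non-default share / default share) and IV = (non-default share - default share) * WOE are common conventions that appear consistent with the table in the post, but the post does not state them.

```python
import math
from bisect import bisect_left

def woe_table(values, defaults, n_bins=4):
    """Bin one numeric attribute and return per-bin counts, WOE and IV.

    Assumes `defaults` holds 1 for default and 0 for non-default, and that
    both classes occur at least once in the data.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Step 1: equal-frequency cut points; bin i holds values <= edges[i].
    edges = [ordered[max(n * i // n_bins - 1, 0)] for i in range(1, n_bins)]
    # Step 2: count defaults and non-defaults per bin.
    counts = [[0, 0] for _ in range(n_bins)]
    for value, flag in zip(values, defaults):
        counts[bisect_left(edges, value)][0 if flag == 1 else 1] += 1
    total_def = sum(c[0] for c in counts)
    total_non = sum(c[1] for c in counts)
    # Step 3: per-bin shares, WOE and IV contribution.
    rows = []
    for i, (d, nd) in enumerate(counts):
        def_pct = d / total_def
        non_pct = nd / total_non
        # WOE is undefined when a bin has no defaults or no non-defaults.
        woe = math.log(non_pct / def_pct) if d and nd else None
        iv = (non_pct - def_pct) * woe if woe is not None else None
        rows.append({
            "bin": f"range{i + 1}",
            "defaults": d,
            "non_defaults": nd,
            "default_pct": def_pct,
            "non_default_pct": non_pct,
            "woe": woe,
            "iv": iv,
        })
    return rows
```

Calling `woe_table(values, flags, n_bins=4)` returns one dictionary per bin; summing the `iv` entries that are not `None` gives the attribute's total information value.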

 

 My current outcome is a table looking like this:

 

Attribute 1              Defaults  NonDefaults  DefaultPercentage  NonDefaultPercentage  DefaultRate  WOE   IV
range1 [-∞ - 0.149]      19,0      29,0         ,2                 ,1                    ,7           -,7   ,1
range2 [0.149 - 0.304]   19,0      30,0         ,2                 ,1                    ,6           -,6   ,1
range3 [0.304 - 0.453]   13,0      36,0         ,1                 ,1                    ,4           -,1   ,0
range4 [0.453 - 0.680]   14,0      35,0         ,1                 ,1                    ,4           -,2   ,0
               

 

But this covers only one attribute, while I need the same for at least 10-15 attributes. Specifically, I need the output above with an extra column on the left that names the attribute next to each bin, and with all the attributes stacked below each other. For the example above, that would give:

 

Attributes   Bins                     Defaults  NonDefaults  DefaultPercentage  NonDefaultPercentage  DefaultRate  WOE   IV
Attribute 1  range1 [-∞ - 0.149]      19,0      29,0         ,2                 ,1                    ,7           -,7   ,1
Attribute 1  range2 [0.149 - 0.304]   19,0      30,0         ,2                 ,1                    ,6           -,6   ,1
Attribute 1  range3 [0.304 - 0.453]   13,0      36,0         ,1                 ,1                    ,4           -,1   ,0
Attribute 1  range4 [0.453 - 0.680]   14,0      35,0         ,1                 ,1                    ,4           -,2   ,0
Attribute 2  range1 [-∞ - 0.011]      9,0       39,0         ,1                 ,1                    ,2           ,4    ,0
Attribute 2  range2 [0.011 - 0.024]   6,0       43,0         ,1                 ,1                    ,1           ,9    ,1
Attribute 2  range3 [0.024 - 0.037]   5,0       44,0         ,1                 ,2                    ,1           1,1   ,1
Attribute 2  range4 [0.037 - ∞]       8,0       41,0         ,1                 ,1                    ,2           ,5    ,0

 

And so on for all the attributes. At the moment I have to run and hard-code every attribute separately, which is not very efficient.

I already tried the Loop Attributes operator, but I can't get it working.

 

I used the standard credit risk model data set available in RapidMiner. If I need to add more detail regarding the process itself, just ask!

 

Any thoughts? 

Best Answer

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Solution Accepted

    Tim,

    This was a bit trickier than I anticipated, but the attached process shows you how to do this using loops.  The first loop subprocess bins the variables and then creates an attribute corresponding to the relevant attribute name.  The second loop goes through these attributes and aggregates the data with the performance information you want, once for each attribute, and then appends them all together into one large table.

     

    This was done using some randomly generated data, so obviously you will need to modify the process to refer to your attributes and then calculate the performance variables you are interested in through the aggregate operator, but the general structure should show you how this would work.  You may also need to rename your attributes to take advantage of the macro capabilities here, but that is easily done in the second loop after you have already created the attribute name in the first loop.  I hope this is helpful.
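
    The attached process itself is not reproduced here. As a rough illustration of the same idea outside RapidMiner, the sketch below loops over the attributes, tags each bin with the attribute name, and appends everything into one long table. It is written in plain Python under assumptions: the function name `stack_bin_counts`, the record layout (a list of dicts with a 0/1 `default` flag), and the pre-computed `cut_points` are all hypothetical and do not come from the process.

```python
from bisect import bisect_left
from collections import defaultdict

def stack_bin_counts(records, attributes, cut_points, target="default"):
    """Build one long table with one row per (attribute, bin).

    records    -- list of dicts, one per observation (hypothetical layout)
    attributes -- names of the numeric attributes to process
    cut_points -- dict mapping each attribute name to its sorted bin edges
    target     -- name of the 0/1 default flag
    """
    long_table = []
    for name in attributes:                       # one pass per attribute
        edges = cut_points[name]
        counts = defaultdict(lambda: [0, 0])      # bin -> [defaults, non-defaults]
        for rec in records:
            b = bisect_left(edges, rec[name])     # bin i holds values <= edges[i]
            counts[b][0 if rec[target] == 1 else 1] += 1
        for b in sorted(counts):                  # append this attribute's bins
            d, nd = counts[b]
            long_table.append({
                "attribute": name,                # the extra left-hand column
                "bin": f"range{b + 1}",
                "defaults": d,
                "non_defaults": nd,
            })
    return long_table
```

    Any further per-bin measures (percentages, WOE, IV) would be computed at the point where each row is appended, so that they are produced once per attribute in the same loop.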

     

    Regards,

     

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts