
# Attributes in credit risk model

Hi all,

I'm currently building my own credit risk model in RM and I have an issue after one of the steps. Without going into too much detail about the model itself, here are the steps leading to my issue:

1) Bin numeric attributes

2) Obtain number of defaults and non-defaults in each bin

3) Make certain calculations
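For readers who want to check the numbers outside RapidMiner, the three steps can be sketched in plain Python. This is a minimal sketch under assumptions, not the process from the post: the function name `woe_table` and its arguments are hypothetical, the bin edges are supplied by hand, and it assumes the common definitions WOE = ln(non-default share / default share) and IV = (non-default share - default share) × WOE.

```python
import math


def woe_table(values, defaults, edges):
    """Per-bin counts, WOE and IV for one numeric attribute (hypothetical helper).

    values   -- numeric attribute values
    defaults -- 1 for default, 0 for non-default, same order as values
    edges    -- ascending upper bin edges; the last bin is open-ended
    """
    n_bins = len(edges) + 1
    bad = [0] * n_bins   # defaults per bin
    good = [0] * n_bins  # non-defaults per bin

    # Steps 1 and 2: assign each value to a bin and count the outcomes
    for value, is_default in zip(values, defaults):
        idx = sum(value > edge for edge in edges)  # number of edges below the value
        if is_default:
            bad[idx] += 1
        else:
            good[idx] += 1

    # Step 3: share of all defaults / non-defaults per bin, then WOE and IV
    total_bad, total_good = sum(bad), sum(good)
    rows = []
    for idx in range(n_bins):
        bad_share = bad[idx] / total_bad
        good_share = good[idx] / total_good
        # Assumed definition; raises an error if a bin has no defaults or no non-defaults
        woe = math.log(good_share / bad_share)
        rows.append({
            "Bin": idx + 1,
            "Defaults": bad[idx],
            "NonDefaults": good[idx],
            "DefaultPercentage": bad_share,
            "NonDefaultPercentage": good_share,
            "WOE": woe,
            "IV": (good_share - bad_share) * woe,
        })
    return rows
```

Calling `woe_table(values, defaults, [0.149, 0.304, 0.453])` would return one dictionary per bin. Bins with no defaults or no non-defaults need separate handling (for example merging bins), which the sketch leaves out.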

My current outcome is a table looking like this:

| Attribute 1 | Defaults | NonDefaults | DefaultPercentage | NonDefaultPercentage | DefaultRate | WOE | IV |
|---|---|---|---|---|---|---|---|
| range1 [-∞ - 0.149] | 19,0 | 29,0 | ,2 | ,1 | ,7 | -,7 | ,1 |
| range2 [0.149 - 0.304] | 19,0 | 30,0 | ,2 | ,1 | ,6 | -,6 | ,1 |
| range3 [0.304 - 0.453] | 13,0 | 36,0 | ,1 | ,1 | ,4 | -,1 | ,0 |
| range4 [0.453 - 0.680] | 14,0 | 35,0 | ,1 | ,1 | ,4 | -,2 | ,0 |

But this is only for 1 attribute, while I need to do this for at least 10-15 attributes. What I specifically need is the above output, but with an extra column on the left where the attribute is named next to the bin, and with all the attributes below each other. Thus, for the above example, it would result in:

| Attributes | Bins | Defaults | NonDefaults | DefaultPercentage | NonDefaultPercentage | DefaultRate | WOE | IV |
|---|---|---|---|---|---|---|---|---|
| Attribute 1 | range1 [-∞ - 0.149] | 19,0 | 29,0 | ,2 | ,1 | ,7 | -,7 | ,1 |
| Attribute 1 | range2 [0.149 - 0.304] | 19,0 | 30,0 | ,2 | ,1 | ,6 | -,6 | ,1 |
| Attribute 1 | range3 [0.304 - 0.453] | 13,0 | 36,0 | ,1 | ,1 | ,4 | -,1 | ,0 |
| Attribute 1 | range4 [0.453 - 0.680] | 14,0 | 35,0 | ,1 | ,1 | ,4 | -,2 | ,0 |
| Attribute 2 | range1 [-∞ - 0.011] | 9,0 | 39,0 | ,1 | ,1 | ,2 | ,4 | ,0 |
| Attribute 2 | range2 [0.011 - 0.024] | 6,0 | 43,0 | ,1 | ,1 | ,1 | ,9 | ,1 |
| Attribute 2 | range3 [0.024 - 0.037] | 5,0 | 44,0 | ,1 | ,2 | ,1 | 1,1 | ,1 |
| Attribute 2 | range4 [0.037 - ∞] | 8,0 | 41,0 | ,1 | ,1 | ,2 | ,5 | ,0 |

And so on for all the attributes. Right now I have to build and hard-code these steps for each attribute separately, which is not very efficient.

I already tried the Loop Attributes operator, but I couldn't get it to work.

I used the standard credit risk model data set available in RapidMiner. If I need to add more detail regarding the process itself, just ask!

Any thoughts?


## Re: Attributes in credit risk model

Tim,

This was a bit trickier than I anticipated, but the attached process shows you how to do this using loops.  The first loop subprocess bins the variables and then creates an attribute corresponding to the relevant attribute name.  The second loop goes through these attributes and aggregates the data with the performance information you want, once for each attribute, and then appends them all together into one large table.

This was done using some randomly generated data, so obviously you will need to modify the process to refer to your attributes and then calculate the performance variables you are interested in through the aggregate operator, but the general structure should show you how this would work.  You may also need to rename your attributes to take advantage of the macro capabilities here, but that is easily done in the second loop after you have already created the attribute name in the first loop.  I hope this is helpful.
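For anyone reading without the attachment, the loop-and-append structure described above can be illustrated in plain Python. This is only a rough sketch of the idea, not the attached RapidMiner process: the function name `stack_attributes`, the dictionary-of-lists data layout, and the hand-supplied bin edges are all assumptions made for illustration, and only the default and non-default counts are aggregated.

```python
from collections import Counter


def stack_attributes(data, target, edges_by_attribute):
    """Bin each attribute, count outcomes per bin, and append all results.

    data               -- dict mapping column name to a list of values
    target             -- name of the 0/1 default column in data
    edges_by_attribute -- dict mapping attribute name to ascending bin edges
    """
    rows = []
    for attribute, edges in edges_by_attribute.items():
        # Bin the attribute and count defaults / non-defaults per bin
        counts = Counter()
        for value, is_default in zip(data[attribute], data[target]):
            bin_no = 1 + sum(value > edge for edge in edges)
            counts[(bin_no, bool(is_default))] += 1

        # Tag every bin row with the attribute name and append it to the
        # shared table, so all attributes end up below each other
        for bin_no in range(1, len(edges) + 2):
            rows.append({
                "Attributes": attribute,
                "Bins": f"range{bin_no}",
                "Defaults": counts[(bin_no, True)],
                "NonDefaults": counts[(bin_no, False)],
            })
    return rows
```

The further performance columns would be computed per attribute inside the loop before the rows are appended, which corresponds to the aggregation step in the second loop.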

Regards,

Brian T., Lindon Ventures - www.lindonventures.com
Analytics Consulting by Certified RapidMiner Analysts