Relative Contribution of Variables
I want to measure the relative contribution of each input variable to the predictive power/accuracy of a model (any classification or regression model). In some commercial tools like SPSS Modeler, this is done automatically by a so-called leave-one-out process. In each iteration, one input variable is left out of the modeling, the model is tested on a holdout sample (or via X-Validation), and the accuracy is recorded (e.g., variable left out = A, accuracy 82%). This process is repeated for each input variable. At the end you have a list of accuracies, one for each variable's absence from the model. The lower the accuracy, the higher the contribution/importance of the variable that was left out. Once done, these accuracies can be converted/inverted into relative importance measures (optionally normalized) and shown in a horizontal bar chart illustrating the relative contribution of all variables.
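To make the procedure above concrete, here is a minimal sketch in plain Python. It is not a RapidMiner process and not how SPSS Modeler implements it. Several things are assumptions made only for illustration: the toy rows are invented (they are not the real Golf data), a simple nearest-neighbour vote stands in for a Decision Tree, leave-one-row-out validation stands in for X-Validation, and "importance" is taken to be the accuracy drop relative to the full model, clipped at zero and normalized to sum to 1. All function names are hypothetical.

```python
from collections import Counter

def predict(train, row, features, label):
    """Majority label among the training rows closest to `row` (Hamming distance)."""
    def dist(other):
        return sum(other[f] != row[f] for f in features)
    best = min(dist(t) for t in train)
    votes = Counter(t[label] for t in train if dist(t) == best)
    return votes.most_common(1)[0][0]

def cv_accuracy(rows, features, label):
    """Leave-one-row-out cross-validated accuracy using only `features`."""
    hits = 0
    for i, row in enumerate(rows):
        train = rows[:i] + rows[i + 1:]
        hits += predict(train, row, features, label) == row[label]
    return hits / len(rows)

def leave_one_variable_out(rows, features, label):
    """Map each variable to the accuracy of the model built WITHOUT it."""
    return {
        f: cv_accuracy(rows, [g for g in features if g != f], label)
        for f in features
    }

def relative_importance(baseline, accuracies):
    """Assumed conversion: accuracy drop vs. the full model, normalized to sum to 1."""
    drops = {f: max(baseline - a, 0.0) for f, a in accuracies.items()}
    total = sum(drops.values())
    if total == 0:
        return {f: 0.0 for f in drops}
    return {f: d / total for f, d in drops.items()}

# Invented toy rows in the spirit of the Golf data set (NOT the real data).
toy_rows = [
    {"Outlook": "sunny", "Wind": "false", "Humidity": "high", "Play": "no"},
    {"Outlook": "sunny", "Wind": "true", "Humidity": "high", "Play": "no"},
    {"Outlook": "overcast", "Wind": "false", "Humidity": "high", "Play": "yes"},
    {"Outlook": "rain", "Wind": "false", "Humidity": "normal", "Play": "yes"},
    {"Outlook": "rain", "Wind": "true", "Humidity": "normal", "Play": "no"},
    {"Outlook": "overcast", "Wind": "true", "Humidity": "normal", "Play": "yes"},
    {"Outlook": "sunny", "Wind": "false", "Humidity": "normal", "Play": "yes"},
    {"Outlook": "rain", "Wind": "false", "Humidity": "high", "Play": "yes"},
]
toy_features = ["Outlook", "Wind", "Humidity"]

full_model_accuracy = cv_accuracy(toy_rows, toy_features, "Play")
absent_accuracy = leave_one_variable_out(toy_rows, toy_features, "Play")
shares = relative_importance(full_model_accuracy, absent_accuracy)

# Text stand-in for the horizontal bar chart, most important variable first.
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:<10} {'#' * round(share * 20)} {share:.2f}")
```

Clipping negative drops to zero is a design choice: a variable whose removal improves accuracy is treated as contributing nothing rather than receiving a negative share. In RapidMiner the same loop would presumably be built from a loop over attributes wrapped around a validation operator, which is exactly the part asked about below.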
I tried to do this in RapidMiner 7.0 with the Loop Attributes operator. It did not work! I could not set it up properly because I am not all that familiar with RapidMiner procedures like loop operators. The short descriptions were not sufficient for me to understand and use them properly for this process.
Can anyone create a simple process that implements the variable contribution procedure I described, using a small data set like Golf with Decision Trees and X-Validation, and post it here so that we can all learn/benefit from it?
Thank you.