
H2O Variable Importance

User36964User36964 Member, University Professor Posts: 15 University Professor
The H2O Deep Learning model provides a "compute variable importance" option.
If selected, the output of the Deep Learning model lists the top ten important attributes. Is there a way to increase this number to the top 20 or 100?

Answers

  •
    varunm1varunm1 Moderator, Member Posts: 1,207 Unicorn
    edited April 2019
    Hello @User36964

    I am sure that H2O is calculating variable importance for all variables in your dataset; I think it's the RapidMiner view that is restricting you from seeing all of them. I can see the top 10 and bottom 10 variables and their importance, but I don't see any option to extend this.

    @hughesfleming68 any suggestion on this?

    Thanks
    Regards,
    Varun
    https://www.varunmandalapu.com/

    Be Safe. Follow precautions and Maintain Social Distancing

  •
    User36964User36964 Member, University Professor Posts: 15 University Professor
    I wonder why they limit it. Anyone using this extension needs to see all of the variable importances.
  •
    varunm1varunm1 Moderator, Member Posts: 1,207 Unicorn
    edited April 2019
    I understand your concern @User36964, but in some cases, if the number of attributes is very high (in the hundreds), it is difficult to view all of them. There should be an option to extract the variable importances, though.

    You can use the Explain Predictions operator to see which variables impacted your model's predictions. I use this a lot compared to variable importance. One reason is a limitation of H2O's variable importance method (Gedeon-based), which extracts importance based on the weights of only the first two layers of the network; for large networks this is not ideal, since deeper layers can also influence variable importance.
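    To make the limitation concrete, here is a minimal pure-Python sketch of a Gedeon-style calculation: importance is derived only from the absolute weights of the first two layers, so anything learned in deeper layers cannot change the result. This is an illustration with made-up weight matrices, not H2O's actual implementation.

    ```python
    # Sketch of Gedeon-style variable importance (illustration only,
    # not H2O's implementation). Only the first two weight matrices
    # are used, which is the limitation discussed above.

    def gedeon_importance(w1, w2):
        """w1[i][j]: weight from input i to hidden unit j.
           w2[j][k]: weight from hidden unit j to output k."""
        n_in, n_hid = len(w1), len(w1[0])
        n_out = len(w2[0])
        # P[i][j]: input i's share of hidden unit j's total incoming weight
        col = [sum(abs(w1[i][j]) for i in range(n_in)) for j in range(n_hid)]
        P = [[abs(w1[i][j]) / col[j] for j in range(n_hid)] for i in range(n_in)]
        # Q[j][k]: hidden j's share of output k's total incoming weight
        out = [sum(abs(w2[j][k]) for j in range(n_hid)) for k in range(n_out)]
        Q = [[abs(w2[j][k]) / out[k] for k in range(n_out)] for j in range(n_hid)]
        # Importance of input i: weight share propagated through both layers,
        # averaged over outputs. Importances sum to 1 across inputs.
        return [sum(P[i][j] * Q[j][k] for j in range(n_hid) for k in range(n_out)) / n_out
                for i in range(n_in)]

    # Hypothetical network: 3 inputs, 2 hidden units, 1 output
    w1 = [[0.9, 0.1], [0.1, 0.8], [0.2, 0.3]]
    w2 = [[1.0], [0.5]]
    print(gedeon_importance(w1, w2))
    ```

    Note that any layers after the second would simply not appear in this formula, which is why deeper networks can have influential structure that the score never sees.
    
    
    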

    Everything has its own limitations :smile:
    Regards,
    Varun
    https://www.varunmandalapu.com/


  •
    User36964User36964 Member, University Professor Posts: 15 University Professor
    edited April 2019
    As you've stated, there should be an option to export all of the importances.

    The Explain Predictions operator explains the prediction of each data row, so gathering a general (overall) idea of the important attributes is somewhat challenging. Maybe the attribute frequencies for each prediction can be computed manually to find the most effective supporting and contradicting attributes.
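    The manual aggregation idea above could be sketched like this: tally, over all rows, how often each attribute appears as supporting or contradicting. The row data here is hypothetical; in practice it would come from the Explain Predictions operator's per-row output.

    ```python
    # Sketch: aggregate per-row supporting/contradicting attributes into
    # overall frequencies. The rows below are made-up placeholders for
    # what Explain Predictions would produce per example.
    from collections import Counter

    rows = [
        {"supporting": ["age", "income"], "contradicting": ["gender"]},
        {"supporting": ["age"],           "contradicting": ["income"]},
        {"supporting": ["age", "gender"], "contradicting": []},
    ]

    support, contradict = Counter(), Counter()
    for r in rows:
        support.update(r["supporting"])
        contradict.update(r["contradicting"])

    # Attributes ranked by how often they supported / contradicted predictions
    print(support.most_common())
    print(contradict.most_common())
    ```
    
    
    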


  •
    varunm1varunm1 Moderator, Member Posts: 1,207 Unicorn
    Actually, I do some extra analysis using the Explain Predictions operator. A few additional operators on top of Explain Predictions give me the attributes that support and contradict correct predictions as well as wrong ones. If you are interested, take a look at the thread below for the process of extracting attributes based on outcome (correct or incorrect).

    I am trying to work on some feature selection techniques based on this operator; if @IngoRM gets to it first, it will be available in RM.

    https://community.rapidminer.com/discussion/55351/explain-predictions-ranking-attributes-that-supports-and-contradicts-correct-predictions#latest

    Thanks
    Regards,
    Varun
    https://www.varunmandalapu.com/


  •
    User36964User36964 Member, University Professor Posts: 15 University Professor
    Thanks,
    I'm looking forward to seeing your solutions.