What are the most important attributes that distinguish 3 nominal labels from each other?

lauschi Member Posts: 4 Newbie
I have a problem and do not know which model is suitable for it:

I have 3 nominal labels (1964, 1984, 1994). For all three labels, structural metrics (attributes) of the landscape (PD, Shape, ...) were calculated.
My question: What are the most important attributes that distinguish all 3 labels from each other?
Which model do I have to use here to be able to answer my question?
Many thanks for your help

Answers

  • BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn
    Hi!

    AutoModel can be used to automatically test some machine learning algorithms on your data and also to get an assessment of attribute importance. 

    If you don't have that available, you can use some of the "Weight by" operators. There is no "best" among those, so you'll need to try at least some and summarize their results. Just as there are machine learning algorithms with different approaches, determining the importance or weight of attributes depends on the approach taken.

    Regards,
    Balázs
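
    The advice above (try several weighting approaches, then summarize) can be sketched outside RapidMiner in plain Python. This is only an illustration under stated assumptions: the two weighting functions (a one-way ANOVA F ratio and a mean one-vs-rest correlation) are stand-ins chosen here, not the actual "Weight by" operators, and the PD and Shape values are made-up toy data.

```python
import random
from statistics import mean


def f_ratio(values, labels):
    """One-way ANOVA F statistic: between-class over within-class variance."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    grand_mean = mean(values)
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    within = 0.0
    for g in groups.values():
        group_mean = mean(g)
        within += sum((v - group_mean) ** 2 for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (between / df_between) / (within / df_within)


def abs_correlation(xs, ys):
    """Absolute Pearson correlation between two equally long numeric lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def one_vs_rest_correlation(values, labels):
    """Mean absolute correlation with each class's 0/1 indicator."""
    return mean(
        abs_correlation(values, [1.0 if y == c else 0.0 for y in labels])
        for c in sorted(set(labels))
    )


def normalize(weights):
    """Scale weights so the largest becomes 1.0, making schemes comparable."""
    top = max(weights.values())
    return {name: w / top for name, w in weights.items()}


# Made-up toy data (assumption): "PD" shifts with the year, "Shape" is noise.
rng = random.Random(0)
labels = [year for year in ("1964", "1984", "1994") for _ in range(30)]
shift = {"1964": 0.0, "1984": 2.0, "1994": 4.0}
data = {
    "PD": [shift[y] + rng.gauss(0, 1) for y in labels],
    "Shape": [rng.gauss(0, 1) for _ in labels],
}

schemes = {"f_ratio": f_ratio, "one_vs_rest_correlation": one_vs_rest_correlation}
tables = {
    scheme: normalize({name: fn(col, labels) for name, col in data.items()})
    for scheme, fn in schemes.items()
}
# Summarize: average the normalized weight of each attribute over all schemes.
summary = {name: mean(t[name] for t in tables.values()) for name in data}
print(tables)
print(summary)
```

    Normalizing each scheme before averaging keeps one scheme's scale from dominating the summary; other ways of combining the tables (for example, averaging ranks) would be equally defensible.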
  • lauschi Member Posts: 4 Newbie
    Dear Balázs,

    thank you very much for your feedback.
    I do have AutoModel available and have used it a lot. However, since my dataset has 3 different categorical labels, AutoModel always looks for the variables that predict one category at a time.

    Unfortunately, the importance of the variables is different for each of the 3 categories.
    I will perhaps reduce the set of possible variables to a few. Maybe this is a good first step.
    Thank you for your support and feedback.

    Best regards,
    Lauschi


  • BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn
    Hi,

    yes, in machine learning the importance of attributes can differ between the algorithms being used, but also between data sets.

    You could always build a process that loops over different samples of the data, sets the three label attributes in a loop one by one, and then uses some of the Weight by ... operators to calculate the attribute importance for that sample, that label and that algorithm. Summarizing the results will possibly give you some insights into the overall importance. You'll probably need "Weights to Data" to convert the weight table to a normal data table.

    Regards,
    Balázs
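
    The looping process described above can be sketched in plain Python as a rough guide. This is not a RapidMiner process. It assumes bootstrap resampling for the "different samples", a one-vs-rest indicator for "setting the label one by one", and absolute Pearson correlation as a stand-in for a Weight by operator. The function names and the toy PD and Shape values are hypothetical.

```python
import random
from statistics import mean


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def weight_rows(data, labels, n_samples=20, seed=0):
    """Return one (sample, label, attribute, weight) row per combination."""
    rng = random.Random(seed)
    n = len(labels)
    rows = []
    for sample in range(n_samples):
        # Bootstrap sample: draw n row indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        sample_labels = [labels[i] for i in idx]
        for target in sorted(set(labels)):
            # One-vs-rest indicator for the label value currently in focus.
            indicator = [1.0 if y == target else 0.0 for y in sample_labels]
            for name, column in data.items():
                values = [column[i] for i in idx]
                weight = abs_correlation(values, indicator)
                rows.append((sample, target, name, weight))
    return rows


def average_by(rows, key):
    """Average the weight (last field) of the rows grouped by key(row)."""
    groups = {}
    for row in rows:
        groups.setdefault(key(row), []).append(row[3])
    return {k: mean(ws) for k, ws in groups.items()}


# Made-up toy data (assumption): "PD" shifts with the year, "Shape" is noise.
rng = random.Random(0)
labels = [year for year in ("1964", "1984", "1994") for _ in range(30)]
shift = {"1964": 0.0, "1984": 2.0, "1994": 4.0}
data = {
    "PD": [shift[y] + rng.gauss(0, 1) for y in labels],
    "Shape": [rng.gauss(0, 1) for _ in labels],
}

rows = weight_rows(data, labels)  # long table, similar in spirit to "Weights to Data"
overall = average_by(rows, key=lambda r: r[2])          # per attribute
per_label = average_by(rows, key=lambda r: (r[1], r[2]))  # per label and attribute
print(overall)
print(per_label)
```

    Keeping the per-label averages next to the overall ones shows where an attribute separates one year well but not another, which is the situation described earlier in the thread.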
  • lauschi Member Posts: 4 Newbie
    Dear Balazs Barany,
    Thank you very much for the very good comments.
    Could you send me a sample workflow of such a process? Then I could use it as a guide.

    Thank you again and best regards
    Lauschi

  • lauschi Member Posts: 4 Newbie
    Dear Balázs
    Thank you very much for the very good advice.
    I think I have already been able to work a little on the solution.
    Thank you very much and happy holidays.
    Kind regards
    Lauschi

