Why is the same category of an attribute split into two bars in the Statistics view?

Atilla Member Posts: 11 Learner I
edited May 8 in Help
In my dataset (training and test) there is an attribute called department. It contains two categories, sewing and finishing, which are obviously nominal. In RapidMiner's 'Statistics' view, the categories of the department attribute are shown as a bar chart in which the category finishing appears twice, as two separate bars. My question is: why does the 'Statistics' view split the same category (finishing) into two bars?
Normally I would expect to see two bars (sewing, finishing), but the 'Statistics' view shows three (sewing, finishing, finishing). Back in the 'Data' view I only see the department attribute with its two categories, yet the 'Statistics' view displays three (see visualization), and I do not understand why. Maybe I am misreading the visualization, or the view itself is incorrect while the dataset is right. In the end, maybe I just need to choose the right chart to get an accurate view.

Best Answers

  • BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 788 Unicorn
    Solution Accepted
    Hi!

    Can you spot the problem?

    I added some marker characters around the department name to make it visible: you have spaces at the end of the department name, so those values are treated as a separate category. Use the Trim operator to clean these up.

    Regards,
    Balázs
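
For readers who want to see the effect outside RapidMiner, here is a minimal Python sketch of why trailing spaces produce an extra category and how trimming merges it back. The sample values are hypothetical, and `str.strip()` is only a stand-in for the Trim operator, not its implementation.

```python
from collections import Counter

# Hypothetical sample values: some "finishing" entries carry a trailing space,
# which is the assumed cause of the extra bar in the Statistics view.
departments = ["sewing", "finishing ", "finishing", "sewing", "finishing "]

# Counting the raw strings treats "finishing" and "finishing " as different categories.
raw_counts = Counter(departments)

# Stripping leading and trailing whitespace first merges them into one category.
trimmed_counts = Counter(value.strip() for value in departments)

print(len(raw_counts))      # number of distinct categories before trimming
print(len(trimmed_counts))  # number of distinct categories after trimming
```

The two "finishing" variants look identical when printed in a table, which is why the 'Data' view appears to show only two categories.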



  • BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 788 Unicorn
    Solution Accepted
    Hi!

    In this case I used Generate Attributes with a formula like: ">" + attributename + "<"

    Regards,
    Balázs
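
The same marker trick can be sketched in Python, for illustration only. The helper name `mark` and the sample values are hypothetical; the idea simply mirrors the formula above by wrapping each value in ">" and "<" so that otherwise invisible whitespace shows up.

```python
def mark(value: str) -> str:
    """Wrap a value in '>' and '<' so leading or trailing whitespace becomes visible."""
    return ">" + value + "<"


# Hypothetical distinct values of the department attribute.
departments = ["sewing", "finishing", "finishing "]

# Build the marked form of every distinct value and print them for inspection.
marked = sorted({mark(value) for value in departments})
for item in marked:
    print(item)
```

A value with a trailing space ends in " <" instead of "<", so the difference between the two "finishing" entries is easy to see.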

Answers

  • Atilla Member Posts: 11 Learner I
    This is awesome. I am just a speck away from understanding it: how did you detect the spaces in the department name (>finishing <)? I will try your suggestion of using the Trim operator to get the department names trimmed. I appreciate your help 🙌