How do I get results percentage-wise

basti123 Member Posts: 1 Contributor I
edited November 2018 in Help
Hello,

I have an example set containing words and, for each word, the number of times it appears in the example set.
For example:

    the       30
    car       20
    tree      15
    street     5
    state      1

I have already sorted it in decreasing order, so the words that appear most often are at the top. (Process Document -> WordList to Data -> Sort)
Now I want to select the words that appear in more than 60% of the example set.
How is this possible?
Thanks for your help!

Answers

  • Skirzynski Member Posts: 164 Maven
    First of all, you need the total number of examples in your original example set. Either add this number as an additional attribute to your word count example set, or provide it as a macro. To create such a macro, pass your original example set to the "Extract Macro" operator with the macro type "number_of_examples". Name the macro "examples"; once the operator has executed, you can access this number anywhere in the process, typically with %{examples}.

    To add the percentage to your example set, use the "Generate Attributes" operator and add an attribute named "percentage". Assuming your attribute with the word count is called "count", the function expression you need is
    count/parse(macro("examples"))*100
    Please note that although you typically use %{examples} to access a macro, in this case you need to use the function "macro(...)", because the regular syntax interferes with the parser of the function expression.
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    The next version of RapidMiner will also feature new aggregation functions in the Aggregate operator: sum (fractional), count (fractional), count (percentage) and string concatenation. Once RapidMiner 5.3 is released, these should make your life easier. We don't have a schedule for the release yet, but if you are keen and comfortable with Eclipse and the like, you can use the current SVN version until then.

    Best regards,
    Marius
  • Dimpho_Tsoeute Member Posts: 1 Contributor I
    Select the subsample of business students. What is the percentage?
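
For readers who want to check the arithmetic outside RapidMiner, the percentage calculation from the first answer (count / number of examples * 100, followed by a filter at 60%) can be sketched in plain Python. This is only an illustration, not part of any RapidMiner process: the word counts come from the question, while the total of 40 examples and the function names add_percentage and filter_above are assumptions made up for this sketch. It also assumes that each count is the number of examples in which the word occurs.

```python
# Word counts taken from the question. Assumption: each count is the number
# of examples (documents) in which the word occurs.
word_counts = {"the": 30, "car": 20, "tree": 15, "street": 5, "state": 1}

# Hypothetical value: the question does not state the size of the original
# example set. In RapidMiner this is what the "examples" macro would hold.
n_examples = 40


def add_percentage(counts, total):
    """Return {word: percentage}, mirroring count / examples * 100."""
    if total <= 0:
        raise ValueError("total number of examples must be positive")
    return {word: count / total * 100 for word, count in counts.items()}


def filter_above(percentages, threshold):
    """Keep the words whose percentage is strictly above the threshold."""
    return [word for word, pct in percentages.items() if pct > threshold]


percentages = add_percentage(word_counts, n_examples)
frequent_words = filter_above(percentages, 60.0)

for word, pct in percentages.items():
    print(f"{word:<8}{pct:6.1f}%")
print("above 60%:", frequent_words)
```

With the assumed total of 40 examples, only "the" (30 of 40, i.e. 75%) passes the 60% threshold; a different total would of course give a different selection.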