Process Document output

Ankit Member Posts: 6 Contributor II
edited November 2019 in Help
Hi,

I am trying to get the top 10 words, based on their occurrence, from the output of Process Document. But when I use Generate Attribute to apply some macros, I can connect only the examples output of Process Document to Generate Attribute, not the word output. Is there any other way to get only the top 10 words along with the count of their occurrences?

Any help will be appreciated.

Regards,
Ankit

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi Ankit,

    if you want to process the word list, you have to use the WordList to Data operator.

    Best regards,
    Marius
  • Ankit Member Posts: 6 Contributor II
    Thanks for the quick reply Marius!

    It worked for me, but I am still looking to limit the result to the top 10, and that too in descending order of their occurrence.

    Regards,
    Ankit
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi Ankit,

    once you have the word list converted to an example set, you can use the standard ETL operators. In your case you could use Sort to sort the words in descending order of their occurrences, and then use Filter Example Range to keep only the first 10 rows, i.e. the top 10 words.

    Hope this helps!

    Best regards,
    Marius
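
For readers who want to see the sort-then-truncate idea from the last reply outside of RapidMiner, here is a minimal Python sketch. It is only an illustration, not a RapidMiner process: the function name `top_words`, the `sample_counts` dictionary, and its numbers are all hypothetical and stand in for the word/count table you get once the word output has been converted to an example set.

```python
def top_words(word_counts, n=10):
    """Return the n most frequent (word, count) pairs, highest count first."""
    # "Sort" step: order by count descending; ties are broken alphabetically
    # so the result is deterministic.
    ranked = sorted(word_counts.items(), key=lambda item: (-item[1], item[0]))
    # "Filter Example Range" step: keep only the first n rows.
    return ranked[:n]


# Hypothetical word counts, invented purely for this example.
sample_counts = {"data": 42, "model": 17, "text": 42, "mining": 8}

top = top_words(sample_counts, n=3)
for word, count in top:
    print(word, count)
```

If fewer than `n` words exist, the slice simply returns all of them, which mirrors what you would want from a range filter on a short table.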