Search a word list in a database

keskinov Member Posts: 2 Contributor I
edited August 2019 in Help
Hi everyone,

I have been trying for some time to find out how to search for a word list in a database with RapidMiner, but I cannot find a solution. I would be extremely thankful if somebody could help me.

I'm working with a database of 25,000 rows, each containing a relatively large amount of text. I have extracted a word list with the top 10% most frequent words in this database. The total count of unique words is around 35,000, and the most frequent words (those occurring more than 100 times) number around 3,500. I need to find out which rows contain each word from the most-frequent list. In other words, I need to create a binary matrix with 25,000 rows and 3,500 columns, where each entry indicates whether a given word appears in a given row. Can I do that with RapidMiner?

Thank you very much in advance!
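
To make the target output concrete, here is a minimal sketch in plain Python (not a RapidMiner process) of the binary matrix described above. The function name `binary_term_matrix`, the sample rows, and the sample word list are all made up for illustration, and the tokenization (lowercasing, splitting on non-word characters) is an assumption that may differ from how the word list was originally extracted.

```python
import re

def binary_term_matrix(rows, word_list):
    """Return one 0/1 vector per row; column j is 1 if word_list[j] occurs in that row."""
    # Map each word to its column position (hypothetical helper structure).
    index = {word.lower(): j for j, word in enumerate(word_list)}
    matrix = []
    for text in rows:
        # Assumed tokenization: lowercase, then split on non-word characters.
        tokens = set(re.findall(r"\w+", text.lower()))
        vector = [0] * len(word_list)
        for token in tokens:
            j = index.get(token)
            if j is not None:
                vector[j] = 1
        matrix.append(vector)
    return matrix

# Toy data standing in for the 25,000 rows and the 3,500-word list.
rows = ["The cat sat on the mat", "A dog chased the cat", "Birds sing"]
word_list = ["cat", "dog", "the"]

matrix = binary_term_matrix(rows, word_list)
for vector in matrix:
    print(vector)
```

Each row of the result corresponds to one database row and each column to one word from the list, so the real data would give a 25,000 x 3,500 matrix of zeros and ones.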

Answers

  • keskinov Member Posts: 2 Contributor I
    Hi everyone,

    I have one more question. Does anybody know of a rule, or an example from an article, that states what percentage of the most frequent words in a database should be analyzed to obtain the most reliable results (e.g. the top 10%)?

    Thank you!