Summarization of textual data

sukh Member Posts: 43 Contributor II
edited November 2018 in Help
Hi,
I would like to know how to summarize text in RapidMiner.
Regards,
Sukh

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,497 RM Data Scientist
    Hi,

    Do you mean things like the number of words?
    If so, have a look at the extraction folder. There are some operators for that.

    Cheers,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • sukh Member Posts: 43 Contributor II
    Respected Sir,
    Actually, what I want to know is this: suppose we have a text like "Last year, it was good." and the date of the post is 06-04-2015. This means the text is about 2014, so we need to subtract one from the year in the date of the post.
    How can we implement this rule?
    Kindly help me with it.

    Thanks and Regards:
    Sukh
  • JEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 578 Unicorn
    Sounds like you need to build a dictionary of terms and how they map, probably using RegEx. 

    First, create a table listing all the terms in your documents that might refer to time, together with the rule you want to apply to each.
    Then, once you have extracted those features (probably using RegEx), apply the matching rule from your table.
    term              meaning
    last year         year-1
    this year         year-0
    [0-9] years ago   year-x
    etc, etc.

    In summary:
    1. List your features & rules.
    2. Extract your features.
    3. Apply your matching rules.
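
    The three steps above can be sketched in plain Python. This is only an illustration of the idea, not a RapidMiner process: the rule table, the function name resolve_years and the reading of 06-04-2015 as day-month-year are all assumptions made for the example.

```python
import re
from datetime import date

# Step 1: list features and rules (hypothetical table, extend as needed).
# Each entry is (regex for the term, function turning a match into a year offset).
RULES = [
    (re.compile(r"\blast year\b", re.IGNORECASE), lambda m: 1),
    (re.compile(r"\bthis year\b", re.IGNORECASE), lambda m: 0),
    (re.compile(r"\b(\d+) years? ago\b", re.IGNORECASE), lambda m: int(m.group(1))),
]


def resolve_years(text, post_date):
    """Return (matched phrase, resolved year) pairs found in text.

    Results are grouped by rule, not by position in the text.
    """
    found = []
    for pattern, offset_of in RULES:
        # Step 2: extract the features with the regex.
        for match in pattern.finditer(text):
            # Step 3: apply the matching rule relative to the post's year.
            found.append((match.group(0), post_date.year - offset_of(match)))
    return found


# Assumption: 06-04-2015 is day-month-year; only the year matters here.
post_date = date(2015, 4, 6)
print(resolve_years("Last year, it was good.", post_date))
```

    For the example from the question, the phrase "Last year" resolves to 2014. New terms are handled by adding rows to the rule table rather than by changing the function.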
  • sukh Member Posts: 43 Contributor II
    Respected Sir,

    Thanks for your reply. You suggested building a dictionary, but how do I use that dictionary in RapidMiner? I have already used a dictionary for stemming with the Stem (Dictionary) operator, but I do not know which operator is appropriate for embedding the rules. Please suggest one.

    Thanks and Regards:
    Sukh