
Textual ETL: Stemming from dictionary

Wanttoknow Member Posts: 6 Contributor II
edited November 2019 in Help
Hi,

First of all I have to say that RM5.0 is a wonderful tool. :o Congratulations.

I started preprocessing text for classification and I am having some problems with the "Stem (Dictionary)" component.

I am referring to a text file for the patterns, but I am not sure about the syntax of the entries/records in the text file. The help is very brief about this.

Right now the first line in my designated TXT file looks like this:

"move: moving moved move"

But it is not replacing any of the terms with their stem.

Any idea?

Answers

  • arminmania Member Posts: 7 Contributor II
    Hi,

    I am not sure, but I think you have to write it as follows:

    move , moving moved move
  • TobiasMalbrecht Moderator, Employee, Member Posts: 295 RM Product Management
    Hi,
    Wanttoknow wrote:

    Right now the first line in my designated TXT file looks like this:

    "move: moving moved move"

    Did you try putting a blank before the colon?

    Kind regards,
    Tobias
  • Wanttoknow Member Posts: 6 Contributor II
    Well, after a lot of trial and error, this seems to work:

    "
    aanleveren:aanlever.*
    aanleveren:aangelever.*
    zorgverzekering:zorgverzeker.*
    "
    But putting multiple patterns on one line, like "aanleveren : aanlever* aangelever*", doesn't work.

  • Wanttoknow Member Posts: 6 Contributor II
    Another question:

    Is it possible to use an external list for the ReplaceToken component? That would be more convenient than entering records with the list editor of the component.
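
For anyone who wants to sanity-check a stem dictionary file outside RapidMiner, the format that ended up working in this thread (one "stem:pattern" pair per line) can be imitated with a few lines of Python. This is only a rough sketch and not how the "Stem (Dictionary)" operator is actually implemented: it assumes the pattern is a regular expression that has to match the whole token, which the thread does not confirm. The function names parse_stem_dictionary and stem_token are made up for this illustration.

```python
import re


def parse_stem_dictionary(text):
    """Parse lines of the form 'stem:pattern' into (stem, compiled regex) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        stem, pattern = line.split(":", 1)
        rules.append((stem.strip(), re.compile(pattern.strip())))
    return rules


def stem_token(token, rules):
    """Return the stem of the first rule whose pattern matches the whole token.

    Assumption: whole-token matching; the real operator may behave differently.
    """
    for stem, pattern in rules:
        if pattern.fullmatch(token):
            return stem
    return token  # no rule matched, keep the token unchanged


# The three dictionary lines reported as working in the thread.
DICTIONARY = """\
aanleveren:aanlever.*
aanleveren:aangelever.*
zorgverzekering:zorgverzeker.*
"""

rules = parse_stem_dictionary(DICTIONARY)
tokens = ["aangeleverd", "zorgverzekeraar", "fiets"]
stemmed = [stem_token(t, rules) for t in tokens]
print(stemmed)  # ['aanleveren', 'zorgverzekering', 'fiets']

# Under the regex assumption, 'aanlever*' is not a wildcard: it means
# 'aanleve' followed by zero or more 'r', so it does not match 'aanleveren'.
print(re.fullmatch("aanlever*", "aanleveren"))  # None
```

If the operator really does interpret patterns as regular expressions, the last line hints at one possible reason why "aanlever*" behaved differently from "aanlever.*", but that is a guess rather than something established above.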