Words containing umlauts in Text Mining

Thesis_12 Member Posts: 1 Contributor I
edited November 2018 in Help
Dear all,

Apparently RapidMiner cannot find certain words containing German umlauts such as ä, ö, ü, or the letter ß. When I search for the word "Änderung" as a regular expression (in "Filter Tokens by Region", condition: "contains match"), it returns no results.
I use version 5.3.005 on a Mac and am working with HTML documents. I know that the problem described above does not occur with an older version on Windows.

However, I need to solve this problem with version 5.3.005 on a Mac.

I tried ".{1,2}nderung", which worked but also returned results like "Minderung", which was not intended.
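
To illustrate why the wildcard workaround over-matches, here is a minimal sketch in Python (used only for illustration; RapidMiner itself uses Java regular expressions). The narrower pattern `strict` is an assumption, not a confirmed fix: it supposes the tokens contain "Ä" either as the single code point U+00C4 or as "A" followed by the combining diaeresis U+0308, which may or may not be what version 5.3.005 on a Mac actually sees.

```python
import re
import unicodedata

# The workaround from the post: any one or two characters before "nderung".
loose = re.compile(r".{1,2}nderung")

# Hypothetical narrower pattern: only "Ä" as one code point (U+00C4)
# or as "A" plus a combining diaeresis (U+0308) may precede "nderung".
strict = re.compile("(?:\u00c4|A\u0308)nderung")

# Build both Unicode forms of the search word.
precomposed = unicodedata.normalize("NFC", "\u00c4nderung")
decomposed = unicodedata.normalize("NFD", "\u00c4nderung")

for token in (precomposed, decomposed, "Minderung"):
    # Print the token, whether the loose pattern matches, whether the strict one does.
    print(ascii(token), bool(loose.search(token)), bool(strict.search(token)))
```

In this sketch the loose pattern accepts "Minderung" because "Mi" satisfies `.{1,2}`, while the strict pattern only accepts the two spellings of "Änderung".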

I would be very glad if somebody knew a solution for this problem.

Thanks a lot

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    How do you retrieve your data?
    Some data retrieval operators require you to configure the correct encoding. If your input data is encoded in UTF-8, for example, you have to set that in the respective operator.

    Best regards,
    Marius
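
The encoding point above can be sketched in a few lines of Python (illustration only, not RapidMiner code). The sketch assumes the HTML files are stored as UTF-8 and that the reader falls back to a platform default such as MacRoman; both are assumptions, since the thread does not state either.

```python
import re

# "Änderung" as it would be stored in a UTF-8 file (assumption about the input).
raw = "\u00c4nderung".encode("utf-8")

# Decoding with the matching encoding restores the word.
right = raw.decode("utf-8")

# Decoding with a different encoding (MacRoman here, as an assumed
# platform default) turns the two bytes of "Ä" into two unrelated characters.
wrong = raw.decode("mac_roman")

pattern = re.compile("\u00c4nderung")

print(bool(pattern.search(right)))  # the correctly decoded text matches
print(bool(pattern.search(wrong)))  # the mis-decoded text does not
print(len(right), len(wrong))       # the mis-decoded text is one character longer
```

If a mismatch like this is the cause, it would also be consistent with the ".{1,2}nderung" workaround matching, because the mis-decoded "Ä" occupies two characters.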