"Excel Text Processing-Frequency"

waynestate13 Member Posts: 3 Contributor I
edited May 2019 in Help
Hello All,

I am new to RapidMiner, and after watching some YouTube tutorials and going through the built-in tutorial, I have some questions about text processing.

I have an Excel file with about 300 comments (all comments on one line). I would like to process the data so that RapidMiner can ultimately come up with association rules. However, I am having trouble processing the data.

So far I am using Read Document (loading the data from a .txt file), followed by Process Documents, with the vector creation set to term occurrences. Inside Process Documents there is a tokenizer that splits on a regular expression (&). In the text file, I have added an & after each comment.

However, I cannot even get the data to load.
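
For reference, the intended pipeline (split on &, then count term occurrences per comment) can be sketched outside RapidMiner in Python. This is only an illustration of the idea, not RapidMiner's implementation: the sample text is made up, and the lowercasing and whitespace-based word splitting are assumptions.

```python
import re
from collections import Counter

# Hypothetical sample: all comments on one line, each followed by "&".
raw = "great service & slow delivery & great price &"

# Split on the "&" delimiter, trim whitespace, and drop empty pieces
# (the trailing "&" would otherwise leave an empty last comment).
comments = [c.strip() for c in re.split(r"&", raw) if c.strip()]

# Term occurrences per comment: lowercase, then split on whitespace.
term_counts = [Counter(c.lower().split()) for c in comments]

for comment, counts in zip(comments, term_counts):
    print(comment, dict(counts))
```

If the comments come out as separate items here, the same delimiter should be a reasonable starting point for the tokenizer expression.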

Thanks in advance for any feedback.

Answers

  • haddock Member Posts: 849 Maven
    Hi Wayne,

    I think the problem is the '&', but without seeing your setup it's difficult to say. In a regex, characters that have a special meaning need to be escaped in order to be matched literally. A lone ampersand is normally literal already, but escaping it (& -> \&) is harmless and rules that out. Just a thought.

    Hope that helps!
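
One way to test the escaping idea outside RapidMiner is to split a sample string with both forms of the pattern and compare the results. The sketch below uses Python's `re` module, which is an assumption for illustration only; RapidMiner runs on Java's regex engine, so treat this as a sanity check for this simple pattern and not as proof of RapidMiner's behaviour. The sample text is made up.

```python
import re

# Hypothetical sample in the format described in the question.
text = "first comment & second comment & third comment &"

# Split with the bare ampersand and with the escaped ampersand.
plain = re.split(r"&", text)
escaped = re.split(r"\&", text)

# Both patterns match a literal "&", so the pieces should be identical.
print(plain == escaped)
print(plain)
```

If both forms give the same pieces, the delimiter itself is probably not what stops the data from loading, and the file path or encoding settings of the reader are worth checking next.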