
"Excel Text Processing-Frequency"

waynestate13 Member Posts: 3 Contributor I
edited May 2019 in Help
Hello All,

I am new to RapidMiner, and after watching some YouTube tutorials and going through the built-in tutorial I have some questions about text processing.

I have an Excel file with about 300 comments (all comments in one line). I would like to process the data so that RapidMiner can ultimately come up with association rules. However, I am having trouble processing the data.

So far I am using Read Document (loading the data from a txt file), followed by Process Document, where the vector is built from term occurrences. Inside Process Document there is a tokenizer that splits on an expression (&). In the text file, I have added an & after each comment.

However, I am not able to even get the data to read.

Thanks in advance for any feedback.

Answers

  • haddock Member Posts: 849 Maven
    Hi Wayne,

    I think the problem is the '&', but without seeing your setup it's difficult to say. In a regex, an ampersand may need to be escaped to be read 'literally' (& -> \&) rather than in its regex role. Just a thought.

    Hope that helps!
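
For anyone who wants to check the delimiter idea outside RapidMiner, here is a minimal Python sketch of the same steps: split the text on a literal '&', then count term occurrences per comment. It does not reproduce the RapidMiner operators. The sample text and the helper names (split_comments, term_occurrences) are made up for illustration. In Python's re module an escaped and an unescaped '&' both match a literal ampersand, and re.escape is used only so the delimiter is always treated literally; other regex engines may behave differently.

```python
import re
from collections import Counter

# Hypothetical sample standing in for the text file from the question:
# every comment is followed by '&'.
raw_text = "Great service & Slow delivery, great price & Great price &"


def split_comments(text, delimiter="&"):
    """Split text on a literal delimiter and drop empty pieces."""
    # re.escape guarantees the delimiter is matched literally.
    pattern = re.escape(delimiter)
    parts = re.split(pattern, text)
    return [part.strip() for part in parts if part.strip()]


def term_occurrences(comment):
    """Count how often each lowercase word appears in one comment."""
    tokens = re.findall(r"[a-z0-9']+", comment.lower())
    return Counter(tokens)


comments = split_comments(raw_text)
vectors = [term_occurrences(comment) for comment in comments]

for comment, vector in zip(comments, vectors):
    print(comment, dict(vector))
```

Each entry in vectors plays the role of one row of a term-occurrence table. If the split returns a single long comment, the delimiter pattern is the first thing to look at.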