"Text Mining"

nikhil Member Posts: 2 Contributor I
edited May 2019 in Help

I have imported an Excel file (one column) containing the comments from a survey, and I need to extract information from it using text processing. I am new to this topic. Can you please suggest how to proceed? The imported data is of type 'Nominal', but tokenization and extraction require a document as input. I used 'Nominal to Text', but the input is still not accepted.
I get this error: "Message: com.rapidminer.example.set.SimpleExampleSet cannot be cast to com.rapidminer.operator.text.Document".
How should I proceed with this issue?
Is there a help file for text processing techniques?

I also tried applying "Nominal to Text" and then "Data to Documents". The result contains zero rows.

Looking forward to your reply.

- Nikhil


  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Nikhil,
    I would suggest that you familiarize yourself with the concepts of RapidMiner first. A good way to do this is to watch the tutorial videos linked at http://rapid-i.com/content/view/189/212/lang,en/ or those that can be found on YouTube. Taking some time to analyze the sample processes in the sample repository will also help you a lot.

    After this you will be familiar with RapidMiner and can move on to new challenges. It will probably help to read the help text of the few text extension operators. This will give you a rough impression of what can be combined.

    After these steps you will know that you have to load the data, transform nominal to text, and use Process Documents from Data, into which you have to insert a proper subprocess.

    A fast track to all this would be to attend one of our text mining webinars, which are available in our shop.
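
For readers who want to see the idea behind the "load the data, transform nominal to text, process documents" sequence outside RapidMiner, here is a minimal Python sketch. It is not RapidMiner's implementation. It only illustrates that a table with one text column has to be turned into one document per row before any tokenization can happen, which is the distinction the cast error in the question points at. The column name `comment`, the sample data, and the helper `column_to_documents` are all assumptions made up for this illustration.

```python
import csv
import io


def column_to_documents(csv_text, column):
    """Return one document (a plain string) per non-empty cell in `column`.

    Hypothetical helper: it mimics the idea of converting a table column
    into separate text documents, nothing more.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    documents = []
    for row in reader:
        value = row.get(column)
        if value is None:
            continue  # column missing in this row
        text = str(value).strip()
        if text:  # skip blank comments so they do not become empty documents
            documents.append(text)
    return documents


# Made-up survey export with a single "comment" column (assumption).
SAMPLE = (
    "comment\n"
    "Great course\n"
    "\"Too fast, but useful\"\n"
    "   \n"
    "More examples please\n"
)

docs = column_to_documents(SAMPLE, "comment")
print(len(docs), "documents")
```

Each string in `docs` plays the role of a single document that later steps (tokenizing, filtering, stemming) can work on.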

  • nikhil Member Posts: 2 Contributor I

    Thank you for the response.

    I watched Markus Hofmann's video on text mining and it has helped me a lot. I find RapidMiner very interesting and powerful.

    I have a course completion survey and I need to analyze the comments from the survey.
    I have an Excel sheet with the comments. I used the 'Nominal to Text' operator followed by the 'Process Documents from Data' operator. Inside 'Process Documents from Data' I used the 'Tokenization', 'Filter' and 'Stemming' operators, as shown in the video.

    Can you suggest some methods I can use for this analysis? How do I proceed after getting the keywords and their counts?

    Thanks and Regards,
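
The tokenize, filter, and stem steps described in the post above, followed by counting the resulting keywords, can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the stopword list is a tiny made-up set, `crude_stem` is a naive suffix stripper (not the Porter stemmer or any stemmer RapidMiner ships), and the two sample comments are invented.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "and", "was", "is", "to", "of", "but", "very", "were"}


def tokenize(text):
    """Lowercase the text and split it into runs of letters."""
    return re.findall(r"[a-z]+", text.lower())


def crude_stem(token):
    """Strip one common suffix; a naive stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def term_counts(documents):
    """Count stemmed, non-stopword tokens across all documents."""
    counts = Counter()
    for doc in documents:
        for token in tokenize(doc):
            if token in STOPWORDS:
                continue
            counts[crude_stem(token)] += 1
    return counts


# Invented survey comments for demonstration only.
comments = [
    "The lectures were engaging and the labs helped",
    "Engaging lecture, but the lab was rushed",
]

counts = term_counts(comments)
for term, n in counts.most_common(3):
    print(term, n)
```

A table of term counts like this is the usual starting point for the question of what to do next, for example ranking the most frequent terms per survey question.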

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Actually, I don't know. This depends on the type of analysis you are going to conduct and on what you want to find out.
    So we would have to go into more detail, but this is not the place for doing so. We try to help people who are stuck on a specific detail, but we can't afford to help everybody for free with every problem. This level of detail is reserved for paying enterprise customers. I hope you understand that we have to sell our services, since you can already use our program for free.
