Clustering in Text Mining

I've been using the text processing package for RapidMiner and am trying to run clustering and association rules on text documents. I followed every step in this Vancouver Data help video (http://vancouverdata.blogspot.com/2010/11/text-analytics-with-rapidminer-part-3.html) and built the exact same process, but I can't get any results. The process runs for around 20-30 minutes (as opposed to a few seconds in the video) and then fails with an out-of-memory error. I'm not dealing with large documents, only two small text files. I allocated 4GB to the program, so memory shouldn't be an issue, yet I keep getting this error message. The same thing happens with every other clustering process I try.
Does anyone have any advice on how to solve the problem?
RM Certified Expert

Re: Clustering in Text Mining

How many documents are you processing? Are you sure the 4GB is actually available to RapidMiner? How did you allocate it, how do you start RapidMiner, and which operating system are you using?
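
One rough way to check how much heap a Java program really gets is to ask the JVM directly. The sketch below is only an illustration: `HeapCheck` is a hypothetical stand-alone class, not part of RapidMiner, and it reports the limit of the JVM it runs in, so it is only meaningful if you launch it with the same Java installation and the same `-Xmx` setting you believe RapidMiner is using.

```java
// Hypothetical helper, not part of RapidMiner: reports the heap limits
// of the JVM this class is executed in.
class HeapCheck {

    // Converts a byte count to whole mebibytes.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Maximum heap the JVM will attempt to use (normally governed by -Xmx).
    static long maxHeapMiB() {
        return toMiB(Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("Max heap (MiB):       " + maxHeapMiB());
        System.out.println("Allocated heap (MiB): " + toMiB(rt.totalMemory()));
        System.out.println("Free in heap (MiB):   " + toMiB(rt.freeMemory()));
        // Standard system properties; useful for spotting a 32-bit JVM.
        System.out.println("OS architecture:      " + System.getProperty("os.arch"));
        System.out.println("Java version:         " + System.getProperty("java.version"));
    }
}
```

For example, compiling it and running `java -Xmx4g HeapCheck` should report a maximum heap close to 4096 MiB. If the reported value is much smaller, or the JVM refuses to start with that setting, the 4GB is probably not reaching the program.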

Best regards,

Marius Helf