Text Data Cleaning, Preprocessing and Text Mining with Rapidminer
I have a large number of English texts (online reviews). I need to do a data cleaning, preprocessing on them and then mine which are the effective high-frequency words? For example, "bathroom", "traffic", etc., are likely to appear as valid high-frequency words. Do you have any specific steps to do so?