Using Hindi Language in Tokenizer
The documents I want to analyze are in Hindi, encoded as UTF-8. To create the word vector I used WVplugin. The problem is that I am not getting all the tokens (I tried all the tokenizers in RapidMiner 4.6). In fact I am getting far too few: 4, to be precise.
I changed the content language and encoding to Hindi and UTF-8, but without any success. Is there any additional setup needed to tokenize the text properly?
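To show what I expect from the tokenizer, here is a minimal Python sketch, run outside RapidMiner. It is only an illustration under assumptions: the function names `tokenize` and `ascii_only_tokenize` and the sample sentence are made up, and I do not know whether the RapidMiner tokenizers behave like the ASCII-only variant. It compares a tokenizer that keeps Unicode letters together with their combining marks (the Devanagari vowel signs) against one that only accepts A-Z letters, which would drop Hindi words entirely.

```python
import re
import unicodedata


def tokenize(text):
    """Split text into runs of Unicode letters (L*), marks (M*) and digits (N*).

    Combining marks are kept because Devanagari vowel signs such as
    U+093E or U+0947 belong to the word they are attached to.
    Note: zero-width joiners (category Cf) are treated as separators here.
    """
    tokens, current = [], []
    for ch in text:
        if unicodedata.category(ch)[0] in ("L", "M", "N"):
            current.append(ch)
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


def ascii_only_tokenize(text):
    """Hypothetical tokenizer that only recognizes ASCII letters."""
    return re.findall(r"[A-Za-z]+", text)


# Hypothetical sample sentence, round-tripped through UTF-8 bytes
# to mimic reading a UTF-8 encoded file.
sample = "मेरा नाम राम है।"
decoded = sample.encode("utf-8").decode("utf-8")

hindi_tokens = tokenize(decoded)
ascii_tokens = ascii_only_tokenize(decoded)

print(hindi_tokens)
print(ascii_tokens)
```

If the file were decoded with the wrong charset, or the tokenizer only treated ASCII letters as word characters, I would expect almost no tokens to come back, which looks similar to what I am seeing.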