Categorization of text comments
Hello RapidMiner Forum,
I am doing a small project at the university, and I am trying to create a machine learning model that predicts the category of a number of text lines.
I have 72 lines of text, and I have manually categorized 16 of them into one of two categories (Travelling or Cricket). (The Excel sheet I used is attached, and a screenshot of it is shown in the picture 'ScreenshoutofExcelData'.)
I am now trying to make the model predict the category of the remaining 56 lines based on my own categorization. If it cannot decide on a category for a text line, it should predict "unknown".
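To make the intended behaviour concrete, here is a minimal sketch in plain Python (not RapidMiner). It is only an illustration under several assumptions: the example sentences, the stop-word list, and the `min_margin` threshold are all made up, and it uses a simple word-count centroid with cosine similarity instead of an SVM. The point is just the workflow: learn from the labeled lines, score each unlabeled line, and fall back to "unknown" when no category clearly wins.

```python
import math
import re
from collections import Counter

# Hypothetical example lines; the real data is in the attached Excel sheet.
LABELED = [
    ("We booked a flight and a hotel for the trip", "Travelling"),
    ("The train journey to the airport was long", "Travelling"),
    ("The batsman scored a century in the match", "Cricket"),
    ("The bowler took three wickets in the final over", "Cricket"),
]

# Tiny, assumed stop-word list so that common words do not dominate the scores.
STOPWORDS = {"the", "a", "and", "in", "to", "for", "was", "we"}


def tokenize(text):
    """Lowercase the text, keep alphabetic words, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(labeled):
    """Build one summed word-count vector (centroid) per category."""
    model = {}
    for text, label in labeled:
        model.setdefault(label, Counter()).update(tokenize(text))
    return model


def predict(model, text, min_margin=0.1):
    """Return the best category, or "unknown" if the best score does not
    beat the runner-up by at least min_margin (an assumed threshold)."""
    vec = Counter(tokenize(text))
    scores = sorted(
        ((cosine(vec, centroid), label) for label, centroid in model.items()),
        reverse=True,
    )
    best_score, best_label = scores[0]
    runner_up = scores[1][0] if len(scores) > 1 else 0.0
    if best_score - runner_up < min_margin:
        return "unknown"
    return best_label


model = train(LABELED)
for line in [
    "The batsman hit the ball over the bowler",
    "Our flight to the hotel was delayed",
    "I like pizza",
]:
    print(line, "->", predict(model, line))
```

In RapidMiner terms, `train` roughly corresponds to the training part on the 16 labeled lines, and `predict` to applying the model to the remaining lines and then turning low-confidence predictions into "unknown".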
I run into a problem with the SVM (Support Vector Machine) operator, which gives me the error "Insufficient capability" when I include more than one of the categories in the Filter Examples operator.
The model is based on a video from RapidMiner Academy named 'Applying a Model to categorize Documents'. Sorry, I am not able to post the link, but a screenshot of the web page is shown in the picture below named 'ScreenshoutofVideoPage'.
Screenshots of the model are shown in pictures: 'Model1_Part1', 'InsideSubProcess', 'InsideProcessDocuments', 'InsideTraining', and 'Model2_Part2'.
I also found an article from a website called 'Monkey Learner', which is attached as a PDF named 'What is Text Classification?'. On pages 3 to 9 it goes through six steps, which is basically what I want to do in RapidMiner. If you have any suggestions for creating such a model, please help.
Thanks for taking your time to read this and maybe even answer me.