Nancy Member Posts: 9 Contributor II
edited November 2018 in Help

I have a text document. Is it possible to classify the words in the document on the basis of their dependencies? I have applied Naive Bayes for classification, but I am getting only the graph and the parameters of the distribution.

Nancy  :)


    RalfKlinkenberg Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, RMResearcher, Member, Unconfirmed, University Professor Posts: 68 RM Founder
    Hi Nancy,

    After you run the NaiveBayes operator, the relative distribution plots and the distribution parameters are displayed. If you would like to apply the Naive Bayes model to classify documents, you have to use the ModelApplier operator. This operator adds a prediction column to your example set, along with one column per class holding the confidence (probability) for that class.
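
    To illustrate the difference between learning a model and applying it, here is a minimal sketch in plain Python. It is not RapidMiner code: the function names train_naive_bayes and apply_model and the tiny spam/ham data set are made up for this example, and Laplace smoothing is an assumption of the sketch. The second function plays the role of the model applier and returns a prediction plus a confidence per class.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Learn class frequencies and per-class word counts from tokenized documents."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in zip(docs, labels):
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab


def apply_model(model, words):
    """Return (predicted label, {label: confidence}) for one tokenized document."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # class prior
        for w in words:
            if w not in vocab:
                continue  # ignore words never seen during training
            # Laplace (add-one) smoothing: an assumption of this sketch
            score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        log_scores[label] = score
    # Normalize the log scores into confidences that sum to 1
    top = max(log_scores.values())
    exp_scores = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(exp_scores.values())
    confidences = {label: v / norm for label, v in exp_scores.items()}
    prediction = max(confidences, key=confidences.get)
    return prediction, confidences


# Hypothetical toy training data
train_docs = [
    "cheap pills buy now".split(),
    "win money now".split(),
    "meeting agenda for monday".split(),
    "project report attached".split(),
]
train_labels = ["spam", "spam", "ham", "ham"]

model = train_naive_bayes(train_docs, train_labels)
prediction, confidences = apply_model(model, "buy cheap money".split())
print(prediction, confidences)
```

    Each word contributes its own factor to the class score independently of the other words, which is the independence assumption of Naive Bayes.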

    Regarding dependencies: Naive Bayes assumes that the attributes (words) are independent of each other and hence does not consider any dependencies. Nevertheless, it is a good text classification method and is used, for example, by many e-mail spam filters to distinguish spam from non-spam messages.

    Other learning techniques can consider dependencies to some extent. Support Vector Machine (SVM) models, for example, take attribute dependencies into account to some extent, and linear SVMs are often very accurate text classifiers. In RapidMiner, you have the choice between several SVM implementations: JMySVM, LibSVM, EvoSVM, and others.

    To evaluate the performance of a modelling technique, you can use cross-validation, i.e. the XValidation operator.
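
    As a rough illustration of what cross-validation does, here is a small sketch in plain Python. It is not the XValidation operator itself: the helper names, the toy data, and the simple word-vote learner used as a stand-in classifier are all assumptions made for this example. The data is split into k folds, each fold is held out once for testing while the model is trained on the rest, and the accuracies are averaged.

```python
from collections import Counter


def k_fold_splits(n_examples, k):
    """Yield (train_indices, test_indices); every example is tested exactly once."""
    folds = [list(range(i, n_examples, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_validate(examples, labels, train_fn, predict_fn, k=3):
    """Return the accuracy averaged over k folds (requires k <= number of examples)."""
    accuracies = []
    for train_idx, test_idx in k_fold_splits(len(examples), k):
        model = train_fn([examples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, examples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k


# Stand-in learner (hypothetical): each word votes for the classes it was seen with
def train_word_votes(docs, labels):
    votes = {}
    for words, label in zip(docs, labels):
        for w in words:
            votes.setdefault(w, Counter())[label] += 1
    default = Counter(labels).most_common(1)[0][0]  # fallback for unseen words
    return votes, default


def predict_word_votes(model, words):
    votes, default = model
    tally = Counter()
    for w in words:
        tally.update(votes.get(w, {}))
    return tally.most_common(1)[0][0] if tally else default


# Hypothetical toy data
docs = [d.split() for d in [
    "cheap pills now", "win money now", "cheap money offer",
    "meeting agenda monday", "project report attached", "agenda for project meeting",
]]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

accuracy = cross_validate(docs, labels, train_word_votes, predict_word_votes, k=3)
print(f"mean accuracy over 3 folds: {accuracy:.2f}")
```

    Any learner can be plugged in through train_fn and predict_fn, so the same loop can be used to compare, say, a Naive Bayes model with an SVM on identical folds.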

    For further information, I recommend the RapidMiner Online Tutorial (see "RapidMiner Tutorial" in the RapidMiner Help menu) and our free introductory RapidMiner webinars.

    Best regards,