## Re: polynomial classification by using SVM

12-29-2016 11:43 AM

8 REPLIES

12-29-2016 11:55 AM

Hi and welcome to our community,

The message means that you are trying to predict a label (or "target") that has more than two categorical values, which RapidMiner calls "polynominal". The SVM operator you are using does not support this type of label. Try the operator "SVM (LibSVM)" instead, which can handle it.

You can check which data types a machine learning model supports by right-clicking the operator and selecting "Operator Info". You will see a table describing the supported data types.

Another useful resource is the following web page: http://mod.rapidminer.com

There you can describe your data, and it will show you the model types that can be used on that data.

Hope that helps,

Ingo

12-29-2016 01:10 PM

12-29-2016 01:44 PM

Hi,

That is correct. Logistic regression can only do binominal classification (i.e. for exactly two classes). However, you can always embed any binominal learner in the ensemble operator "Polynominal by Binominal Classification", which turns the polynominal classification problem into a set of binominal classification problems using either a 1-vs-1 or a 1-vs-all strategy.

Below is a process which shows you how to do that.

Hope this helps,

Ingo

<?xml version="1.0" encoding="UTF-8"?>
<process version="7.3.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.3.001" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="7.3.001" expanded="true" height="68" name="Retrieve Titanic Training" width="90" x="45" y="34">
        <parameter key="repository_entry" value="//Samples/data/Titanic Training"/>
      </operator>
      <operator activated="true" class="polynomial_by_binomial_classification" compatibility="7.3.001" expanded="true" height="82" name="Polynominal by Binominal Classification" width="90" x="179" y="34">
        <process expanded="true">
          <operator activated="true" class="h2o:logistic_regression" compatibility="7.3.000" expanded="true" height="103" name="Logistic Regression" width="90" x="45" y="34"/>
          <connect from_port="training set" to_op="Logistic Regression" to_port="training set"/>
          <connect from_op="Logistic Regression" from_port="model" to_port="model"/>
          <portSpacing port="source_training set" spacing="0"/>
          <portSpacing port="sink_model" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Retrieve Titanic Training" from_port="output" to_op="Polynominal by Binominal Classification" to_port="training set"/>
      <connect from_op="Polynominal by Binominal Classification" from_port="model" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
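
For readers who want to see what the 1-vs-all strategy does conceptually, here is a minimal Python sketch. It is not RapidMiner's implementation: the function names (`train_binary_logreg`, `train_one_vs_all`, `predict_one_vs_all`), the toy data, and the choice of a small gradient-descent logistic regression as the binominal learner are all assumptions made for illustration. The idea is to train one two-class model per label value ("this class" vs. "everything else") and predict the class whose model is most confident.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_binary_logreg(X, y, lr=0.1, epochs=2000):
    """Hypothetical binominal learner: logistic regression fitted by
    batch gradient descent. y must contain only 0 and 1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b


def train_one_vs_all(X, labels):
    """Train one binary model per class: that class (1) vs. the rest (0)."""
    models = {}
    for cls in sorted(set(labels)):
        y = [1 if lab == cls else 0 for lab in labels]
        models[cls] = train_binary_logreg(X, y)
    return models


def predict_one_vs_all(models, x):
    """Return the class whose binary model gives the highest confidence."""
    def confidence(model):
        w, b = model
        return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))
    return max(models, key=lambda cls: confidence(models[cls]))


# Toy data (made up): three well-separated clusters, i.e. a polynominal label.
X = [[0.0, 0.0], [0.5, 0.2], [0.2, 0.5],
     [4.0, 0.0], [4.5, 0.3], [3.8, 0.4],
     [0.0, 4.0], [0.3, 4.5], [0.4, 3.8]]
labels = ["a"] * 3 + ["b"] * 3 + ["c"] * 3

models = train_one_vs_all(X, labels)
print(predict_one_vs_all(models, [4.2, 0.1]))
```

A 1-vs-1 strategy would instead train one model per pair of classes and let the pairwise models vote; that means more models, but each one is trained on less data.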

12-30-2016 09:45 AM

12-30-2016 12:43 PM

Ingo,

A similar question came up with Sample (Bootstrapping). There seems to be no way to define different multipliers for different classes. For example, class 1 has 10 data points and class 2 has 5 data points. I want to duplicate the class 2 data points so that class 2 also has 10, the same as class 1. I cannot do this with Sample (Bootstrapping). I also don't want to downsample using the Sample operator with the ratio parameter, because the dataset is already very small (10 points) and I need to use all of the data. Is there another operator for this, or should I duplicate class 2 manually? Thanks!

01-09-2017 08:33 AM

@Shagu you should consider using the "Generate Weight (Stratification)" operator instead, which generates example weights that balance the classes and does not discard any data. It is roughly equivalent to duplicating under-represented examples, but not as messy. You just have to check that whatever learning algorithm you are using can handle weighted examples. Unfortunately the native RapidMiner logistic regression operator does not, but the very similar logistic regression operator from the Weka extension does. (You can check this by pressing F1 when selecting any learning operator in your process.)
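
To make the "weights instead of duplicates" idea concrete, here is a small Python sketch using the 10-vs-5 example from the question. It only illustrates the principle; the function name `balancing_weights` and the normalization (weights sum to the number of examples) are assumptions, and RapidMiner's operator may scale its weights differently.

```python
from collections import Counter


def balancing_weights(labels):
    """Return one weight per example so that every class has the same
    total weight. Normalization is an assumption: the weights sum to
    the number of examples."""
    counts = Counter(labels)
    per_class_total = len(labels) / len(counts)
    return [per_class_total / counts[lab] for lab in labels]


# The situation from the question: 10 examples of class 1, 5 of class 2.
labels = ["class1"] * 10 + ["class2"] * 5
weights = balancing_weights(labels)

print(weights[0], weights[-1])              # weight of a class1 / class2 example
print(sum(weights[:10]), sum(weights[10:])) # total weight per class
```

Each class 2 example ends up with twice the weight of a class 1 example, which has the same effect on a weight-aware learner as duplicating every class 2 row once, without changing the number of rows.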

Brian T., **Lindon Ventures** - www.lindonventures.com

Analytics Consulting by Certified RapidMiner Analysts

01-10-2017 09:17 AM

Thank you, Telcontar120. I work in an engineering field where data are much more limited than in finance or insurance, because every data point is costly to obtain. I feel that Naive Bayes is the best model here, because it is simple and stable when the number of data points is small. Is this just my intuition, or is there a mathematical theory behind it? Thanks again!

01-10-2017 10:08 AM

I am not really a statistical theoretician, so I can't say for sure. In my experience, which learning algorithm works best depends heavily on the dataset you are working with. Regardless of the specific algorithm chosen, standard model validation approaches such as cross-validation are an important part of ensuring that your final model is robust. Choosing a simpler final model will also generally help it stay robust over time.
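
As a rough illustration of what cross-validation does, here is a minimal Python sketch of k-fold validation. In RapidMiner you would use the built-in validation operators rather than code like this; the helper names (`k_fold_splits`, `cross_validated_accuracy`) and the majority-class baseline learner are hypothetical and only there to keep the example self-contained.

```python
import random
from collections import Counter


def k_fold_splits(n_examples, k, seed=0):
    """Yield (train_indices, test_indices) for k folds over shuffled rows.
    Every example lands in exactly one test fold."""
    if not 2 <= k <= n_examples:
        raise ValueError("k must be between 2 and the number of examples")
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_validated_accuracy(X, y, fit, predict, k=5, seed=0):
    """Average test accuracy over k folds for any fit/predict pair."""
    accuracies = []
    for train_idx, test_idx in k_fold_splits(len(X), k, seed):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)


# Hypothetical baseline learner: always predict the most frequent training label.
def fit_majority(X, y):
    return Counter(y).most_common(1)[0][0]


def predict_majority(model, x):
    return model


# Made-up data with the 10-vs-5 class split discussed earlier in the thread.
X = [[float(i)] for i in range(15)]
y = ["class1"] * 10 + ["class2"] * 5
print(cross_validated_accuracy(X, y, fit_majority, predict_majority, k=5))
```

Comparing a candidate model against a trivial baseline like this one, under the same folds, is a simple way to see whether the model is learning anything beyond the class distribution, which matters most when the dataset is small.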

Brian T., **Lindon Ventures** - www.lindonventures.com

Analytics Consulting by Certified RapidMiner Analysts
