"Regress SVM with numeric and nominal?"

noah977 Member Posts: 32 Maven
edited May 2019 in Help

I want to build a regression SVM that will train on a "score". The problem is that the data has both numeric and nominal attributes.

For example:

Age: numeric
Favorite Color: (red,green,blue)  nominal
Favorite Food: (meat, chicken,fish) nominal
Weight: numeric
Calories per day: numeric
Postal Code: (90026, 90028, etc.)  Looks numeric but really is nominal

The actual data has about 30 features of which about 15 are nominal and 15 are numeric.

Any ideas on how to build the proper data set and model for a regression SVM?



  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Your problem is converting the nominal values into numerical ones. You could apply nominal2numeric directly, but I think it would be better to binarize the nominal attributes first. This means every value of a nominal attribute becomes its own column: favourite color = red, favourite color = green, and so on. A cell contains true if the example has that value and false otherwise.
    You can then translate these columns into numerical values with nominal2numeric and process them with the SVM.
    This method keeps you from introducing ordinal information that simply isn't in the data, as you would by associating colors with numbers (green = 0, red = 1, blue = 2).
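
For anyone who wants to see the binarization idea outside of RapidMiner, here is a minimal sketch in plain Python. The sample rows, the column names, and the `binarize` helper are all made up for illustration; this is not what nominal2numeric does internally, only the same idea of one 0/1 column per nominal value.

```python
# Hypothetical sample rows; attribute names and values are invented for illustration.
rows = [
    {"age": 34, "color": "red", "food": "fish", "postal_code": "90026"},
    {"age": 51, "color": "blue", "food": "meat", "postal_code": "90028"},
    {"age": 27, "color": "green", "food": "fish", "postal_code": "90026"},
]
# Postal code looks numeric but is treated as a label, so it is stored as a string.
NOMINAL = ["color", "food", "postal_code"]
NUMERIC = ["age"]


def binarize(rows, nominal, numeric):
    """Return (column_names, matrix) with one 0/1 column per nominal value."""
    # Collect the distinct values of each nominal attribute, in sorted order.
    levels = {name: sorted({row[name] for row in rows}) for name in nominal}
    columns = list(numeric) + [
        f"{name} = {value}" for name in nominal for value in levels[name]
    ]
    matrix = []
    for row in rows:
        # Numeric attributes are copied through unchanged.
        encoded = [float(row[name]) for name in numeric]
        # Each nominal attribute expands into one indicator per distinct value.
        for name in nominal:
            encoded.extend(
                1.0 if row[name] == value else 0.0 for value in levels[name]
            )
        matrix.append(encoded)
    return columns, matrix


columns, matrix = binarize(rows, NOMINAL, NUMERIC)
for name, value in zip(columns, matrix[0]):
    print(f"{name}: {value}")
```

The resulting matrix is purely numeric, so it can be handed to an SVM regression learner. The numeric columns such as age or weight would usually be scaled first, since SVMs are sensitive to the range of each feature.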
