How to compare the SVM and random forest results?

Joanneyu · March 2020

Hi,
I am trying to use prediction in Auto Model, but I have several questions about the results of SVM and random forest.

1. Why do the results of SVM and RF barely match? For example, attribute 1 has the highest weight in the SVM result, but it is one of the lowest-weighted attributes in the RF result.
2. Why are the weights of several attributes 0 in SVM? Is it possible for attributes to have 0 weight in the model even though I only selected the attributes marked "green" in input selection? This did not happen in RF.
3. Continuing with question 2, I tried including the attributes marked orange and red in input selection. I found that an attribute can have the highest weight when I select all of them (green, orange, and red alike), even though that attribute is marked RED. Why does this happen? In this case, would you suggest including all attributes in case an important one actually contributes a lot to the model?

Thank you very much! I hope I have explained my questions in detail.

[Deleted User] · March 2020

@Joanneyu

Hello

I think it is because of your data. You can build a new process with cross validation to check the data. The results should make this clearer, because Auto Model uses 60% of the data for training and 40% for testing.

I hope this helps
mbs

varunm1 · March 2020

Hello @Joanneyu

1. I wonder why the results of SVM and RF barely match? For example, attribute 1 has the highest weight based on SVM result, but it became one of the attributes having the lowest weight in the RF result.

The reason is that they are different algorithms. If you are looking at the "Weights" section of the Auto Model results, these are calculated by the Explain Predictions operator, which takes both correct and wrong predictions into account. They are global weights that are specific to each model, although the method for computing them is model-agnostic.

An explanation of how these weights are calculated is given below.

"The operator is deriving those weights directly from the explanations. If the true labels are known for the test data, all supporting local explanations add positively to the weights for correct predictions. All contradicting local explanations add positively to the weights for wrong predictions. If the true labels are not known, the global weights only use the supporting local weights instead."

2. Why the weights of several attributes are 0 in SVM? Is it possible that the attributes have 0 weight to the model even though I only selected those attributes marked as "green" in input selection? But this did not happen in RF.

So, the light bulbs (green, red and yellow) are based on a type of feature selection called filter methods. These methods use statistical tests such as correlation to rate features on the whole training dataset. As explained earlier, the weights are calculated by the Explain Predictions operator; they are not the regular global feature importances that come from an algorithm or from filter methods.
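To make the filter idea above concrete, here is a minimal Python sketch that scores attributes by their absolute Pearson correlation with the label. It is not RapidMiner's code: the column names and values are made up, and Auto Model's actual rating may combine further criteria. It also shows a blind spot of such a filter: an attribute that fully determines the label in a non-linear way can still score zero.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def filter_scores(columns, label):
    """Score each attribute by |correlation| with the label (a filter method)."""
    return {name: abs(pearson(values, label)) for name, values in columns.items()}

# Hypothetical data: the label is exactly x squared.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
label = [v * v for v in x]
columns = {
    "doubled_label": [2.0 * v for v in label],  # linearly related to the label
    "x": x,                                     # determines the label, but not linearly
    "constant": [3.0] * 5,                      # carries no information
}

scores = filter_scores(columns, label)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

In this toy data a correlation filter would rate `x` as poorly as the constant column, even though a non-linear model could use `x` to predict the label perfectly. That is one way a low filter rating and a high model-derived weight can coexist.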

3. Continuing with question 2, I tried to play with the data marked as orange and red in input selection. I discovered that an attribute could have the highest weight in the case that I selected all of them (regardless of green, orange, and red). However, that attribute is actually marked as RED. Why does this happen? In this case, would you suggest me to include all attributes in case there is any important attribute that actually contributes to the model a lot?

Yes, you can include all the attributes. When adding these attributes, judge the trade-off by performance rather than by weights. If performance improves when you add all variables, you can use them.
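The performance check suggested above can be sketched outside Auto Model in a few lines of Python. Everything here is an assumption made for illustration: the data is synthetic, and a 1-nearest-neighbour regressor with leave-one-out validation stands in for whatever model and validation scheme you actually use. The point is only the comparison of validated error with and without the extra attribute.

```python
import math

def loo_rmse(rows, targets):
    """Leave-one-out RMSE of a 1-nearest-neighbour regressor."""
    squared_errors = []
    for i, row in enumerate(rows):
        best_dist, best_target = math.inf, None
        for j, other in enumerate(rows):
            if j == i:
                continue  # the held-out example may not predict itself
            dist = sum((a - b) ** 2 for a, b in zip(row, other))
            if dist < best_dist:
                best_dist, best_target = dist, targets[j]
        squared_errors.append((targets[i] - best_target) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Synthetic data: the target depends on `signal` only.
n = 20
signal = [i / n for i in range(n)]                   # values in [0, 1)
distractor = [5.0 * ((7 * i) % n) for i in range(n)]  # unrelated, values in 0..95
target = [2.0 * s for s in signal]

rmse_signal_only = loo_rmse([[s] for s in signal], target)
rmse_with_distractor = loo_rmse(
    [[s, d] for s, d in zip(signal, distractor)], target
)

print(f"signal only:     {rmse_signal_only:.3f}")
print(f"with distractor: {rmse_with_distractor:.3f}")
```

Here the extra attribute makes the validated error worse, so it would be left out. If the error had dropped instead, keeping the attribute would be justified regardless of its colour or weight.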

Joanneyu · March 2020

@varunm1
Thanks so much for the info! Good to know that it's normal for the SVM and RF results not to be similar. I have 23 different clusters, and I tested 5 of them.

Sometimes SVM performed better, and sometimes RF. This is understandable; however, I wonder whether in this case I should just stick to one algorithm. For example, if SVM performs better in cluster 1 whereas RF performs better in cluster 2, I am not sure I can compare the results for these two clusters, since they use different algorithms.

varunm1 · March 2020

Hello @Joanneyu

By 23 clusters, do you mean 23 classes in the prediction?

if SVM performs better in cluster 1 whereas RF performs better in cluster 2, I am not sure if I can compare the result for these two clusters since they are using different algorithm.

Can you please tell me what you mean by cluster here? Are these different sets of data?

Joanneyu · March 2020

@varunm1
Sorry for the confusion. I have 23 sets of data, so I will run them separately in Auto Model. But I want to compare the results across the different data sets.

varunm1 · March 2020

Hello @Joanneyu

Are all these 23 sets different? If so, you cannot compare them. If they all come from the same probability distribution, you can use a single model chosen based on performance.

Joanneyu · March 2020

@varunm1 Hi, I have the same attributes in all the data sets. E.g. data set 1 features mountain scenes, data set 2 features seascapes, etc., but the variables are the same. So, for example, attribute 1 might contribute more to the model in data set 1, but the same attribute might not really have an effect on the model in data set 2.

varunm1 · March 2020

@Joanneyu Yep, it can happen. Even though you have the same attributes in all datasets, their data distributions might be different, which might be one reason for the difference in attribute weights. As I don't know much about your data, I cannot say whether it is a good idea to compare models between different datasets. Generally, the comparison happens on the same data or on similar data (based on distribution).

Joanneyu · March 2020

Hello again @varunm1,
Back to my previous questions as I am still a bit confused...
1. You mentioned "the light bulbs (green, red and yellow) are based on a feature selection type called filter methods". Do you know which specific metric Auto Model uses (e.g. information gain ratio, Gini coefficient, etc.)?

2. I have 14 attributes in total. 12 of them range from 0 to 1, and two of them range from 1 to 100 (but they are mostly between 40 and 60). If I select only the "green" ones among those 12 variables, the weights look quite "normal" in the results (e.g. the weight of the 1st attribute is 0.236, the 2nd is 0.212, followed by 0.138, 0.124, etc.). However, whenever I include the latter two attributes (even just one of them), the weights in the results become very small (e.g. the weight of the 1st attribute is 0.001, and the rest are all 0.000). I checked whether there is a multicollinearity problem with the two attributes, but there doesn't appear to be one. The only thing I noticed is that the former 12 attributes all have medium-high stability, while the latter 2 attributes have relatively low stability. Is this a potential problem that could "ruin" the data?

Image: https://us.v-cdn.net/6030995/uploads/editor/vn/kswc4od0v1xi.png

I really appreciate your help! Thank you so much.

varunm1 · March 2020

Hello @Joanneyu

For question 1, the colored bulbs are based on an attribute's correlation, missing values, ID-ness, and stability. Once you get the window with the light bulbs, click the "i" symbol located near the top right. You will get detailed information about the criteria used for assigning the colors.

For question 2: generally, if your attributes have different scales, it is recommended to normalize the data, because differing scales affect models. I need to check whether Auto Model does that automatically.
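As a rough illustration of the normalization mentioned above, the following Python sketch rescales an attribute to the range 0 to 1 with min-max scaling. The attribute name and values are hypothetical, and min-max scaling is only one common choice (a z-transformation is another); which method RapidMiner applies depends on how the Normalize step is configured.

```python
def min_max_scale(values):
    """Rescale values linearly so the smallest becomes 0.0 and the largest 1.0."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant attribute has no spread; map it to 0.0 (a choice of this sketch).
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]

# Hypothetical attribute on a 1 to 100 scale, mostly between 40 and 60.
wide_attribute = [40.0, 45.0, 50.0, 55.0, 60.0]
scaled = min_max_scale(wide_attribute)
print(scaled)
```

After scaling, this attribute covers the same 0 to 1 range as the other twelve, so distance-based or margin-based models no longer see it as dominating purely because of its units.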

Now, your other question, about the weights changing when different attributes are added to the green ones: yes, they will change. I forget the exact term for this, but your model doesn't behave the same way when new attributes are present. This doesn't have to be a multicollinearity problem. Your model can degrade or improve in the presence of different attributes. This is one reason we use model-based feature selection. Auto Model has that option, where model-based feature selection is done by the Automatic Feature Engineering operator. I believe the option is at the bottom right or the top of the Auto Model window where you select the different models.

Joanneyu · March 2020

@varunm1 thank you very much. (I hope this is my last question.)

I have normalized the data, but the problem still seems to exist.
Could you kindly explain again? Even with filter methods, the green bulbs suggest that those attributes can at least influence the model, don't they? So I would expect the weights to be at least 0.000-something.

However, in the weights section of the results, why do only the first few attributes have weights in SVM? The rest of the attributes have exactly zero weight, which would mean they do not contribute to the model at all. Does that make sense?

And why does this happen only with SVM? In random forest, even the lowest-ranked attribute contributes a little (e.g. the weight is 0.000...).

varunm1 · March 2020

Hello @Joanneyu

As mentioned earlier, the global model weights and the local weights are different. So, when you say that the weights are zero in SVM, that does not necessarily mean these attributes are useless. I recommend you look at the SVM coefficients to see the importance of attributes at the global model level. I also want you to check the performance metrics to see whether performance increases or decreases when you add these attributes. I want to reiterate that the weights you are looking at are local weights. These local weights are calculated with a correlation-based variant of the local interpretable model-agnostic explanations method (LIME). In this process, we first generate new random samples around each example in your data and find correlation weights for each example in your data set. Finally, there is an algorithm that adds up these weights based on supporting and contradicting attributes to give you the final output.
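The local step described above (sample around an example, then correlate each attribute with the model's predictions) can be sketched roughly as follows. This is an illustration of the general idea only, not RapidMiner's implementation: the function names, the Gaussian sampling, the `spread` parameter and the toy model are all assumptions.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def local_weights(model, example, n_samples=200, spread=0.1, seed=0):
    """Correlate each attribute with the model output on samples near `example`."""
    rng = random.Random(seed)
    samples = [
        [value + rng.gauss(0.0, spread) for value in example]
        for _ in range(n_samples)
    ]
    predictions = [model(sample) for sample in samples]
    return [
        pearson([sample[j] for sample in samples], predictions)
        for j in range(len(example))
    ]

def toy_model(x):
    """Hypothetical model that only ever uses the first attribute."""
    return 3.0 * x[0]

weights = local_weights(toy_model, [0.5, 0.5])
flat_weights = local_weights(lambda x: 1.0, [0.5, 0.5])
print(weights, flat_weights)
```

In this sketch the first attribute gets a local weight close to 1 and the second a weight close to 0. If the model's prediction does not change at all in the sampled neighbourhood, every local weight comes out as exactly 0, whatever the attributes' filter ratings are.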

Why is it happening with SVM? These weights depend on the model and on its predictions. So the weights change with the model used, because the predictions change.
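The aggregation rule quoted earlier in this thread (supporting explanations count for correct predictions, contradicting ones for wrong predictions, supporting ones only when labels are unknown) shows why the weights depend on the predictions. Below is a minimal Python sketch of that rule. The sign convention (positive local weight means the attribute supports the prediction, negative means it contradicts it), the function name and the numbers are assumptions for illustration, and the local weights are held fixed across both hypothetical models to isolate the effect of the predictions.

```python
def global_weights(local_weights, predictions, labels=None):
    """Sum local explanation weights into one global weight per attribute."""
    totals = [0.0] * len(local_weights[0])
    for i, row in enumerate(local_weights):
        # Without true labels, every prediction is treated as not wrong.
        wrong = labels is not None and predictions[i] != labels[i]
        for j, weight in enumerate(row):
            if wrong and weight < 0:
                totals[j] += -weight  # contradicting explanation, wrong prediction
            elif not wrong and weight > 0:
                totals[j] += weight   # supporting explanation
    return totals

# Hypothetical local weights for three examples and three attributes.
local = [
    [0.5, -0.25, 0.0],
    [0.25, 0.125, 0.0],
    [0.5, -0.75, 0.0],
]
labels = ["high", "low", "low"]
predictions_a = ["high", "low", "high"]  # third prediction is wrong
predictions_b = ["high", "low", "low"]   # all predictions are correct

weights_a = global_weights(local, predictions_a, labels)
weights_b = global_weights(local, predictions_b, labels)
weights_unlabeled = global_weights(local, predictions_a)
print(weights_a, weights_b, weights_unlabeled)
```

With identical local explanations, the second attribute ranks first under the predictions of model A and second under those of model B. The third attribute, whose local weights are all zero, ends up with a global weight of exactly zero in every case.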

If you could share your data, I can run it and explain further. You could also try reading about LIME model explanations. The one used in RapidMiner is a variation of it, and currently you won't find it described in any research article. We will come up with one.

Joanneyu · March 2020

thank you very much! Very helpful! @varunm1
I have attached my data. I want to predict the engagement rate (EN_total) based on the twelve color + saturation and lightness. More explanations would help a lot!! Thank you so much.
