"How are L pos and L neg set in an SVM?"
Noob here. I am working on a basic classification problem (responders vs non-responders), where the cost of missing a responder is much higher than the cost of including a non-responder. Specifically, the average sale is $115 and the average cost to mail somebody is $0.68. We want to bias in favor of maximizing sales over minimizing cost to mail.
I can't find any documentation anywhere that shows how the "L pos" and "L neg" settings in an SVM are actually set. Are they numbers between 0 and 1? Are they relative weights for mis-classifying a positive "L pos" and a negative "L neg"? In my case, I would interpret that as setting "L pos" much higher than "L neg", since I want to ensure that we avoid as many false negatives as possible.
Can somebody help me understand how this works?
Thanks in advance...
Eric
Answers
Additionally, I'm still exploring classification methods for what I'm working on. Is it possible to set similar loss criteria in other classification methods? (False positives aren't too big of a deal, but false negatives are very costly--also, the test set includes very few responders compared to the number of non-responders, so a lot of methods that just look to improve accuracy without weighing false positives against false negatives classify almost everyone as a non-responder.)
I don't know what it is, but I got eager to find out.
Apparently it is used in /src/main/java/com/rapidminer/operator/learner/functions/kernel/jmysvm/svm/SVM.java
Looking there, I found that the balance cost option sets the same values. So I did a quick test (process XML below): I classified the Golf data set with and without balance cost, and additionally added example weights to balance the classes that way.
[Screenshots of the resulting weights: without balance cost and without class balancing; with balance cost but without class balancing; and finally with scaling but without balancing]
So apparently the balancing here does the same thing as balancing via weights. Thus I expect L pos and L neg to be class weights.
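To make the "class weights" reading concrete, here is a minimal Python sketch of a class-weighted hinge loss. It assumes that L pos and L neg act as per-class multipliers on the penalty for margin violations; that is only the interpretation suggested by the test above, not something confirmed from the RapidMiner source. The function name, the parameter names l_pos and l_neg, and the toy decision values are all made up for illustration.

```python
def weighted_hinge_loss(scores, labels, l_pos=1.0, l_neg=1.0):
    """Sum of hinge losses, each scaled by the weight of the example's class.

    scores: raw decision values f(x); labels: +1 (positive) or -1 (negative).
    l_pos / l_neg: hypothetical per-class multipliers (an assumption about
    what "L pos" / "L neg" do, not taken from RapidMiner's code).
    """
    total = 0.0
    for score, label in zip(scores, labels):
        # Slack is zero when the example is on the correct side of the margin.
        slack = max(0.0, 1.0 - label * score)
        weight = l_pos if label == 1 else l_neg
        total += weight * slack
    return total


# Toy decision values, not taken from the Golf data set.
scores = [-0.5, 0.2, -1.5, 0.5]
labels = [1, 1, -1, -1]

equal = weighted_hinge_loss(scores, labels)
favour_pos = weighted_hinge_loss(scores, labels, l_pos=10.0, l_neg=1.0)
print(equal, favour_pos)
```

With a larger l_pos, the two badly placed positives dominate the loss, so an optimiser minimising it would move the boundary to get positives right first, at the price of more false positives. Under this assumption only the ratio of the two values matters for the trade-off, not whether they lie between 0 and 1.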
Cheers,
Martin
Dortmund, Germany
Here's an example of how it works. Please note, I chose Naive Bayes because it demonstrates the extremes well: it chooses to send to the entire dataset to 'maximise profit' (i.e. have the least cost).
You can choose to use either the Meta Cost operator or the Performance (Costs) operator separately.
Meta Cost builds several models so that the final predictions favour the lower cost.
Performance (Costs) tells you the expected cost of the model. (I tend to combine this one with an Optimise operator and then build my model with the parameters that decrease the cost the most.)
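As a rough illustration of why a cost-driven model can end up mailing everybody, here is a small Python sketch using the $115 average sale and $0.68 mailing cost from the question. The function names and the confusion-matrix counts (1,000 customers, 50 of them responders) are invented for illustration and are not results from any RapidMiner process.

```python
SALE = 115.00      # average sale, from the question
MAIL_COST = 0.68   # average cost to mail one person, from the question


def break_even_probability(sale=SALE, mail_cost=MAIL_COST):
    """Response probability above which mailing pays off on average.

    Mailing is worthwhile when p * sale > mail_cost.
    """
    return mail_cost / sale


def campaign_profit(tp, fp, sale=SALE, mail_cost=MAIL_COST):
    """Profit from confusion-matrix counts.

    Mailed responders (tp) earn sale - mail_cost, mailed non-responders (fp)
    cost mail_cost, and anyone not mailed contributes nothing.
    """
    return tp * (sale - mail_cost) - fp * mail_cost


# Hypothetical counts: 1,000 customers, 50 of them responders.
mail_everyone = campaign_profit(tp=50, fp=950)
accuracy_tuned = campaign_profit(tp=20, fp=30)   # misses 30 responders
mail_nobody = campaign_profit(tp=0, fp=0)

print(round(break_even_probability(), 4))
print(mail_everyone, accuracy_tuned, mail_nobody)
```

The break-even probability comes out at roughly 0.6%, so with these costs it pays to mail anyone who is even slightly likely to respond. That is why a model tuned for accuracy can look good and still lose to "mail everyone" once costs are taken into account.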
~Martin
Dortmund, Germany