# "Calculate performance only on TRUE cases??"

Hello,

First off, I want to say thank you for this great software. I LOVE RapidMiner!!!

On to my question...

We are looking at creating an SVM for detecting positive indications of a medical condition.

We have training data that is labeled "true" and "false" along with all the features. (True examples are those where the person has the medical condition. They represent about 20% of the training data.)

When attempting a grid parameter function or a feature selection function we are seeing a problem with finding an ideal result.

WE DON'T CARE ABOUT THE NEGATIVE OR "FALSE" CASES. We only care about the accuracy of the "true" cases.

The problem is that the accuracy performance measure is the average of accuracy for BOTH cases (true and false.) For example, if we just predict everything as false, since 80% of our examples are false, then we automatically have 40% accuracy, but ZERO correct predictions for the class we care about.

*** I guess what we ultimately want to do is train a SINGLE CLASS SVM that is focused on predicting the true class as accurately as possible. ****

So we don't need a performance score based on the aggregate accuracy of the model, but ONLY ON THE ACCURACY OF THE "TRUE" PREDICTIONS.

One thought was to use class weighting in either the SVM or classification performance steps, but how much? and which to use?

Another thought was to use some creative application of the meta-cost function, but how would we incorporate that with the libsvm function??

Is this possible in RM?

Any and all ideas would be appreciated.

## Answers

82Guru

What you need is the CostEvaluator operator in the category Validation --> Performance. This operator lets you specify (mis)classification costs for every possible combination of true class and predicted class.

Greetings,

Michael

26Maven

The CostEvaluator operator is very useful for measuring the overall success of the model, but it doesn't help train the SVM to focus on finding more positive cases.

Ideally, it might be good to use the grid parameter operator to test multiple weights of the true class in the SVM training to find the optimal setting. However, it appears that I can't control that from the grid parameter operator.

I also have no idea what costs to assign in the cost evaluator. Should they be in the range of -1 to 1, or perhaps -100 to 100?

Thanks!!!!
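
On the question of what range the costs should be in: as far as the arithmetic goes, only the relative sizes of the costs matter when candidate models are ranked by average cost. The sketch below is plain Python, not RapidMiner's CostEvaluator. The cost values, the two confusion tables, and the name `average_cost` are all made up for illustration.

```python
def average_cost(confusion, costs):
    """Mean cost per example; both dicts are keyed by (actual, predicted)."""
    total = sum(confusion.values())
    return sum(count * costs[key] for key, count in confusion.items()) / total

# Hypothetical costs: a false alarm (actual false, predicted true)
# is treated as five times worse than a missed case.
costs = {("true", "true"): 0, ("true", "false"): 1,
         ("false", "true"): 5, ("false", "false"): 0}

# Two hypothetical models on 1,000 examples, 200 of them actually true.
model_a = {("true", "true"): 150, ("true", "false"): 50,
           ("false", "true"): 100, ("false", "false"): 700}
model_b = {("true", "true"): 60, ("true", "false"): 140,
           ("false", "true"): 5, ("false", "false"): 795}

cost_a = average_cost(model_a, costs)
cost_b = average_cost(model_b, costs)

# Multiplying every cost by 100 changes the numbers but not which model wins.
scaled = {key: 100 * value for key, value in costs.items()}
same_winner = (cost_a < cost_b) == (
    average_cost(model_a, scaled) < average_cost(model_b, scaled)
)
print(cost_a, cost_b, same_winner)
```

Under these assumed costs the cautious model B comes out cheaper, and rescaling the costs leaves that ranking unchanged, so choosing between -1 to 1 and -100 to 100 matters less than choosing the ratio between the two kinds of error.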

82Guru

26Maven

That's a GREAT trick!!! That will be useful for many projects.

I still see a problem:

80% of our training examples are false and 20% are true. We are training an SVM to accurately find true cases.

The problem is that ALL of the learning classifiers look at "accuracy" or "precision" as a measure of their strength. Especially when using some kind of grid. The accuracy is the measure of which features or parameter combination is the "best".

Our problem comes from the unbalanced nature of our data. If RM just predicts EVERY case as false, then we have an automatic accuracy of 40%. (80% false, 0% true averages out to 40% accurate.) Now if the SVM attempts to predict some values as true, it may or may not have success, but initially it is less than 40%. SO, THE MODEL THAT PREDICTS EVERY EXAMPLE AS FALSE GENERALLY WINS. Clearly this is not what we want!!

One situation that just came up was that the SVM predicted 10 examples as true and 9,990 as false (out of 10,000 training examples.) It was 100% accurate for the true examples and about 80% accurate for the false examples. This averaged to an accuracy of 90%. Again, clearly not what we want even though the performance measure was very high.

Ideally, what I want is a way to ask RM to train and ONLY evaluate performance based on accuracy of the true examples. I don't care if it has a 99% failure in predicting false cases. I just want the best percentage I can in predicting true cases.

(I've read some papers where the researchers wrote their own SVM in C++ and built it to focus on correct true prediction. Almost like a one class SVM, but with some negative examples.)

Is there some clever way to do this in RM??
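
To make the numbers in the 10-out-of-10,000 situation concrete, here is a rough sketch in plain Python (not RapidMiner code). It assumes that exactly 20% of the 10,000 examples, i.e. 2,000, are actually true and that all 10 "true" predictions were correct, as described above. The function name `confusion_metrics` is only illustrative.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Basic measures from the four cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    return {
        # share of all examples that were predicted correctly
        "accuracy": (tp + tn) / total,
        # of the examples predicted true, how many really are true
        "precision_true": tp / (tp + fp) if tp + fp else 0.0,
        # of the examples that really are true, how many were found
        "recall_true": tp / (tp + fn) if tp + fn else 0.0,
        # of the examples predicted false, how many really are false
        "precision_false": tn / (tn + fn) if tn + fn else 0.0,
    }

# Assumed counts: 2,000 actual true cases, 10 predicted true (all correct),
# so 1,990 true cases end up among the 9,990 "false" predictions.
m = confusion_metrics(tp=10, fp=0, fn=1990, tn=8000)
average_class_precision = (m["precision_true"] + m["precision_false"]) / 2

for name, value in m.items():
    print(f"{name}: {value:.4f}")
print(f"average_class_precision: {average_class_precision:.4f}")
```

Under these assumptions the 90% figure is the average of the two class precisions (100% and about 80%), while only 0.5% of the actually true cases are found. That last number, the recall of the true class, is the one that exposes the problem.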

347Maven

Why don't you tell GridParameterOptimization to use another performance measure? Since you have a binary classification problem, you could use the operator BinominalClassificationPerformance, especially AUC and Lift. These measures focus on the quality of prediction of the positive class.

Regards,

Steffen

26Maven

Now my classes are labelled "sick" and "not_sick".

Do I need to change that with some operator to make it a true "binary" problem?

How do I tell RM which is the positive class?

Thanks!!!!

439Maven

Cheers,

Simon

26Maven

I can't seem to find this operator. Where is it?

Thank You

439Maven

Cheers,

Simon

26Maven

I just built a setup with grid parameter for both C and the weight of the positive class. It will probably take several hours to run, but I'm very curious to see what the results will be!!!

Thanks again!!

82Guru

If this is what you really want (just predict the true cases and ignore the false cases), then I would just predict everything as true! 100% accuracy for the true class, 100% error for the false class.

I'm nearly sure you have to think of a secondary condition for your model, e.g. not predicting more than 50% positive in total.

Best regards,

Michael

26Maven

Your suggestion wouldn't work. Only about 20% of my training examples are true. So if I predict everything as true, then my accuracy for that class is only 20%. What I'm looking for is accuracy of 100% (Never possible in the real world, but the goal is to see how close I can get.)

My point was that I only care about the accuracy of predicting true cases.

439Maven

http://en.wikipedia.org/wiki/Accuracy#Accuracy_in_binary_classification

http://en.wikipedia.org/wiki/Precision_(information_retrieval)

http://en.wikipedia.org/wiki/Recall_(information_retrieval)

Cheers,

Simon

26Maven

Our goal is to predict the "true" class as well as possible.

So the measure I need is accuracy, but for only the true class.

Modifying the formula from wikipedia:

number_of_predicted_true_that_are_correct / total_number_predicted_true

In other words, "Of all the cases predicted true by RM, what percentage of them are actually true?"

For example, if RM predicts that 100 cases are true, but only 42 of them are actually true (correct predictions), then I would say that we have a 42% accuracy of predicting a sick person with this model.

I hope that I didn't over explain this...
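
The ratio described in this post is what is usually called the precision of the positive class. A minimal sketch in plain Python, using the 100-predicted, 42-correct example from the post (the function and parameter names are only illustrative, and returning 0.0 when nothing is predicted true is a convention chosen here, not something the thread specifies):

```python
def precision(correct_true_predictions, total_true_predictions):
    """Share of the 'true' predictions that are actually true."""
    if total_true_predictions == 0:
        # Assumed convention: no 'true' predictions at all counts as 0.0.
        return 0.0
    return correct_true_predictions / total_true_predictions

# 100 cases predicted true, 42 of them actually true.
example = precision(42, 100)
print(f"{example:.0%}")
```

Note that the false examples only enter this measure through the wrong "true" predictions; correctly predicted false cases do not affect it at all.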

439Maven

Cheers,

Simon

26Maven

Alternatively, would the AUC be a better measure when comparing different models??

(For example, with a grid search for the best value of C in an SVM?)

Remember, even though this is technically a two class problem, I'm really only interested in the best performance for predicting the true class. IT WOULD BE SAFER FOR US TO WRONGLY PREDICT SOME TRUE EXAMPLES AS FALSE. WHAT IS DANGEROUS IS WRONGLY PREDICTING SOME FALSE EXAMPLES AS TRUE. (Giving treatment to someone who isn't sick poses a huge risk.)

Thanks!!

439Maven

Since in all your postings you have been describing precision as the desired measure, I wonder why you would now switch to AUC. However, this is your choice and purely depends on the domain and your goals. Keep in mind that for evaluating your results it is easily possible to compute both measures.

Best,

Simon

26Maven

I'm very sorry. I didn't realize that I was shouting.... (I was using the caps to emphasize a key point, I did not mean to offend you.)

I learned something today. I always thought precision looked at both true and false classes. I had no idea it was only for the true class. That helps a lot.

I'm not sure what a better measure of performance would be:

1) AUC

2) Precision with a threshold finder beforehand to optimize results

Additionally, one thing that worries me is the number of correct results. For example: I may have 1000 training examples with 200 that are true. If the model only predicts 5 as true, but correctly, then the precision would be 100%. Unfortunately, a model that only found 5 out of 200 wouldn't be very good for practical uses.

So, I need some combination of precision combined with "volume" or something like that. Is there such a thing?

Thanks again for all the help, we really appreciate it over here!!!

439Maven

However, if you are unhappy with all of them, you can still construct your own, e.g. by using an AttributeConstruction in combination with an Aggregation. Alternatively, go with the BinominalClassificationPerformance, log the values of false positive etc. using a ProcessLog, turn it into an ExampleSet and use an AttributeConstruction on this one, and turn it back into a PerformanceVector using Data2Performance.

Hope this helps,

Simon

26Maven

You are correct: The f-measure is exactly what I am thinking of.

Interesting (at least to me) is that over repeated trials with different parameter settings, the f-measure and the AUC seem to always correlate perfectly. I guess this makes sense since they are both, in effect, measuring the performance of the model.

Thank You!!!!

-B
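
For reference, here is a rough sketch of how the f-measure combines precision with the "volume" concern raised earlier in the thread. It is plain Python rather than RapidMiner code, and it assumes the balanced variant (F1, the harmonic mean of precision and recall), which the thread does not state explicitly. The first model is the 5-of-200 example from above; the second model's counts are hypothetical and chosen only for comparison.

```python
def f_measure(tp, fp, fn):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# 1,000 examples, 200 actually true; the model predicts only 5 as true,
# all of them correctly: precision 100%, recall 2.5%.
cautious = f_measure(tp=5, fp=0, fn=195)

# Hypothetical comparison: 150 predicted true, 100 of them correct:
# precision about 67%, recall 50%.
broader = f_measure(tp=100, fp=50, fn=100)

print(f"cautious: {cautious:.3f}, broader: {broader:.3f}")
```

The model with perfect precision but almost no coverage scores far lower than the one that finds half of the true cases, which is the behaviour asked for in the "precision combined with volume" question.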