# GA-driven attribute selection according to positive predictive value

RapidMiner Certified Expert, Member Posts: 74 Guru
In my case I am interested only in the positive predictive value (PPV).

The problem is that when I select attributes, the GA picks only the case with a single correctly classified example, and thus PPV = 100 %. This is of course of very little reliability.

Could anyone tell me which performance evaluator would fit my needs?
Thank you in advance for any help.

## Answers

RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
Hi,
sorry, but I'm a bit confused. What exactly are you going to do? Which operator do you use?

Greetings,
Sebastian
RapidMiner Certified Expert, Member Posts: 74 Guru
I am sorry. I will try to be clearer now.

I have a binominal classification problem and what I am interested in is maximizing the positive predictive value (PPV). So let's say I got these confusion matrices:

| | true negative | true positive |
|---|---|---|
| pred. negative | 1068 | 328 |
| pred. positive | 57 | 77 |

accuracy: 74.84%
PPV: 57.46 %
This is quite good, as the PPV is 57.46 %.

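Let's have a look at this example:

| | true negative | true positive |
|---|---|---|
| pred. negative | 1135 | 394 |
| pred. positive | 0 | 1 |

accuracy: 74.25%
PPV: 100.0 %

Here the PPV is 100 %, i.e. the perfect solution from the PPV point of view, so it is considered better than the first one.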
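
For reference, both sets of figures can be reproduced from the four cell counts. This is a minimal sketch in plain Python, not RapidMiner code. It assumes each matrix is 2x2 with predictions in rows and true classes in columns; the function and variable names are illustrative only.

```python
def accuracy(tn, fn, fp, tp):
    """Share of all examples that were classified correctly."""
    return (tn + tp) / (tn + fn + fp + tp)


def ppv(fp, tp):
    """Positive predictive value: correct positives among all positive predictions."""
    return tp / (tp + fp)


# Cell counts read from the two confusion matrices in this post (assumed layout).
first = dict(tn=1068, fn=328, fp=57, tp=77)
second = dict(tn=1135, fn=394, fp=0, tp=1)

for name, m in (("first", first), ("second", second)):
    n_pos = m["tp"] + m["fp"]
    print(f"{name}: accuracy={accuracy(**m):.2%}, "
          f"PPV={ppv(m['fp'], m['tp']):.2%} from {n_pos} positive prediction(s)")
```

The second PPV rests on a single positive prediction, while the first rests on 134, which is why the 100 % figure is so fragile.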

Unfortunately, a single positively classified sample is of little significance. There is a high probability that the results will be very bad when the model is deployed on validation data.
The results on a validation example set are: 1) PPV: 55.9 %, 2) PPV: 0.0 % (two misclassified samples).

And here is my question: is there any way to compare these two results objectively?

Thanks in advance.
Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
Hello,

Hmm, there is actually always a risk in concentrating on precision alone. Besides taking other measures into account, be it through a combination like the f-measure, through weighting, or through multi-objective optimization schemes (all of which are possible within RapidMiner), I am afraid there is no general solution for an objective comparison.
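
To illustrate the f-measure suggestion, here is a minimal sketch in plain Python (not RapidMiner code). `f_beta` is a hypothetical helper, and the counts assume the two confusion matrices from the question are 2x2 with predictions in rows and true classes in columns. A `beta` below 1 weights precision more heavily than recall, which is one way of staying close to a PPV-driven goal.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """F-measure from raw counts; beta < 1 favours precision, beta > 1 favours recall."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


# (tp, fp, fn) for the two attribute subsets from the question (assumed layout).
first = (77, 57, 328)
second = (1, 0, 394)

for beta in (1.0, 0.5):
    print(f"beta={beta}: first={f_beta(*first, beta=beta):.4f}, "
          f"second={f_beta(*second, beta=beta):.4f}")
```

With either setting the first subset scores clearly higher than the second, because the single lucky positive prediction is penalised by its very low recall. This only changes the ranking criterion; it does not make the comparison objective.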

Cheers,
Ingo