Correlation in classification model - how to sort classes
I have a classification problem to solve. There are 10 classes (1, 2, 3, ..., 10) to be predicted, and I want to optimize my model parameters for the highest correlation, because in real life class 1 should be relatively similar to class 2 and, at the same time, very dissimilar to class 10.
If I understand correctly, in the Performance(Classification) operator the correlation is calculated as follows:
Cov(L,P) / sqrt(V(L)*V(P))
where: P=prediction, L=label, V=Variance, Cov=Covariance.
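To show what I expect from this measure, here is a minimal Python sketch of the formula above. It is only my reading of the formula, not RapidMiner's actual code; the function name `correlation` and the sample lists are made up for illustration.

```python
from math import sqrt


def correlation(labels, predictions):
    """Cov(L, P) / sqrt(V(L) * V(P)) for two equal-length numeric sequences."""
    n = len(labels)
    mean_l = sum(labels) / n
    mean_p = sum(predictions) / n
    # Population covariance and variances; using n - 1 instead of n
    # would cancel out in the ratio and give the same result.
    cov = sum((l - mean_l) * (p - mean_p) for l, p in zip(labels, predictions)) / n
    var_l = sum((l - mean_l) ** 2 for l in labels) / n
    var_p = sum((p - mean_p) ** 2 for p in predictions) / n
    return cov / sqrt(var_l * var_p)


# Illustrative data only: classes encoded by their true order.
labels = [1, 2, 3, 4, 5]
near_misses = [2, 1, 4, 3, 5]  # wrong predictions are off by one class
far_misses = [5, 4, 3, 2, 1]   # wrong predictions are as far off as possible

print(correlation(labels, near_misses))
print(correlation(labels, far_misses))
```

With the classes encoded in their natural order, the near misses still give a high correlation (about 0.8), while the far misses give -1. That is the behaviour I want to optimize for.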
However, when I treat the label classes 1, 2, 3, etc. as polynominal values, RapidMiner assigns them seemingly arbitrary integer indices, which I cannot control, and the correlation is then calculated from those indices. As a result, the correlation does not reflect the real order of the classes.
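The effect can be reproduced outside RapidMiner with a small Python sketch. The scrambled mapping below is hypothetical and chosen by me for illustration; I do not know how RapidMiner actually assigns its internal indices. The helper names `pearson` and `encode` are also my own.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def encode(values, index):
    """Replace each nominal value with its integer index."""
    return [index[v] for v in values]


# Nominal (string) class values; every wrong prediction is off by one class.
true_classes = ["1", "2", "3", "4", "5"]
predicted = ["2", "1", "4", "3", "5"]

# Mapping that follows the real class order.
ordered_index = {"1": 1, "2": 2, "3": 3, "4": 4, "5": 5}
# Hypothetical arbitrary mapping (an assumption, not RapidMiner's real one).
scrambled_index = {"3": 0, "1": 1, "5": 2, "2": 3, "4": 4}

r_ordered = pearson(encode(true_classes, ordered_index),
                    encode(predicted, ordered_index))
r_scrambled = pearson(encode(true_classes, scrambled_index),
                      encode(predicted, scrambled_index))

print(r_ordered, r_scrambled)
```

The predictions are identical in both cases, yet the ordered mapping gives a correlation of about 0.8 and this particular scrambled mapping gives -1, so the value depends entirely on how the indices happen to be assigned.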
Is there any way to force RapidMiner to treat the polynominal label 1 as index 1, label 2 as index 2, etc.?
Thanks in advance!