I am currently messing around with statistics to check my validation results. While reading some literature, a question about ANOVA came up. Since the operator is part of RM, I assume that it is considered useful.
- Do you agree (from your experience) that the assumption of homogeneous variance can be ignored if the checked sequences have equal length and are approximately equally distributed (same distributions, but differing parameters)?
- What about Kruskal-Wallis? It may be more conservative (rejecting H0 less often), but since it is rank-based it can be applied to any performance measure without too much trouble (I suppose).
- What about "local testers" (post-hoc tests) like Scheffé or Tukey? Is their absence in RM a consequence of agreement ("bah. Those are useless") or of time?
My current choice would be the Tukey test. ANOVA is (from my current point of view) about as useful as a mathematical proof of existence: it tells you that some difference exists, but not where.
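To make the ANOVA vs. Kruskal-Wallis comparison concrete, here is a minimal sketch outside RM, in plain Python. Everything in it is my own assumption for illustration: the function names are made up, the accuracy values are invented (not real results), and the closed-form p-values only hold for exactly three groups (numerator df = 2 for F, chi-square with df = 2 for H). The H statistic uses average ranks without a tie correction, and the chi-square approximation is rough for samples this small.

```python
import math

def anova_f(groups):
    """One-way ANOVA: return (F, df_between, df_within)."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df1, df2 = k - 1, n - k
    return (ss_between / df1) / (ss_within / df2), df1, df2

def kruskal_h(groups):
    """Kruskal-Wallis H with average ranks (no tie correction)."""
    pooled = sorted(x for g in groups for x in g)
    rank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # tied values at 1-based positions i+1..j share their average rank
        rank[pooled[i]] = (i + 1 + j) / 2
        i = j
    n = len(pooled)
    s = sum(sum(rank[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * s - 3 * (n + 1)

def p_f_three_groups(f, df2):
    """P(F > f) for an F(2, df2) distribution; valid for three groups only."""
    return (1 + 2 * f / df2) ** (-df2 / 2)

def p_h_three_groups(h):
    """P(chi2 > h) with 2 degrees of freedom; valid for three groups only."""
    return math.exp(-h / 2)

# Invented accuracies of three learners on five validation folds.
runs = [
    [0.81, 0.79, 0.83, 0.80, 0.82],
    [0.84, 0.86, 0.85, 0.87, 0.83],
    [0.78, 0.80, 0.77, 0.79, 0.81],
]
f, df1, df2 = anova_f(runs)
h = kruskal_h(runs)
print(f"ANOVA:          F({df1},{df2}) = {f:.3f}, p = {p_f_three_groups(f, df2):.4f}")
print(f"Kruskal-Wallis: H = {h:.3f}, p = {p_h_three_groups(h):.4f}")
```

Both tests only answer "is there any difference at all?". Which pairs differ is exactly what Tukey or Scheffé would add on top.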
Many thanks in advance.