Different Results using DecisionTree in RM4.4 and RM4.5

thorbenkeller Member Posts: 1 Contributor I
edited November 2018 in Help
Hi everybody,
I have a problem with a very simple decision tree that does not return the correct result. I suspect a bug in the new version 4.5, since the older version always gave me the correct result.

I have 50,000 examples with two classes (A and B) and a single numerical attribute (ranging from 0 to 10.5). Of those 50,000 examples, around 35,000 have class A and 15,000 have class B. All I want to do is find the best threshold to separate the two classes, i.e. train a binary decision tree with depth 2.
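For reference, the threshold search described here can be reproduced outside RapidMiner. The following is only a minimal sketch: it uses plain training accuracy as the split criterion (an assumption for illustration; a decision tree learner normally uses an impurity-based criterion such as information gain or gain ratio), and the names `best_threshold` and `make_synthetic` as well as the synthetic class distributions are hypothetical, not taken from the actual data or process.

```python
import random


def best_threshold(values, labels):
    """Search the single split on one numeric attribute that maximizes
    training accuracy. Returns (threshold, accuracy, class_left_of_threshold).
    A threshold of None means no split beats always predicting the majority class."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    total_a = sum(1 for _, y in pairs if y == "A")
    total_b = n - total_a
    # Start from the majority-class baseline (no split at all).
    best = (None, max(total_a, total_b) / n, "A" if total_a >= total_b else "B")
    left_a = left_b = 0
    for i in range(n - 1):
        x, y = pairs[i]
        if y == "A":
            left_a += 1
        else:
            left_b += 1
        nxt = pairs[i + 1][0]
        if x == nxt:
            continue  # no threshold fits between two equal values
        t = (x + nxt) / 2  # candidate threshold: midpoint of neighbours
        acc_a_left = (left_a + (total_b - left_b)) / n  # rule: <= t is A, > t is B
        acc_b_left = (left_b + (total_a - left_a)) / n  # rule: <= t is B, > t is A
        if acc_a_left > best[1]:
            best = (t, acc_a_left, "A")
        if acc_b_left > best[1]:
            best = (t, acc_b_left, "B")
    return best


def make_synthetic(n_a, n_b, seed=0):
    """Hypothetical stand-in data: class A uniform on [0, 4], class B uniform on [3, 10.5]."""
    rng = random.Random(seed)
    values = [rng.uniform(0.0, 4.0) for _ in range(n_a)]
    values += [rng.uniform(3.0, 10.5) for _ in range(n_b)]
    labels = ["A"] * n_a + ["B"] * n_b
    return values, labels


if __name__ == "__main__":
    values, labels = make_synthetic(35_000, 15_000)
    threshold, accuracy, left_class = best_threshold(values, labels)
    print(f"threshold={threshold:.3f} accuracy={accuracy:.3f} left class={left_class}")
```

Running the same search on the real attribute gives an independent reference value for the threshold and classification rate that either RapidMiner version can be compared against.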

The earlier version found the threshold to be around 3.5, which results in an overall classification rate of around 90%. The new RapidMiner version finds the threshold to be around 9.4, resulting in an overall classification rate of around 69%!
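One way to put these two rates in context is to compare them with the majority-class baseline, i.e. the rate obtained by always predicting class A. The sketch below uses only the approximate class counts quoted in this post; the function name `majority_baseline` is hypothetical.

```python
def majority_baseline(class_counts):
    """Classification rate obtained by always predicting the most frequent class."""
    return max(class_counts.values()) / sum(class_counts.values())


# Approximate class counts as quoted in the post.
counts = {"A": 35_000, "B": 15_000}
print(f"majority baseline: {majority_baseline(counts):.0%}")
```

With these counts the baseline is about 70%, so a rate of around 69% is no better than predicting A for every example, whereas around 90% is a clear improvement over it.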

I used the exact same process file and did not make any changes. Using GridParameterOptimization, I checked around 27,000 parameter combinations, but none of them resulted in a classification rate greater than 69%.

Does anybody have a similar problem, or could anybody help?

Thanks for any feedback.

PS: Thank you very much for developing this great tool; it helped me a lot in my diploma thesis :-)

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Thorben,
    I will check that. We did not change anything in the decision tree, but perhaps there is a cross-effect somewhere...

    It would help us a lot if you could send a process where this occurs. Since you probably cannot send us the data, it would be great if you could use an ExampleSetGenerator instead.

    Greetings,
      Sebastian