model application/testing problem [SOLVED]

Bella Member Posts: 4 Contributor I
edited November 2018 in Help
Hello,
I have a problem with model application.
I am using RapidMiner 5.3 with a decision tree as the model. I have already trained the model on labeled data and saved it. What I am trying to do now is apply the model to unlabeled data. Unfortunately, I get this error message for several attributes (the ones that occur in my decision tree):

Tree: The internal nominal mappings are not the same between training and application for attribute ...

What else could be done to fix the problem?

Thanks a lot for the advice
Regards
Bella

Answers

  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    Hello

    It could be that the test data contains an attribute with a nominal value that has not been seen during the training phase.

    In other words if there is an attribute called "colour" with values "red", "green" and "blue" in the training phase, whilst during the test phase a new value like "purple" is seen, the model does not know how to treat this.

    If that is the problem then you would have to pre-process the test data to check that it was correct and usable by the model. I could imagine using the training data to drive various filtering activities and any left over examples in the test data would have to be flagged. In fact, I think I will create a video about it.

    regards

    Andrew
  • Bella Member Posts: 4 Contributor I
    Hi,
    thanks a lot for the answer. Actually, I am testing the model on the same data it was trained on, just to see whether my classification is really good (the performance, i.e. the accuracy, was 81%). In the future I might have to apply it to truly unlabeled data, which is why I wanted to check whether it works on the same data without the label attribute.

    But during model creation RapidMiner splits the data into a training part and a testing part, so maybe you are right that something new occurred. Could it? How can I check how the program splits the sample into training and testing parts? How can I control this? And could this have caused the problem?

    And if the problem is that not all attribute values are taken into account, is my classification really good then? :(

    I have checked that the range of attribute values is the same in the testing and training sets. But maybe that is another question?

    Thanks for your opinion
    Bella
  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    Thinking about it a bit more, the problem I described isn't what you are suffering from.

    Do you have binominal attributes in your data?

    Andrew
  • Bella Member Posts: 4 Contributor I
    Hello,
    most of the attributes are nominal. But it seems I have solved the problem, or at least I am able to avoid it :). In the newer RapidMiner version (6.0) I do not get any error and I am able to predict the labels :).

    I read more posts on the forum and, based on them, decided to try the process in RM 6.0, as many people have reported bugs in version 5.3. So that's my solution :).

    Nevertheless, thanks for your advice and help :)

    Best regards

    Bella
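
For anyone who lands here with the same error and cannot simply upgrade: the check Andrew describes in his first reply (nominal values in the application data that never occurred in training) can be sketched outside RapidMiner on exported data. This is a minimal sketch in plain Python, not RapidMiner code; the helper names (`unseen_values`, `first_appearance_mapping`) and the row format (one dict per example) are assumptions made for illustration. The second helper only illustrates how two data sets with the same set of values could still receive different value-to-index mappings if indices are assigned in order of first appearance. Whether that is what happened in 5.3 here is an assumption; the thread does not confirm it.

```python
from typing import Dict, Iterable, List, Mapping, Set

Row = Mapping[str, str]  # hypothetical row format: attribute name -> nominal value


def nominal_values(rows: Iterable[Row], attribute: str) -> Set[str]:
    """Collect the distinct values one attribute takes in a data set."""
    return {row[attribute] for row in rows if row.get(attribute) is not None}


def unseen_values(train: List[Row], test: List[Row],
                  attributes: Iterable[str]) -> Dict[str, Set[str]]:
    """Per attribute, list values present in `test` but never seen in `train`."""
    report: Dict[str, Set[str]] = {}
    for attribute in attributes:
        extra = nominal_values(test, attribute) - nominal_values(train, attribute)
        if extra:
            report[attribute] = extra
    return report


def first_appearance_mapping(rows: Iterable[Row], attribute: str) -> Dict[str, int]:
    """Assign each value an index in order of first appearance (illustrative only)."""
    mapping: Dict[str, int] = {}
    for row in rows:
        value = row.get(attribute)
        if value is not None and value not in mapping:
            mapping[value] = len(mapping)
    return mapping


# Toy data reusing the "colour" example from Andrew's reply.
train = [{"colour": "red"}, {"colour": "green"}, {"colour": "blue"}]
test = [{"colour": "blue"}, {"colour": "purple"}]
reordered = [{"colour": "blue"}, {"colour": "red"}, {"colour": "green"}]

print(unseen_values(train, test, ["colour"]))      # {'colour': {'purple'}}
print(first_appearance_mapping(train, "colour"))   # {'red': 0, 'green': 1, 'blue': 2}
print(first_appearance_mapping(reordered, "colour"))  # {'blue': 0, 'red': 1, 'green': 2}
```

An empty result from `unseen_values` matches Bella's observation that the value ranges are the same in both sets, while the last two lines show that identical value sets do not by themselves guarantee identical index mappings.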