
model application/testing problem [SOLVED]

Bella Member Posts: 4 Contributor I
edited November 2018 in Help
Hello,
I have a problem with model application.
I am using RapidMiner 5.3 with a decision tree as the model. I have already trained the model on labeled data and saved it. What I am trying to do is run the model on unlabeled data. Unfortunately, I get this error message for several attributes (the ones occurring in my decision tree):

Tree: The internal nominal mappings are not the same between training and application for attribute ...

What else could be done to fix the problem?

Thanks a lot for the advice
Regards
Bella

Answers

  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    Hello

    It could be that the test data contains an attribute with a nominal value that has not been seen during the training phase.

    In other words if there is an attribute called "colour" with values "red", "green" and "blue" in the training phase, whilst during the test phase a new value like "purple" is seen, the model does not know how to treat this.

    If that is the problem then you would have to pre-process the test data to check that it was correct and usable by the model. I could imagine using the training data to drive various filtering activities and any left over examples in the test data would have to be flagged. In fact, I think I will create a video about it.

    regards

    Andrew
  • Bella Member Posts: 4 Contributor I
    Hi,
    thanks a lot for the answer. Actually, I am testing the model on the same data that it was trained on, just to see whether my classification is really good (the performance accuracy was 81%). But in the future I might have to apply it to truly unlabeled data, which is why I wanted to check that it works on the same data (but without the label attribute).

    But during the model creation phase RapidMiner splits the data into training and testing parts, so maybe you are right that something new occurred. Could it? How can I check how the program splits the sample into training and testing parts? How can I control this? And could this have caused the problem?

    And if the problem is that not all attribute values are taken into account, is my classification really good then? :(

    I have checked that the range of attribute values is the same in the testing and training sets. But maybe that is another question?

    Thanks for your opinion
    Bella
  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    Thinking about it a bit more, the problem I describe isn't what you are suffering from.

    Do you have binominal attributes in your data?

    Andrew
  • Bella Member Posts: 4 Contributor I
    Hello,
    most of the attributes are nominal. But it seems I have solved the problem, or at least I am able to avoid it :). In the newer RapidMiner version (6.0) I do not get any error and I am able to predict the labels :).

    I read more posts on the forum and, based on them, decided to try the process in RM 6.0, as many people have reported bugs in version 5.3. So that's my solution :).

    Nevertheless, thanks for your advice and help :)

    Best regards

    Bella
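
For readers who land on this thread with the same error, the two checks discussed above can be sketched outside RapidMiner in plain Python. This is only an illustration, not RapidMiner's own code: the helper names `unseen_values` and `first_seen_mapping` are made up for this example, and the idea that nominal values receive integer indices in order of first appearance is an assumption used to show how two data sets with identical values could still end up with different internal mappings.

```python
from typing import Dict, Iterable, Set


def unseen_values(train: Iterable[str], test: Iterable[str]) -> Set[str]:
    """Return nominal values that occur in the test data but never in training."""
    return set(test) - set(train)


def first_seen_mapping(values: Iterable[str]) -> Dict[str, int]:
    """Assign integer indices in order of first appearance (assumed scheme)."""
    mapping: Dict[str, int] = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
    return mapping


# Hypothetical "colour" attribute, following the example in the first answer.
train_colour = ["red", "green", "blue", "red"]
test_colour = ["blue", "purple", "red"]

# Check 1: values the model has never seen during training.
print(unseen_values(train_colour, test_colour))

# Check 2: the same set of values, but encountered in a different order,
# yields a different value-to-index mapping under the assumed scheme.
same_values = ["blue", "red", "green"]
print(first_seen_mapping(train_colour) == first_seen_mapping(same_values))
```

The first check corresponds to the "purple" case. The second shows why comparing only the range of values may not be enough if the tool also compares how those values are indexed internally.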