Attribute is missing in the input example set
bernardo_pagnon
Member, University Professor Posts: 64
in Help
Hello,
I have a simple RM process, and whenever I use the set role operator I get the warning message "The attribute loan_status is missing in the input example set". I can run a model, but the performance operator gives me a confusion matrix full of zeros. How can I fix this? I checked the data and it is binominal, the conversion was well performed.
Regards,
Bernardo
Best Answer
rjones13 Member Posts: 200 Unicorn
Hi Bernardo,
No apologies necessary! I've got both now.
Regarding the location of Set Role, it is fine as is. Just make sure you've got "include special attributes" turned on in Map, as you already have.
In general, your process can be simplified a little by removing the data duplication. This can be done using the unmatched output of the Filter Examples operator, as shown in the screenshot below. The top output takes the non-missing examples, and the unmatched output gives the examples with a missing loan status.
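To make the matched/unmatched idea concrete outside of RapidMiner, here is a minimal Python sketch of the same split. It is only an illustration, not what Filter Examples does internally: the rows, the `id` field, the label values "good" and "bad", and the helper name `split_by_missing` are all made up for the example. Only the column name `loan_status` comes from this thread.

```python
# Hypothetical rows; only the column name "loan_status" comes from the thread.
rows = [
    {"id": 1, "loan_status": "good"},
    {"id": 2, "loan_status": None},   # label missing: to be scored later
    {"id": 3, "loan_status": "bad"},
    {"id": 4, "loan_status": None},   # label missing: to be scored later
]


def split_by_missing(rows, column):
    """Split rows in one pass over the same data.

    Returns (matched, unmatched): rows where `column` has a value,
    and rows where it is missing.
    """
    matched = [r for r in rows if r.get(column) is not None]
    unmatched = [r for r in rows if r.get(column) is None]
    return matched, unmatched


train, to_score = split_by_missing(rows, "loan_status")
print([r["id"] for r in train])     # rows with a known label
print([r["id"] for r in to_score])  # rows with a missing label
```

Both subsets come from one source, so there is no need to load or filter the data twice.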
Regarding polynominal to binominal, do you want to do this for all variables? There shouldn't be a need to do this for loan status, as the software should be able to tell by default that it is binominal.
The Performance operator gives all zeros because it compares the predicted label to the actual label. You're scoring data where there is no actual label, so there is nothing to compare against and hence no performance metric.
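A small Python sketch shows why the matrix comes out empty. This is an illustration under my own assumptions, not RapidMiner's implementation: the function `confusion_counts`, the label values, and the rule of skipping rows without an actual label are all hypothetical.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count (actual, predicted) pairs, skipping rows with no actual label."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if a is None:
            # No ground truth for this row, so it cannot land in any cell.
            continue
        counts[(a, p)] += 1
    return counts


predicted = ["good", "bad", "good"]

# Case 1: scoring examples whose label is missing (the situation in this thread).
unlabeled_actual = [None, None, None]
empty = confusion_counts(unlabeled_actual, predicted)

# Case 2: scoring a labeled hold-out set instead.
labeled_actual = ["good", "bad", "bad"]
filled = confusion_counts(labeled_actual, predicted)

print(sum(empty.values()))   # total count when no actual labels exist
print(dict(filled))          # cells filled when actual labels exist
```

The predictions exist in both cases; only the second case has actual labels to compare them with, so only the second case fills any cell of the matrix.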
If you have any further questions or need clarification, please let me know.
Best,
Roland
Answers
Usually the warning would be down to a metadata issue; however, looking at your process, I'd be surprised if that's your issue.
Regarding your confusion matrix being full of zeros, I couldn't see where you were applying the Performance operator. I'd be happy to test if you're able to provide the data.
Best,
Roland
Thanks a lot for your reply. I am sorry, it is my bad. I have been changing the process so much that I probably deleted the Performance operator. I am attaching the current version, which is giving me the zeros.
What I always struggle with in RapidMiner is the order of operators. Where do I put Set Role? When do I convert polynominal to binominal?
In this case, when I add the polynominal to binominal operator, I start receiving warning messages. However, when I checked the data types, everything seemed to be fine.
Here is a link to the data:
LoansData_sample.csv
Regards,
Bernardo
Let's see:
1 - Ok with the set role
2 - You are totally right about simplifying things, I will work on that.
3 - The polynominal to binominal conversion: I only did it for the label (loan_status), and I included special attributes.
4 - The software cannot tell by default that loan_status is binominal. In the Statistics tab, it says polynominal, which is why I changed it.
5 - It is still weird to me that I am getting the zeros in the confusion matrix. If I remove performance, I can see the unknown instances being classified by the apply model operator, so there is an actual label and the code is doing what I want. For some reason, the program is not recognizing it in the performance operator.
6 - I made a new version in which, when I import the file using Import Data, I tell RapidMiner that loan_status is binominal (no need for the polynominal to binominal conversion in this case). I still get zeros in the matrix, despite each unknown instance being classified by Apply Model. I gave up... I tried everything I could... I exported the output of Apply Model and built the matrix by hand in Excel, which is what Performance should do.
Best,
Bernardo
I reread your reply, more carefully this time, and I understood what you were saying.
Done, problem solved!
Best,
Bernardo