[SOLVED] How to convert nominal to numeric value? Urgent. Please have a look
Anuj_Gupta
Member Posts: 3 Contributor I
Hey Folks,
I have been struggling with one question for a long time. I have nominal data that I want
to convert into numerical data. From RapidMiner I learned that there are two ways to do
this:
1. Using the nominal to numeric operator.
2. Creating dummy variables (0 or 1) using the nominal to binominal operator.
As for the first method, my boss says that the nominal to numeric operator is not a correct
method, and he is not at all happy with it. Can anybody tell me whether it is correct or
not, so that I can convince my mentor?
I tried the second operator, but the number of variables becomes too high (because there
are many categories), which leads to a memory error.
So can anybody suggest some other alternatives for converting nominal values to numeric
ones?
Thanks for your time. I look forward to your valuable suggestions.
Answers
Well, whether using the nominal to numeric operator is a problem depends a bit on what you are doing and on which data. The operator simply produces numbers based on the internal mapping used by RapidMiner. If you produce those numbers for two data sets with different mappings, the numbers will differ as well. You can deal with this by ensuring that the same internal mapping is used for all data sets.
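To make the mapping issue concrete outside RapidMiner, here is a minimal Python sketch. It assumes a mapping that assigns codes in order of first appearance, which is only a stand-in for whatever RapidMiner does internally; the helper names build_mapping and encode are hypothetical.

```python
def build_mapping(values):
    """Assign integer codes in order of first appearance.

    This ordering rule is an assumption for illustration, not
    RapidMiner's actual internal mapping.
    """
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
    return mapping


def encode(values, mapping):
    """Replace each nominal value with its integer code."""
    return [mapping[value] for value in values]


train = ["low", "high", "medium", "low"]
test = ["medium", "low", "high"]

# Each data set builds its own mapping, so the same category
# can end up with a different number in each set.
train_map = build_mapping(train)
test_map = build_mapping(test)
independent_test = encode(test, test_map)

# Reusing the training mapping keeps the codes consistent.
shared_test = encode(test, train_map)

print(train_map)
print(test_map)
print(independent_test, shared_test)
```

With separate mappings, "low" is coded differently in the two sets; encoding the second set with the first set's mapping removes that inconsistency.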
But even then, the internal mappings don't have any real meaning. For example, if you have the three nominal values "low", "medium", and "high", you probably would not like to end up with the numbers "2", "1", and "3" but would prefer at least something like "1", "2", and "3" instead. But even this might become problematic: is "high" really exactly as far above "medium" as "medium" is above "low"? Who knows?
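As a rough illustration of the second point, the following Python sketch encodes the three values with an explicitly chosen order. The order list and variable names are assumptions made for this example; nothing here reflects how RapidMiner assigns numbers.

```python
# Explicit, user-chosen order (an assumption for this example).
ORDER = ["low", "medium", "high"]

# Map each label to its rank, starting at 1.
ordinal = {label: rank for rank, label in enumerate(ORDER, start=1)}

values = ["high", "low", "medium", "low"]
encoded = [ordinal[value] for value in values]

# The encoding silently imposes equal spacing between neighbours,
# which the data itself may not justify.
gap_low_medium = ordinal["medium"] - ordinal["low"]
gap_medium_high = ordinal["high"] - ordinal["medium"]

print(encoded)
print(gap_low_medium, gap_medium_high)
```

The order is now sensible, but both gaps come out equal by construction, which is exactly the assumption questioned above.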
For both reasons (especially the second one, since the first can be dealt with if you are cautious), I would agree with your boss that method 2 should usually be preferred. If memory is running low, you could try to create a view, which calculates the values on the fly instead of calculating and storing them up front. If this still does not work, you could use method 3 introduced by me above so that at least both problems discussed above will be smaller.
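The idea of computing dummy variables on the fly can be sketched in Python with a generator. This is only an analogy to a view, under the assumption that rows are consumed one at a time; it is not how RapidMiner implements views, and dummy_rows is a hypothetical helper.

```python
def dummy_rows(values, categories):
    """Yield one 0/1 dummy-coded row per value, computed lazily.

    Only a single row exists in memory at a time, instead of the
    full values-by-categories matrix.
    """
    index = {category: i for i, category in enumerate(categories)}
    width = len(categories)
    for value in values:
        row = [0] * width
        row[index[value]] = 1
        yield row


values = ["red", "green", "blue", "green"]
categories = sorted(set(values))  # fixed column order

# Materialised here only to show the result; a real consumer
# would iterate over dummy_rows(...) row by row.
rows = list(dummy_rows(values, categories))

print(categories)
for row in rows:
    print(row)
```

The trade-off is recomputation on every pass over the data in exchange for not storing one column per category.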
Cheers,
Ingo