Replace (dictionary)

abevensee Member Posts: 4 Contributor I
edited November 2019 in Help
I have a data set with 30+ attributes. Each row holds a numerical code in each column that corresponds to a classification. For example, Gender is an attribute with codes 1-3 meaning Male, Female, and Not provided, respectively. There are similar code structures for ethnicity, race, etc. I have set up a dictionary for each of these attributes so that my model can reference the appropriate dictionary and convert the codes to meaningful values. I have 2 questions:

1) The codes mean something different for each attribute, so I set up a separate dictionary for each one. For instance, 1 means male for gender, but it also means white for race and single for marital status. Is there a way to use the Loop operator to have RM run all 30+ conversions with the different dictionaries, or do I need 30 separate "Replace (Dictionary)" operators in my process?
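
To make the intended behaviour of question 1 concrete, here is a minimal Python sketch of "one dictionary per attribute, applied in a loop". It is not RapidMiner code. The names `DICTIONARIES` and `decode_rows` and the attribute names are hypothetical, and only the gender codes and the meaning of code 1 for race and marital status come from the description above.

```python
# Hypothetical per-attribute dictionaries. Only the gender codes and the
# meaning of code 1 for race and marital status are taken from the post.
DICTIONARIES = {
    "gender": {"1": "Male", "2": "Female", "3": "Not provided"},
    "race": {"1": "White"},
    "marital_status": {"1": "Single"},
}


def decode_rows(rows, dictionaries):
    """Return copies of the rows with each coded attribute mapped through its own dictionary."""
    decoded = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not modified
        for attribute, mapping in dictionaries.items():
            if attribute in new_row:
                code = str(new_row[attribute])
                # Codes missing from the dictionary are left unchanged.
                new_row[attribute] = mapping.get(code, new_row[attribute])
        decoded.append(new_row)
    return decoded


rows = [
    {"gender": 1, "race": 1, "marital_status": 1},
    {"gender": 3, "race": 1, "marital_status": 1},
]
decoded = decode_rows(rows, DICTIONARIES)
```

The same code (1) ends up as a different label in each column because the lookup is keyed by attribute name first and by code second, which is the behaviour I am hoping a loop in RM can reproduce.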

2) Some dictionaries contain overlapping codes, where one code appears inside another. For instance, in my use case:
1   = Latino/Hispanic
4   = N/A
14 = Other Hispanic or Latino

Instead of returning "Other Hispanic or Latino" for a code of 14, the operator returns "Latino/HispanicN/A". I've seen that the regular expression option could prevent this. However, I have the operator set up to run on a subset (the various ethnicity-related attributes) and I do not want it applied to the whole data set, so I'm not sure that would work. How can I go about fixing this?
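
The symptom in question 2 can be reproduced outside RapidMiner with a short Python sketch, assuming the replacement works on substrings rather than on whole values. The function names are hypothetical, and this only illustrates the logic, not how the operator is implemented internally.

```python
import re

# The three ethnicity codes listed in the question.
ETHNICITY = {"1": "Latino/Hispanic", "4": "N/A", "14": "Other Hispanic or Latino"}


def replace_substrings(value, mapping):
    """Naive replacement: every occurrence of a code inside the value is replaced."""
    for code, label in mapping.items():
        value = value.replace(code, label)
    return value


def replace_whole_value(value, mapping):
    """Anchored replacement: a code is replaced only when it matches the entire value."""
    for code, label in mapping.items():
        if re.fullmatch(re.escape(code), value):
            return label
    return value


naive = replace_substrings("14", ETHNICITY)
anchored = replace_whole_value("14", ETHNICITY)
```

With substring replacement, "14" first becomes "Latino/Hispanic4" and then "Latino/HispanicN/A", which matches what I am seeing. Matching the whole value returns "Other Hispanic or Latino". If the regular expression option treats dictionary keys as patterns (an assumption on my part), the equivalent would presumably be anchored keys such as ^1$, ^4$ and ^14$.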
