Handling duplicated columns, but with text!


Hello there, fellow miners,
I'm a RapidMiner beginner, and I am trying to detect and then delete duplicated columns in an example set that holds text rather than numbers.
With numbers it was easy: removing correlated columns did the job perfectly.
Things got complicated with text, though. Is there a way to either a) do something similar to the correlation removal used for numbers, or b) convert the text to numbers while keeping each column intact, rather than splitting it into one column per value as the "Nominal to Numerical" operator does?
Thank you.

Answers
If I have a table with the following text content:
    1       2       3
-------------------------
| Tree  | Fruit | Tree  |
| Fruit | Fruit | Fruit |
| Fruit | True  | Fruit |
| Tree  | Tree  | Tree  |
-------------------------
I need to remove column 3, or at least find out that column 3 is an exact duplicate of column 1.
Thank you all for your wisdom.
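To make the goal concrete, here is a minimal sketch in plain Python rather than a RapidMiner process. It assumes the table is held as a dictionary of column name to list of strings. The function names (find_duplicate_columns, drop_duplicate_columns, encode_as_integers) are made up for this illustration and are not RapidMiner operators. The first two functions cover exact-duplicate detection, and the third shows idea (b): one integer per distinct text value, so each column stays a single column.

```python
from typing import Dict, List, Sequence, Tuple


def find_duplicate_columns(columns: Dict[str, Sequence[str]]) -> Dict[str, str]:
    """Map each duplicated column to the first column with identical values."""
    first_seen: Dict[Tuple[str, ...], str] = {}
    duplicates: Dict[str, str] = {}
    for name, values in columns.items():
        key = tuple(values)  # the whole column, row order included
        if key in first_seen:
            duplicates[name] = first_seen[key]
        else:
            first_seen[key] = name
    return duplicates


def drop_duplicate_columns(columns: Dict[str, Sequence[str]]) -> Dict[str, List[str]]:
    """Keep only the first occurrence of each distinct column."""
    duplicates = find_duplicate_columns(columns)
    return {n: list(v) for n, v in columns.items() if n not in duplicates}


def encode_as_integers(columns: Dict[str, Sequence[str]]) -> Dict[str, List[int]]:
    """Replace every text value by an integer code shared across all columns."""
    codes: Dict[str, int] = {}
    encoded: Dict[str, List[int]] = {}
    for name, values in columns.items():
        # setdefault assigns the next free integer the first time a value appears
        encoded[name] = [codes.setdefault(v, len(codes)) for v in values]
    return encoded


# The example table from the post, stored column by column.
table = {
    "1": ["Tree", "Fruit", "Fruit", "Tree"],
    "2": ["Fruit", "Fruit", "True", "Tree"],
    "3": ["Tree", "Fruit", "Fruit", "Tree"],
}

print(find_duplicate_columns(table))        # {'3': '1'}
print(list(drop_duplicate_columns(table)))  # ['1', '2']
```

Because the integer codes are shared across columns, two identical text columns become identical numeric columns. Note that the codes themselves are arbitrary, so a correlation computed on them only reliably flags exact duplicates, not "similar" columns.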
You can use the below process