Breaking an attribute into further attributes
arsalan_karim
Hi All
I am stuck with something that is giving me nightmares.
I have a medical data set that consists of a list of medications which look similar to the below:
FENTANYL | 1 Patch(es), Q3D, 6 Mth30 |
FENTANYL | 1 Patch(es), Q3D, 36 Day(s) |
FENTANYL | 1 Patch(es), Q2D, 100 Day(s) |
FENTANYL | 1 Patch(es), Q3D, 9 Day(s) |
FENTANYL | 1 Patch(es), Q3D, 9 Day(s) |
FENTANYL | 1 Patch(es), 2x/week, 30 Day(s) |
FENTANYL | 1 Patch(es), Q3D, 30 Day(s) |
FENTANYL | 1 Patch(es), 2x/week, 100 Day(s) |
The second column contains the dose, unit of measure, frequency, and duration all in one string. How can I break this attribute into four separate attributes, as below:
Dose = 1 |
Unit of Measure = Patch |
Frequency = Q2D |
Duration = 9 Days |
Thanks
Arsalan
Answers
Have you tried the Split operator, splitting on the comma? If that doesn't work, you could use the RegEx function on the Split operator to do it. It's a bit more complicated but doable.
Thanks T-Bone
Some of the entries in my list don't have a comma. Is there any way to handle those?
Thanks
Arsalan
OK, then you'll have to use a RegEx, something like .*\W.*,.*
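For anyone who wants to prototype the regex idea outside RapidMiner first, here is a minimal Python sketch. It is not the pattern above: it is a stricter, hypothetical pattern with named groups, and it assumes every value starts with a numeric dose, followed by a unit and a frequency that contain no spaces, with the commas optional. The function name parse_regimen is made up for illustration, and the input is assumed to be the second column with the surrounding pipes already removed.

```python
import re

# Hypothetical pattern (an assumption, not taken from the thread):
# dose, unit, frequency, duration, with commas treated as optional.
PATTERN = re.compile(
    r"^\s*(?P<dose>\d+(?:\.\d+)?)\s+"   # numeric dose, e.g. "1"
    r"(?P<unit>[^,\s]+)\s*,?\s*"        # unit without spaces, e.g. "Patch(es)"
    r"(?P<frequency>[^,\s]+)\s*,?\s*"   # frequency without spaces, e.g. "Q3D"
    r"(?P<duration>.+?)\s*$"            # everything that remains, e.g. "30 Day(s)"
)


def parse_regimen(text):
    """Return a dict with the four parts, or None if the text does not fit."""
    match = PATTERN.match(text)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for value in ["1 Patch(es), Q3D, 36 Day(s)", "1 Patch(es) 2x/week 30 Day(s)"]:
        print(parse_regimen(value))
```

Values that do not fit the assumed shape come back as None, so they can be listed and checked by hand rather than being split incorrectly.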
Or you could just use two sequential Split operators. From your examples it looks like dose is separated by a space but the others are done via commas. So you could first split on comma and then take the first attribute generated (which should contain both dose and unit of measure) and split again on space. That might be less elegant than the regex method but more robust if you have more variations of delimiters.
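The two-step split can be sketched in Python to check the logic before building the process with Split operators. This is only an illustration under assumptions: the function name split_regimen is hypothetical, the input is the second column with the pipes already stripped, and a value is expected to have exactly two commas.

```python
def split_regimen(text):
    """Split on commas first, then split the first piece on its first space."""
    parts = [piece.strip() for piece in text.split(",")]
    if len(parts) != 3:
        # Entries without the expected two commas are not handled here.
        return None
    # The first piece holds both dose and unit, e.g. "1 Patch(es)".
    dose, _, unit = parts[0].partition(" ")
    return {
        "dose": dose,
        "unit": unit.strip(),
        "frequency": parts[1],
        "duration": parts[2],
    }


if __name__ == "__main__":
    print(split_regimen("1 Patch(es), Q2D, 100 Day(s)"))
```

Entries that lack commas return None in this sketch, which makes it easy to count how many rows would need a different rule.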
If you do that, you just have to be diligent with renaming the attribute columns. I did one once with 4 Split operators and went nuts.