[SOLVED] Pivoting several lines into one single field
Hi guys, this is my first post in this forum, I hope it´s on the right spot.
I´m using currently RapidMiner mainly for ETL processes, and it´s truly impressive what RM is capable of. I´m now looking for a particluar feature which I could not find, and I´d appreciate somebody could put me on the right track. Let me make a very simple example.
Let´s say there are currently two columns, Column 1 shows several cheque numbers, column two shows which invoices were paid with each cheque. A cheque can pay one or several invoices (i. e. a typical payment allocation table).
It´s very important for me that all paid invoices are contained in one single field per record because I want to use it on a report (a statement, to be precise). Unfortunately, I wasn´t able to figure it out myself. Any help would be greatly appreciated.
Frankly, something really hilarious just happened! Based on your answer I wanted to comment that I believe this functionality should be built into the "aggregate" functionality. I wanted to use the correct terminology, so I looked up the operator in RapidMiner, and - shame on me - I actually figured that it is already there, it´s the aggregation function "concatenation".
That´s precisely what I was looking for, and SO simple!
Again Andrew, I do thank you for your answer, after all you pointed me in the right direction, and you also gave me a glimpse of a more flexible approach. (Not that I could do that (yet), but down the road... who knows? )
Re: [SOLVED] Pivoting several lines into one single field
"|" is simply a field delimiter, and RM decided to choose one that usually does not occur in any other context, which is a very good thing in order to keep the different values (Inv1, Inv2 etc. in my example) clearly separated.
It's very easy to change those, simply do the following:
- After the aggregate function, add a "Generate Attribute" operator - "Attribute name" is the field you just generated, e.g. "MyListInOneField" - "function expression" is "replace(MyListInOneField","|",",")
This replaces all occurrences of "|" with "," in the field "MyListInOneField".
In my example, I want one field that reads "Inv. # 4, 5, 6", here's how I do that:
- My initial "ListOfInvoices" is "4|5|6" -> replace(ListOfInvoices,"|",", ") -> "4, 5, 6"
- To add "Inv. # " I use the "concat" function: concat("Inv. # ", replace(ListOfInvoices,"|",", ")) -> "Inv. # 4, 5, 6"