# Pivoting a table using Unique/Distinct Values

I have got data that looks like this table(I have simplified the table, but the idea is to get the number of unique product ids form unique columns and insert discount id where customer id meets product id, otherwise it will be a missing value(?)):

CustomerID | Product | Discount ID |
---|---|---|

Customer1 | Product1 | D1 |

Customer1 | Product2 | D3 |

Customer2 | Product1 | D2 |

Customer3 | Product3 | D3 |

I would like to turn this into this form:

CustomerID | Product1 | Product2 | Product3 |
---|---|---|---|

Customer1 | D1 | ? | ? |

Customer1 | ? | D3 | ? |

Customer2 | D2 | ? | ? |

Customer3 | ? | ? | D3 |

I have played around with the PIVOT operator but I am struggling to achieve this.

Thanks very much in advance!

0

## Answers

1,761UnicornSomething like this?

26MavenThanks T-Bone, not exactly in that format(see 2 rows for Customer1 in original post), thats how far I got using PIVOT operator, however in my case I'd need Customer1 to have 2 rows/examples; one for Product1 with with a value D1 while the rest are missing values and the second example with a value of D3 on Product2 with the rest as missing values. It is this bit of not being able to GROUP BY CustomerID which is making the whole thing complex.

I appreciate your efforts, thanks.

1,635UnicornWell @Chiko, this was an interesting problem. For starters, as already shown, it is quite easy to turn your original dataset into the following table structure, using just the pivot operator and then doing some renaming:

CustomerID Product1 Product2 Product3

Customer1 D1 D3 ?

Customer2 D2 ? ?

Customer3 ? ? D3

This has one row per customer id, and separate attributes for each product discount combination. But to get to the next version, to split it into separate rows for each customer id & product id combination, required several additional steps. The attached process file should do it for you (the initial read csv operator simply starts with your original data provided in the first format), yielding a final table with the structure you want.

However, this was a non-trivial exercise, and if you had a lot of different products, you would need to do more work to build it out. If I were you, I would think about whether the structure you are looking for is really necessary compared to the 2nd version above, which is quite simple to produce.

I hope this helps!

Lindon Ventures

Data Science Consulting from Certified RapidMiner Experts

26MavenThanks very much Telcontar120. Yes this is a hard one. My product run into thousands. I think I will prep the data outside RapidMiner using dynamic SQL.

Thank you very much!