Replicating Excel Pivot Table features
I often use Excel to inspect my process results more closely, for example cluster composition based on the attributes, or GLM predictions on a scoring set. E.g. what is the distribution of cities (the City attribute) within each cluster, or how are the predicted leads distributed across the business Category and Region attributes? I need to go back and forth like this many times until I'm satisfied that the scored data looks very similar to the training data, or that my clusters are distinct enough. As a habit, I do not blindly trust an algorithm or a computer program.
I have used the Pivot operator in RM, but unlike Excel it does not give me totals for the rows or columns of the pivot table. Second, I'm not sure how to switch the numbers to percentage of row/column total, or percentage of parent row/column total, on the fly as in Excel. I have been able to adjust the number of digits shown after the decimal point, but that's about it. Also, the index attribute defined in the Pivot operator does not get passed to the next set of operators. E.g. if the index attribute was cluster and the pivot table contained attributes cluster_0, cluster_1, 2, 3, etc., they do not get passed to the next operator (say, Generate Aggregation or Select Attributes); I have to type them in manually.
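To make the target output concrete, here is a minimal sketch in plain Python (standard library only, not RapidMiner) of the two Excel features described above: a count pivot with row and column totals, and the same table expressed as percentage of row total. The function names `pivot_counts` and `row_percentages`, the `Total` label, and the sample rows are all hypothetical and only illustrate the idea; the attribute names cluster and City are taken from the question.

```python
from collections import defaultdict


def pivot_counts(rows, index, column):
    """Count rows per (index, column) pair and append row and column totals."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row[index]][row[column]] += 1

    columns = sorted({row[column] for row in rows})
    result = {}
    for key in sorted(table):
        counts = {c: table[key][c] for c in columns}
        counts["Total"] = sum(counts.values())  # row total
        result[key] = counts

    # Column totals; the "Total" column of this row is the grand total.
    result["Total"] = {
        c: sum(result[k][c] for k in result) for c in columns + ["Total"]
    }
    return result


def row_percentages(pivot):
    """Express every cell as a percentage of its row total."""
    return {
        key: {
            c: (100.0 * v / counts["Total"] if counts["Total"] else 0.0)
            for c, v in counts.items()
        }
        for key, counts in pivot.items()
    }


# Hypothetical scored examples: one dict per row.
rows = [
    {"cluster": "cluster_0", "City": "Boston"},
    {"cluster": "cluster_0", "City": "Boston"},
    {"cluster": "cluster_0", "City": "Denver"},
    {"cluster": "cluster_1", "City": "Denver"},
]

pivot = pivot_counts(rows, "cluster", "City")
percentages = row_percentages(pivot)

for key, counts in pivot.items():
    print(key, counts)
```

Percentage of column total would follow the same pattern, dividing each cell by the matching entry of the `Total` row instead of the row's own `Total` entry.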
The reason I am looking for a solution inside RapidMiner is to avoid all the back and forth and having to recreate the same pivots over and over with just different data values. If I can build the whole thing inside RM, I just need to hit play every time, then stretch my arms and relax until it finishes and dumps all the pivot tables and other outputs into the results window. Thank you very much.