How to dynamically select columns based on a row filter?
I have data laid out like a pivot table: countries (India, China, Japan, Singapore, etc.) on the rows, the sales reps of the respective countries on the columns, and the sales amount as the values at the intersection of rows and columns. When I build this as a pivot table in Excel and filter the rows for India (the equivalent of filtering rows/examples), both the columns (sales reps in India) and the rows (India) are filtered.
I've attached some sample data for reference. Can I replicate the same behavior in RapidMiner? Can someone please help?
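To make the desired behavior concrete, here is a minimal sketch in plain Python (not RapidMiner). The names, amounts, and the helper `filter_rows_and_columns` are made up for illustration, and it assumes that a rep with no sales in a country shows up as a missing value (`None`) in that row.

```python
# Hypothetical sample data: one row per country, one column per sales rep.
# None means "this rep has no sales in this country" (an assumption).
rows = [
    {"Country": "India", "Asha": 120, "Ravi": 80, "Wei": None, "Kenji": None},
    {"Country": "China", "Asha": None, "Ravi": None, "Wei": 200, "Kenji": None},
    {"Country": "Japan", "Asha": None, "Ravi": None, "Wei": None, "Kenji": 150},
]


def filter_rows_and_columns(rows, key, value):
    """Keep rows where row[key] == value, then drop every column
    that is empty (None) in all of the kept rows."""
    kept = [r for r in rows if r[key] == value]
    if not kept:
        return []
    # Assumes every row has the same set of columns.
    columns = [
        c for c in kept[0]
        if c == key or any(r[c] is not None for r in kept)
    ]
    return [{c: r[c] for c in columns} for r in kept]


india = filter_rows_and_columns(rows, "Country", "India")
print(india)
```

The point is that the column selection is derived from the filtered rows rather than typed in by hand.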
Did you check out the Aggregate and Pivot operators?
With Aggregate you can group by Region and Sales Rep, then Sum or Count, etc. You can then Pivot after that.
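As a rough illustration of that two-step idea (group and sum first, then pivot), here is a small sketch in plain Python rather than RapidMiner. The records and names are invented, and it assumes the data is available in long format with one record per sale.

```python
from collections import defaultdict

# Hypothetical long-format records: (country, sales_rep, amount).
sales = [
    ("India", "Asha", 100),
    ("India", "Asha", 20),
    ("India", "Ravi", 80),
    ("China", "Wei", 200),
]

# Step 1 (the "Aggregate" idea): sum the amount per (country, sales_rep) group.
totals = defaultdict(int)
for country, rep, amount in sales:
    totals[(country, rep)] += amount

# Step 2 (the "Pivot" idea): countries become rows, sales reps become columns.
pivot = defaultdict(dict)
for (country, rep), total in totals.items():
    pivot[country][rep] = total
pivot = dict(pivot)

print(pivot)
```

Each country ends up with only the reps that actually have sales there, which is close to what the Excel pivot filter shows.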
Thanks for your reply. I'm working on a dataset similar to the one above, in which about 90% of the attributes are categorical and 10% are numerical. Hence, Aggregate doesn't work.
A brief description of the data:
The data comes from different sensors, and each sensor captures specific information. I used the Union operator to get the data into this format. I now have 278 attributes and 12,500 records, of which 240 attributes are categorical and the rest are numerical. I'm trying to select particular sensors so that I can look for correlations or dependencies with other attributes and do exploratory analysis.
Is there any way to get only the columns related to the sensors I filter for?
P.S.: PGN-ID is the sensor name here.
Yes, you can do that in the Select Attributes operator, either by Subset or by Regular Expression. If the sensor columns always start with "PGN", you can just use a regular expression and enter "PGN.*" (without the quotes).
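To show what a pattern like that would pick up, here is a short sketch in plain Python. The column names and the helper `select_columns` are hypothetical, and it assumes the pattern has to match the whole column name, which is why it uses `re.fullmatch`.

```python
import re

# Hypothetical column names; only some belong to PGN sensors.
columns = [
    "Timestamp",
    "PGN-100_Temp",
    "PGN-100_Status",
    "VehicleID",
    "PGN-200_Speed",
]


def select_columns(columns, pattern):
    """Return the column names that the pattern matches in full."""
    regex = re.compile(pattern)
    return [c for c in columns if regex.fullmatch(c)]


# All sensor columns.
all_sensors = select_columns(columns, r"PGN.*")
# Only the columns of one (hypothetical) sensor.
one_sensor = select_columns(columns, r"PGN-100.*")

print(all_sensors)
print(one_sensor)
```

Narrowing the prefix, as in the second call, is one way to get just the columns of the sensor you are interested in.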