Newbie Needs Direction
I'm new to this. Within my data set there are subsets of rows defined by a unique ID. Each ID represents an independent event. How do I set up a process that first treats each ID independently and then applies models across the events?
Best Answer
Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
Correct, but you can easily create that using the "Generate ID" operator first, which will assign a unique id to every row, and then run the Pivot operator after that. And your problem should be solved!
Answers
Hello @dhc and welcome! It would probably help if you posted a small example of your data, so we can be sure we are interpreting your explanation properly and understand the specific structure of your data. In general, it sounds like you want to do one of two things. You can pivot the data (use the "Pivot" operator), which takes the multiple sub-events, puts them together into a single row based on the unique ID for the event, and keeps all the detailed data associated with each sub-event in separate variables. Or, if the attributes are all numeric and you only need summary values such as the sum, average, or count, you can use the "Aggregate" operator. Either way, you will end up with a dataset that has only as many rows as you have unique event IDs, and at that point you should be able to apply standard modeling techniques. Don't forget that for supervised learning you'll also need to define your label (outcome) variable using the "Set Role" operator.
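For anyone who finds it easier to see the "Aggregate" idea outside of RapidMiner, here is a minimal sketch in plain Python. It only illustrates the concept of collapsing many rows per event ID into one summary row; it is not what the operator does internally. The column names (race_id, odds, weight), the sample values, and the helper aggregate_by_event are all made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data: several rows per event (race), one row per entrant.
rows = [
    {"race_id": "R1", "odds": 2.5, "weight": 126.0},
    {"race_id": "R1", "odds": 6.0, "weight": 122.0},
    {"race_id": "R2", "odds": 3.0, "weight": 124.0},
    {"race_id": "R2", "odds": 4.0, "weight": 120.0},
    {"race_id": "R2", "odds": 8.0, "weight": 118.0},
]

def aggregate_by_event(rows, id_key, numeric_keys):
    """Collapse all rows sharing an ID into one summary row per ID."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[id_key]].append(row)

    summary = []
    for event_id, members in groups.items():
        out = {id_key: event_id, "count": len(members)}
        for key in numeric_keys:
            values = [m[key] for m in members]
            out[key + "_sum"] = sum(values)
            out[key + "_avg"] = mean(values)
        summary.append(out)
    return summary

per_race = aggregate_by_event(rows, "race_id", ["odds", "weight"])
for race in per_race:
    print(race)
```

The result has one row per unique race_id, which is the shape you need before applying a model across events.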
@stevefarr you may want to move this to the product help section rather than community news.
Best,
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
Yes, I agree I should have started in another topic. How do I move it?
Brian, thanks. Here is a screenshot (it doesn't show the label attribute).
I'm mining horse racing data. Each value in column A represents a race, so the remaining attributes are relevant only in the context of that race.
I just explored the Pivot operator. It looks like I need a unique identifier within each group, correct?
Thanks @Telcontar
And may I add my welcome here too @dhc
The results were unwieldy. The IDs need to restart at 1 for each "primary key", so I added an attribute that serves that purpose. I'm not sure Pivot is the way to go. Anyway, thanks for the help. I'll keep trying.
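To make the last point concrete, here is a rough sketch in plain Python of an index that restarts at 1 within each race, followed by a pivot to one row per race. This is only an illustration of the idea discussed above, not RapidMiner's implementation; the names race_id, horse, odds, position, and both helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical data: one row per entrant, several entrants per race.
rows = [
    {"race_id": "R1", "horse": "Alpha", "odds": 2.5},
    {"race_id": "R1", "horse": "Bravo", "odds": 6.0},
    {"race_id": "R2", "horse": "Comet", "odds": 3.0},
    {"race_id": "R2", "horse": "Delta", "odds": 4.0},
    {"race_id": "R2", "horse": "Echo", "odds": 8.0},
]

def add_position_in_group(rows, id_key, index_key="position"):
    """Number the rows 1, 2, 3, ... separately within each ID."""
    counters = defaultdict(int)
    numbered = []
    for row in rows:
        counters[row[id_key]] += 1
        numbered.append({**row, index_key: counters[row[id_key]]})
    return numbered

def pivot_wide(rows, id_key, index_key, value_keys):
    """Build one row per ID with columns such as odds_1, odds_2, ..."""
    wide = {}
    for row in rows:
        target = wide.setdefault(row[id_key], {id_key: row[id_key]})
        for key in value_keys:
            target[f"{key}_{row[index_key]}"] = row[key]
    return list(wide.values())

numbered = add_position_in_group(rows, "race_id")
wide = pivot_wide(numbered, "race_id", "position", ["horse", "odds"])
for race in wide:
    print(race)
```

Note that races with fewer entrants simply lack the higher-numbered columns (horse_3 and odds_3 are absent for R1 here), so the wide table fills up with missing values as field sizes vary. That is one reason a pivoted result can get unwieldy, and why summary values per race may be easier to model.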