Complex Data Preparation
Hello everyone,
First of all, it's important to say that I've been following this forum for some time now, and it helped me a lot – so thank you!
Now it's finally my turn to ask for help, and I really hope you can help me out.
I'm working on a project that requires machine learning. I'm currently using an SVM model in Weka, while the data preparation is done in code.
Now I am tasked with transferring all the coded data preparations into RM, but I'm having difficulties with it.
I'll try to simplify the problem.
Let's say we are trying to predict which students will be suitable for the high school basketball team, using the age and height as attributes.
Basically I'm creating features for SVM using every combination of the attributes, in this case using two (in reality I'm currently up to four attributes, possibly more to come…)
1. I've tried all of the Loop operators to create nested loops, but the process became extremely cumbersome and eventually did not work.
foreach student : exampleSet // from repository
    foreach age : constAgeArray // [8, 9, 10]
        foreach height : constHeightArray // [130, 135, 140]
            if (student.age < age && student.height > height)
                // set feature BASKETBALL_POTENTIAL_{age}_{height} = 1
            else
                // set feature BASKETBALL_POTENTIAL_{age}_{height} = 0
2. Is it possible to define const-arrays in RM? Now I'm using additional exampleSets as arrays…
3. Should I even use RM for this kind of data preparation? Or is it best practice to do it by other means and import the result into RM for further use (i.e. classification and regression)?
4. I would be really grateful if someone could give an RM example for the above basketball data preparation.
Thanks in advance!!
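For readers following along, the nested loops in the pseudocode above can be sketched outside RM in a few lines of Python. This is only an illustration of the logic, not an RM process: the function name `add_potential_features`, the dictionary-based rows, and the two sample students are assumptions made up for the example. The threshold lists are the ones given in the pseudocode comments.

```python
from itertools import product

# Threshold "const arrays", taken from the comments in the pseudocode.
AGE_THRESHOLDS = [8, 9, 10]
HEIGHT_THRESHOLDS = [130, 135, 140]


def add_potential_features(students):
    """Return copies of the rows with one 0/1 feature per (age, height) pair.

    Hypothetical helper: each row is assumed to be a dict with
    numeric "age" and "height" entries.
    """
    rows = []
    for student in students:
        row = dict(student)  # copy so the input rows are left unchanged
        # product() replaces the two inner foreach loops.
        for age, height in product(AGE_THRESHOLDS, HEIGHT_THRESHOLDS):
            name = f"BASKETBALL_POTENTIAL_{age}_{height}"
            # Same condition as the pseudocode: younger than the age
            # threshold and taller than the height threshold.
            row[name] = int(student["age"] < age and student["height"] > height)
        rows.append(row)
    return rows


# Hypothetical sample data, for illustration only.
students = [
    {"name": "A", "age": 8, "height": 141},
    {"name": "B", "age": 10, "height": 128},
]
features = add_potential_features(students)
print(features[0]["BASKETBALL_POTENTIAL_9_135"])  # prints 1 (8 < 9 and 141 > 135)
```

With more attributes, the same pattern should extend by passing additional threshold lists to `product`. If the preparation is kept outside RM (question 3), rows like these could be written to a CSV file, for example with Python's `csv.DictWriter`, and then imported for modelling.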
Answers
I would really appreciate your help.
You can probably do this via existing operators; however, I think the process would be quite complex.
In this case I'd actually recommend the "Execute Script" operator (unless you want to run this on the Server often). I have created a small example of how this could look:
Input data:
Result:
Regards,
Marco
I'll check this as soon as I get to work.