"Clustering with Loops?"
Hi RM Masters!
I am a novice with RM and inexperienced with loops and macros. I need advice on how to structure a process that loops clustering. I am trying to get three centroids (low / medium / high) for each Location and Illness combination (see below). These will be used so that, when future data arrives about how long a contract from [Location B] for [Pain] is taking, I can tell whether it is taking too long, is on track, or is ahead of schedule.
I'm pretty sure I want to run clustering (k-means) in a loop over all unique combinations of the attributes Location and Illness. So I want to get three centroids for the [Location A & Ebola] subset, three centroids for [Location B & Cold], three for [Location C & Cold], etc. Milestone 1, Milestone 2, and Milestone Final are the numerical attributes I want to use for the clustering.
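To make the idea concrete outside RapidMiner, here is a rough sketch in plain Python of what I think the loop should do. It is only an illustration under several assumptions, not a RapidMiner process: the function names (`kmeans`, `centroids_per_group`, `classify`) are made up for this post, the k-means is a bare-bones Lloyd's algorithm with a deterministic start, Illness values are lower-cased so that "Club Foot" and "Club foot" land in the same subset, combinations with fewer than three examples are skipped, and the low / medium / high labels are assigned by sorting the centroids on Milestone Final.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k=3, max_iter=100):
    """Bare-bones Lloyd's k-means (assumes k >= 2); returns k centroids."""
    # Deterministic start: spread the initial centroids over the sorted points.
    ordered = sorted(points)
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def centroids_per_group(rows, k=3):
    """rows: iterable of (location, illness, milestone_1, milestone_2, milestone_final)."""
    groups = defaultdict(list)
    for location, illness, m1, m2, m_final in rows:
        # Assumption: illness names differing only in case are the same illness.
        key = (location, illness.strip().lower())
        groups[key].append((float(m1), float(m2), float(m_final)))

    result = {}
    for key, points in groups.items():
        if len(points) < k:
            continue  # not enough examples to form k clusters
        cents = sorted(kmeans(points, k), key=lambda c: c[-1])
        # Assumption: order the labels by Milestone Final.
        result[key] = dict(zip(("low", "medium", "high"), cents))
    return result


def classify(centroids, point):
    """Return the label of the centroid closest to a new contract's milestones."""
    return min(centroids, key=lambda label: dist(point, centroids[label]))


# A few Location D rows from the sample in this post.
rows = [
    ("D", "Ebola", 9, 111, 246),
    ("D", "Ebola", 86, 101, 160),
    ("D", "Ebola", 7, 148, 164),
    ("D", "Headache", 57, 127, 238),
]
per_group = centroids_per_group(rows)
for key, cents in per_group.items():
    print(key, cents)
```

Here the [D & Headache] subset is skipped because it has only one example. Mapping the low / medium / high labels to "ahead of schedule", "on track", and "too long" would be a separate decision; `classify` only reports which centroid a new contract is nearest to.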
My data set has about 13,000 examples, and I have some other polynominal attributes that aren't listed here.
Please forgive the formatting; here is a representative sample of the example set:
Contract ID | Location | Illness | Contract Status | Contract Type | Begin Date | Milestone 1 | Milestone 2 | Milestone Final
1 | A | Ebola | Finished | Big | 1/10/2013 | 78 | 133 | 154
2 | A | Aids | Unfinished | Small | 1/5/2009 | 1 | 125 | 162
3 | A | Cold | Finished | Big | 8/17/2012 | 40 | 118 | 214
7 | B | Awesomeness | Finished | Small | 9/27/2007 | 42 | 150 | 209
8 | C | Upset Stomach | Unfinished | Small | 12/20/2009 | 10 | 101 | 219
9 | D | Ebola | Finished | Big | 1/16/2009 | 9 | 111 | 246
10 | D | Headache | Unfinished | Big | 9/11/2005 | 57 | 127 | 238
11 | D | Club Foot | Unfinished | Small | 12/2/2005 | 55 | 141 | 204
12 | D | Aids | Finished | Small | 2/3/2012 | 15 | 106 | 191
13 | D | Upset Stomach | Finished | Small | 11/27/2009 | 48 | 103 | 194
14 | D | Ebola | Finished | Big | 5/18/2005 | 86 | 101 | 160
15 | D | Ebola | Finished | Big | 11/15/2009 | 7 | 148 | 164
16 | D | Pain | Unfinished | Small | 5/25/2005 | 29 | 117 | 242
18 | D | Club foot | Unfinished | Big | 4/28/2011 | 41 | 147 | 190
19 | D | Club foot | Unfinished | Small | 4/20/2007 | 48 | 113 | 229
Also, any thoughts on how to learn to work with loops and macros would be wonderful.
Thanks in advance for the advice!