
"Clustering with Loops?"

Prably Member Posts: 3 Contributor I
edited June 2019 in Help
Hi RM Masters!

I am a novice with RM and inexperienced with loops and macros. I need advice on how to structure a process that runs clustering in a loop. I am trying to get three centroids (low / medium / high) for each location and illness combination (see below). That way, when future data arrives about how long a contract from [location B] for [pain] is taking, I can tell whether it is taking too long, is on track, or is ahead of schedule.

I'm pretty sure I want to run clustering (k-means) in a loop over all unique combinations of the attributes Location and Illness. So I want three centroids for the [Location A & Ebola] subset, three centroids for [Location B & Cold], three for [Location C & Cold], etc. The attributes Milestone 1, Milestone 2, and Milestone Final are the numerical attributes I want to use for clustering.

My data set has about 13,000 examples, and I have some other polynominal attributes that aren't listed here.

Please forgive the formatting; here is a representative sample of the example set:


Contract ID  Location  Illness        Contract Status  Contract Type  Begin Date  Milestone 1  Milestone 2  Milestone Final
1            A         Ebola          Finished         Big            1/10/2013   78           133          154
2            A         Aids           Unfinished       Small          1/5/2009    1            125          162
3            A         Cold           Finished         Big            8/17/2012   40           118          214
7            B         Awesomeness    Finished         Small          9/27/2007   42           150          209
8            C         Upset Stomach  Unfinished       Small          12/20/2009  10           101          219
9            D         Ebola          Finished         Big            1/16/2009   9            111          246
10           D         Headache       Unfinished       Big            9/11/2005   57           127          238
11           D         Club Foot      Unfinished       Small          12/2/2005   55           141          204
12           D         Aids           Finished         Small          2/3/2012    15           106          191
13           D         Upset Stomach  Finished         Small          11/27/2009  48           103          194
14           D         Ebola          Finished         Big            5/18/2005   86           101          160
15           D         Ebola          Finished         Big            11/15/2009  7            148          164
16           D         Pain           Unfinished       Small          5/25/2005   29           117          242
18           D         Club foot      Unfinished       Big            4/28/2011   41           147          190
19           D         Club foot      Unfinished       Small          4/20/2007   48           113          229

Also, any thoughts on how to learn to work with loops and macros would be wonderful.

Thanks in advance for the advice!

Answers

  • jaysunice3401 Member Posts: 6 Contributor II
    This might help. First, use the Generate Concatenation operator to create a new attribute that concatenates Illness and Location (it is named Illness_Location in the process below). Then feed that into a Loop Values operator. Inside the loop's subprocess, filter on the new concatenated attribute. The trick is to use the %{loop_value} macro in the filter condition, that is, Illness_Location=%{loop_value}. Then just continue from there. Hope this helps.
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.3.000">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="5.3.000" expanded="true" name="Process">
       <process expanded="true" height="460" width="547">
         <operator activated="true" class="generate_concatenation" compatibility="5.3.000" expanded="true" height="76" name="Generate Concatenation" width="90" x="179" y="165">
           <parameter key="first_attribute" value="Illness"/>
           <parameter key="second_attribute" value="Location"/>
         </operator>
         <operator activated="true" class="loop_values" compatibility="5.3.000" expanded="true" height="76" name="Loop Values" width="90" x="313" y="165">
           <parameter key="attribute" value="Illness_Location"/>
           <process expanded="true" height="663" width="887">
             <operator activated="true" class="filter_examples" compatibility="5.3.000" expanded="true" height="76" name="Filter Examples" width="90" x="179" y="30">
               <parameter key="condition_class" value="attribute_value_filter"/>
               <parameter key="parameter_string" value="Illness_Location=%{loop_value}"/>
             </operator>
             <connect from_port="example set" to_op="Filter Examples" to_port="example set input"/>
             <connect from_op="Filter Examples" from_port="example set output" to_port="out 1"/>
             <portSpacing port="source_example set" spacing="0"/>
             <portSpacing port="sink_out 1" spacing="0"/>
             <portSpacing port="sink_out 2" spacing="0"/>
           </process>
         </operator>
         <connect from_op="Generate Concatenation" from_port="example set output" to_op="Loop Values" to_port="example set"/>
         <connect from_op="Loop Values" from_port="out 1" to_port="result 1"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
       </process>
     </operator>
    </process>
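    The process above stops at the filter step, so "continue from there" could look roughly like the following. This is only a sketch of the inner Loop Values subprocess, not a tested export: the Select Attributes and k-Means operators, their positions, and the choice to deliver the cluster model at the loop output are assumptions added for illustration, and the attribute names are taken from the sample data in the question.

    ```xml
    <!-- Sketch: replaces the <process> nested inside the Loop Values operator. -->
    <process expanded="true" height="663" width="887">
      <!-- Keep only the examples for the current Illness_Location value. -->
      <operator activated="true" class="filter_examples" compatibility="5.3.000" expanded="true" height="76" name="Filter Examples" width="90" x="45" y="30">
        <parameter key="condition_class" value="attribute_value_filter"/>
        <parameter key="parameter_string" value="Illness_Location=%{loop_value}"/>
      </operator>
      <!-- Assumption: cluster on the three numerical milestone attributes only. -->
      <operator activated="true" class="select_attributes" compatibility="5.3.000" expanded="true" height="76" name="Select Attributes" width="90" x="179" y="30">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attributes" value="Milestone 1|Milestone 2|Milestone Final"/>
      </operator>
      <!-- k = 3 for the low / medium / high centroids. -->
      <operator activated="true" class="k_means" compatibility="5.3.000" expanded="true" height="76" name="Clustering" width="90" x="313" y="30">
        <parameter key="k" value="3"/>
      </operator>
      <connect from_port="example set" to_op="Filter Examples" to_port="example set input"/>
      <connect from_op="Filter Examples" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
      <connect from_op="Select Attributes" from_port="example set output" to_op="Clustering" to_port="example set"/>
      <!-- Each iteration delivers one cluster model, which holds that subset's centroids. -->
      <connect from_op="Clustering" from_port="cluster model" to_port="out 1"/>
      <portSpacing port="source_example set" spacing="0"/>
      <portSpacing port="sink_out 1" spacing="0"/>
      <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    ```

    One caveat with this approach: a combination with fewer than three examples cannot produce three centroids. In the sample data, Location A & Ebola has a single example, so such subsets would need to be filtered out or handled separately before clustering.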