"An error in aggregation operator kmeans"
Hi,
This is my code:
<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="number_examples" value="25"/>
        <parameter key="target_function" value="random"/>
    </operator>
    <operator name="AttributeFilter" class="AttributeFilter">
        <parameter key="condition_class" value="attribute_name_filter"/>
        <parameter key="parameter_string" value="att.*"/>
    </operator>
    <operator name="KMeans" class="KMeans">
        <parameter key="k" value="3"/>
    </operator>
    <operator name="Aggregation" class="Aggregation">
        <list key="aggregation_attributes">
            <parameter key="att1" value="average"/>
        </list>
        <parameter key="group_by_attributes" value="cluster"/>
    </operator>
</operator>
I get the error that the attribute 'cluster' does not exist. But after running KMeans, a new attribute 'cluster' was created in the example set. So why does this error occur? Is Aggregation reading the initial input example set? How do I tell RM to read the data that was generated by applying KMeans?
Thanks, Shubha
Answers
This error happens because the Aggregation operator only searches through the regular attributes when matching attribute names. We will add a parameter work_on_special in the near future. Until then, you have to change the type of the cluster attribute to regular before applying Aggregation.
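A minimal sketch of that workaround, placed between the KMeans and Aggregation operators of the posted process. It assumes the ChangeAttributeRole operator with its name and target_role parameters is available in your RapidMiner version; please check the operator reference if it is named differently there.

```xml
<operator name="KMeans" class="KMeans">
    <parameter key="k" value="3"/>
</operator>
<!-- Assumption: ChangeAttributeRole exists in this version with these
     parameter keys. It turns the special 'cluster' attribute into a
     regular one so that Aggregation can find it by name. -->
<operator name="ChangeAttributeRole" class="ChangeAttributeRole">
    <parameter key="name" value="cluster"/>
    <parameter key="target_role" value="regular"/>
</operator>
<operator name="Aggregation" class="Aggregation">
    <list key="aggregation_attributes">
        <parameter key="att1" value="average"/>
    </list>
    <parameter key="group_by_attributes" value="cluster"/>
</operator>
```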
Kind regards,
Tobias
One more question: can I specify all the variables (att1, att2, att3, att4, att5) in the aggregation function? In the code I posted above, only att1 is used. I tried the regular expression att.*, but I get the error "The attribute 'att.*' doesn't exist". I am sure that I am missing something. What could it be?
Thanks again,
Shubha
Tobias
Thanks, Shubha
This does something different, I guess. But it will surely answer another question of mine. AttributeAggregation is something new I learned today. Thanks.
What I need is this: for each group of the nominal cluster attribute, I need the average of every 'att' attribute, without actually specifying each of the variables. (The above computes averages row-wise, but I need them column-wise.)
Secondly, unlike 'Aggregation', the operator 'AttributeAggregation' does not perform the operation group-wise.
Thirdly, if my attributes have different names, unlike att1, att2, ..., I can't use regular expressions either.
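For reference, the explicit version I am trying to avoid would look roughly like the sketch below. It assumes the attributes are named att1 to att5 and that the cluster attribute has already been changed to a regular attribute as suggested above.

```xml
<operator name="Aggregation" class="Aggregation">
    <list key="aggregation_attributes">
        <!-- one entry per attribute: column-wise average within each cluster -->
        <parameter key="att1" value="average"/>
        <parameter key="att2" value="average"/>
        <parameter key="att3" value="average"/>
        <parameter key="att4" value="average"/>
        <parameter key="att5" value="average"/>
    </list>
    <parameter key="group_by_attributes" value="cluster"/>
</operator>
```

This works for a handful of attributes, but the list has to be written out by hand, which is exactly what becomes impractical with many or irregularly named attributes.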
Thanking you,
Shubha
I can also see the application of feature here... Many thanks for clearing up all my queries.