MultipleLabelIterator not allowing new iterations to overwrite values
I think I've developed a simple example which shows that MultipleLabelIterator is somehow not overwriting values defined in previous iterations with new data. Please correct me if I've got this wrong:
What this example is trying to do is:
1) Create a simple example set with two labels (label1, label2)
2) Use MultipleLabelIterator to do the following on each label
3) Run Linear Regression
4) Apply model
5) Compute a new attribute based on value of the predictions of the model.
The problem is that, in step 5, the prediction attribute changes name with each iteration ("prediction(label_1)", then "prediction(label_2)"). So what we ideally want to do is specify "prediction(label_%{a})". However, that doesn't work inside a FeatureGeneration computation, so the workaround suggested in another thread was to rename the attribute to a static name first. The example therefore renames it to "pred_val" and then uses a two-step FeatureGeneration to produce a final value called "pred_val_sq". Since I want to retain this for each run, I then rename it to "pred_val_sq_%{a}".
This works on the first iteration. On the second iteration, however, the newly generated "pred_val_sq" still holds the values from the first iteration, even though the earlier attribute had already been renamed to "pred_val_sq_1".
I realize this explanation is somewhat convoluted. Hopefully it will be clearer once the operator chain is run, and you see that "pred_val_sq_1" and "pred_val_sq_2" have identical values.
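For comparison, the result that steps 1 to 5 are meant to produce can be sketched outside RapidMiner in plain Python. This is only an illustration under simplifying assumptions: a single attribute instead of att1 to att5, made-up data, and hypothetical helper names (`fit_line`, `derive_per_label`). It is not how the operators are implemented.

```python
# Illustration only: plain Python, not RapidMiner. Data and helper names are made up.

def fit_line(xs, ys):
    """Ordinary least squares for a single attribute: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def derive_per_label(att, labels):
    """For each label: fit, predict, then compute (prediction + 1)^2 from scratch."""
    columns = {}
    for i, ys in enumerate(labels.values(), start=1):
        slope, intercept = fit_line(att, ys)
        pred_val = [slope * x + intercept for x in att]
        pred_val_step1 = [p + 1 for p in pred_val]
        # Store the results under iteration-specific names, as the renames do.
        columns[f"pred_val_{i}"] = pred_val
        columns[f"pred_val_sq_{i}"] = [s * s for s in pred_val_step1]
    return columns


att1 = [0.0, 1.0, 2.0, 3.0]
labels = {
    "label_1": [1.0, 3.0, 5.0, 7.0],     # 2 * att1 + 1
    "label_2": [0.0, -1.0, -2.0, -3.0],  # -att1
}
result = derive_per_label(att1, labels)
```

Because the derived values are recomputed from each iteration's own predictions, `result["pred_val_sq_1"]` and `result["pred_val_sq_2"]` differ here, which is the outcome expected from the process as well.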
Thanks for any assistance.
<operator name="Root" class="Process" expanded="yes">
  <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
    <parameter key="target_function" value="polynomial"/>
  </operator>
  <operator name="Change name: label to label1" class="ChangeAttributeRole">
    <parameter key="name" value="label"/>
    <parameter key="target_role" value="label1"/>
  </operator>
  <operator name="Change role of label1 to label_1" class="ChangeAttributeName">
    <parameter key="new_name" value="label_1"/>
    <parameter key="old_name" value="label"/>
  </operator>
  <operator name="Create attrib: label2" class="FeatureGeneration">
    <list key="functions">
      <parameter key="label_2" value="+(att1,+(att2,+(att3,+(att4,att5))))"/>
    </list>
    <parameter key="keep_all" value="true"/>
  </operator>
  <operator name="Change role of label2 to label_2" class="ChangeAttributeRole">
    <parameter key="name" value="label_2"/>
    <parameter key="target_role" value="label2"/>
  </operator>
  <operator name="MultipleLabelIterator" class="MultipleLabelIterator" expanded="yes">
    <operator name="LinearRegression" class="LinearRegression">
      <parameter key="keep_example_set" value="true"/>
    </operator>
    <operator name="ModelApplier" class="ModelApplier">
      <list key="application_parameters">
      </list>
    </operator>
    <operator name="Remove iteration# from prediction name" class="ChangeAttributeName">
      <parameter key="new_name" value="pred_val"/>
      <parameter key="old_name" value="prediction(label_%{a})"/>
    </operator>
    <operator name="Create value derived from pred_val" class="FeatureGeneration" breakpoints="before,after">
      <list key="functions">
        <parameter key="pred_val_step1" value="+(pred_val,const[1]())"/>
        <parameter key="pred_val_sq" value="*(pred_val_step1,pred_val_step1)"/>
      </list>
      <parameter key="keep_all" value="true"/>
    </operator>
    <operator name="Change name to add iteration# back to pred_val" class="ChangeAttributeName">
      <parameter key="new_name" value="pred_val_%{a}"/>
      <parameter key="old_name" value="pred_val"/>
    </operator>
    <operator name="Rename derived value to add iteration#" class="ChangeAttributeName">
      <parameter key="new_name" value="pred_val_sq_%{a}"/>
      <parameter key="old_name" value="pred_val_sq"/>
    </operator>
  </operator>
</operator>
Keith
Answers
Although I must admit I have not yet fully checked your problem, just a remark: we are currently reworking the feature generation functionality. One reason is that it simply does not work properly in all cases, especially when it is combined with operators that transform example sets by adding or removing attributes. It may be that this is the problem here as well.
Cheers,
Tobias
I just wanted to let you know that this behaviour is indeed a result of the combination of the loop and the feature generation - the values are not re-created in the second (and following) loops due to an optimization in the feature generation operator. As Tobias said, we are currently in the process of re-implementing the whole feature generation approach and this behaviour will change for the next major upgrade.
Cheers,
Ingo
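To make the described failure mode concrete, here is a toy model in plain Python of how a reuse optimization inside a loop can yield stale values. The thread only states that an optimization prevents the values from being re-created; the name-keyed cache below, the `CachingGenerator` class, and the `reuse` flag are all assumptions made for illustration and do not reflect RapidMiner's actual code.

```python
# Hypothetical model of the failure; NOT RapidMiner's actual implementation.

class CachingGenerator:
    """Generates a derived column, optionally reusing values computed earlier."""

    def __init__(self, reuse=True):
        self.reuse = reuse
        self._cache = {}  # assumed cache: target attribute name -> values

    def generate(self, table, name, source, func):
        # The assumed "optimization": skip recomputation if the name was seen before.
        if not (self.reuse and name in self._cache):
            self._cache[name] = [func(v) for v in table[source]]
        table[name] = list(self._cache[name])


def run_loop(generator, predictions):
    """Mimics the loop body: static name, derive, then rename per iteration."""
    table = {}
    for i, pred in enumerate(predictions, start=1):
        table["pred_val"] = list(pred)
        generator.generate(table, "pred_val_sq", "pred_val", lambda v: (v + 1) ** 2)
        # Rename both columns so the static names are free for the next iteration.
        table[f"pred_val_{i}"] = table.pop("pred_val")
        table[f"pred_val_sq_{i}"] = table.pop("pred_val_sq")
    return table


predictions = [[1.0, 3.0], [0.0, -1.0]]  # made-up predictions for two labels
stale = run_loop(CachingGenerator(reuse=True), predictions)
fresh = run_loop(CachingGenerator(reuse=False), predictions)
```

With `reuse=True`, `stale["pred_val_sq_1"]` and `stale["pred_val_sq_2"]` come out identical, matching the symptom reported above; with `reuse=False`, the second column is recomputed from the second iteration's predictions.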