Transfer learning
Dear All,
I have 2 datasets, 1 called source, 1 called target. The target and source dataset have identical features. The only difference is that the class labels are different.
labels(source) = {A, B}
labels(target) = {C, D}
I wish to build a model on my source dataset and then apply this model to my target dataset. Then I wish to append these predictions, as a new column, to the target dataset. How can I make RapidMiner do this? The ModelApplier operator insists that the labels be the same and raises an error when they differ.
Regards,
Wessel
Answers
Theoretically, there is no need for a label in the apply set at all, so you might simply remove it. The AttributeSubsetPreprocessing operator might help you if you apply the model inside it and filter out only the label.
Greetings,
Sebastian
I can create an attribute called "prediction(class)" and add this to a dataset.
But when I repeat this process, it adds a new "prediction(class attribute)" and the old one gets overwritten.
Also, I think it would be better to use IOStore instead of ARFF writes.
I guess I want to create a building block:
input: dataset SMALL, dataset BIG
output: dataset NEW_SMALL
function IM-transfer(dataset TARGET, dataset SOURCE) {
    NEW_TARGET = TARGET;
    NEW_SOURCE = SOURCE;
    do 3 times {
        NEW_TARGET = apply_model(j48(NEW_SOURCE), NEW_TARGET);
        NEW_SOURCE = apply_model(j48(NEW_TARGET), NEW_SOURCE);
    }
    return NEW_TARGET;
}
<operator name="Root" class="Process" expanded="yes">
    <operator name="target" class="ArffExampleSource">
        <parameter key="data_file" value="D:\wessel\Desktop\test1.arff"/>
        <parameter key="label_attribute" value="class"/>
    </operator>
    <operator name="W-J48 on target" class="W-J48">
        <parameter key="L" value="true"/>
    </operator>
    <operator name="source" class="ArffExampleSource">
        <parameter key="data_file" value="D:\wessel\Desktop\test2.arff"/>
        <parameter key="label_attribute" value="class"/>
    </operator>
    <operator name="target to source" class="ModelApplier">
        <list key="application_parameters">
        </list>
    </operator>
    <operator name="new source" class="ChangeAttributeRole">
        <parameter key="name" value="prediction(class)"/>
    </operator>
    <operator name="source writer" class="ArffExampleSetWriter">
        <parameter key="example_set_file" value="D:\wessel\Desktop\test2.arff"/>
    </operator>
    <operator name="W-J48 on source" class="W-J48">
        <parameter key="L" value="true"/>
    </operator>
    <operator name="target (2)" class="ArffExampleSource">
        <parameter key="data_file" value="D:\wessel\Desktop\test1.arff"/>
        <parameter key="label_attribute" value="class"/>
    </operator>
    <operator name="new source to target" class="ModelApplier">
        <list key="application_parameters">
        </list>
    </operator>
    <operator name="new target" class="ChangeAttributeRole">
        <parameter key="name" value="prediction(class)"/>
    </operator>
    <operator name="ArffExampleSetWriter" class="ArffExampleSetWriter">
        <parameter key="example_set_file" value="D:\wessel\Desktop\test1.arff"/>
    </operator>
</operator>
prediction(class) seems to be correct, since the label attribute is called class. But you are setting the old attribute to regular. For learning on the predictions, you have to set it to the role "label".
Please keep in mind that each role and each name can belong to only one attribute. If you set an attribute to a name or role that is already present, it will replace the old one. To avoid this, you need to change the role or the name of the old attribute first. It is possible to set arbitrary roles that do not correspond to the predefined ones, so "label_1" is a possible value, but RapidMiner's operators will ignore such roles.
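As a rough sketch of the two steps described above, the fragment below could be placed right after a ModelApplier: it first moves the original label to a custom role so that it is not replaced, and then gives the prediction the role "label". The parameter key "target_role" and the operator names are assumptions based on RapidMiner 4.x and should be checked against your version; "label_1" is just the example role from above.

```xml
<!-- Sketch only: the "target_role" key is assumed, not taken from the process above. -->
<!-- Step 1: park the original label in a custom role so it is not replaced. -->
<operator name="keep old label" class="ChangeAttributeRole">
    <parameter key="name" value="class"/>
    <parameter key="target_role" value="label_1"/>
</operator>
<!-- Step 2: use the predictions as the label for the next learner. -->
<operator name="prediction as label" class="ChangeAttributeRole">
    <parameter key="name" value="prediction(class)"/>
    <parameter key="target_role" value="label"/>
</operator>
```

Because operators ignore the custom role, the original class values stay in the example set but should not take part in the next learning step.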
Greetings,
Sebastian