Transfer learning

wessel Member Posts: 537 Maven
edited November 2018 in Help
Dear All,

I have two datasets, one called source and one called target. Both have identical features; the only difference is that their class labels are different.
labels(source) = {A, B}
labels(target) = {C, D}


I wish to build a model on my source dataset and then apply this model to my target dataset. Then I wish to append the predictions, as a new column, to the target dataset. How can I make RapidMiner do this? The ModelApplier operator insists that the labels be the same, and throws an error when they differ.

Regards,

Wessel

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    In theory, there is no need to have labels in the apply set at all, so you can simply remove the label. The AttributeSubsetPreprocessing operator might help you here: apply the model inside it and filter out only the label.

    Greetings,
      Sebastian
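The idea in this answer (strip the label before applying the model, then append the predictions as a new column) can be sketched outside RapidMiner. This is a minimal Python illustration, not RapidMiner code: the 1-nearest-neighbour learner is a stand-in for J48, and the function names, column names `x1`/`x2`, and toy rows are all assumptions made for the example.

```python
import math

def train_1nn(rows, features, label):
    """Stand-in learner (not J48): memorise feature vectors with their labels."""
    return [(tuple(r[f] for f in features), r[label]) for r in rows]

def predict_1nn(model, x):
    """Return the label of the closest memorised row (Euclidean distance)."""
    return min(model, key=lambda m: math.dist(m[0], x))[1]

def apply_without_label(model, rows, features, label, out="prediction(class)"):
    """Ignore the apply set's own label and append predictions as a new column."""
    result = []
    for r in rows:
        stripped = {k: v for k, v in r.items() if k != label}  # label removed
        pred = predict_1nn(model, tuple(stripped[f] for f in features))
        result.append({**r, out: pred})  # original label kept next to the prediction
    return result

# Hypothetical toy data: identical features, different label sets.
source = [{"x1": 0.0, "x2": 0.0, "class": "A"},
          {"x1": 1.0, "x2": 1.0, "class": "B"}]
target = [{"x1": 0.1, "x2": 0.2, "class": "C"},
          {"x1": 0.9, "x2": 0.8, "class": "D"}]

model = train_1nn(source, ["x1", "x2"], "class")
new_target = apply_without_label(model, target, ["x1", "x2"], "class")
```

Because the model never sees the target's label, it does not matter that the target uses {C, D} while the model predicts {A, B}.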
  • wessel Member Posts: 537 Maven
    This seems to be a lot harder than I thought.

    I can create an attribute called "prediction(class)" and add this to a dataset.
    But when I repeat this process, it adds a new "prediction(class attribute)" and the old one gets overwritten.

    Also, I think it is better to use IOStore instead of ARFF writes.
    I guess I want to create a building block:
    input:  dataset SMALL, dataset BIG
    output: dataset NEW_SMALL

    function IM-transfer(dataset TARGET, dataset SOURCE) {
          NEW_TARGET = TARGET;
          NEW_SOURCE = SOURCE;

          do 3 times {
                NEW_TARGET = apply_model(j48(NEW_SOURCE), NEW_TARGET);
                NEW_SOURCE = apply_model(j48(NEW_TARGET), NEW_SOURCE);
          }

          return NEW_TARGET
    }


    <operator name="Root" class="Process" expanded="yes">
        <operator name="target" class="ArffExampleSource">
            <parameter key="data_file" value="D:\wessel\Desktop\test1.arff"/>
            <parameter key="label_attribute" value="class"/>
        </operator>
        <operator name="W-J48 on target" class="W-J48">
            <parameter key="L" value="true"/>
        </operator>
        <operator name="source" class="ArffExampleSource">
            <parameter key="data_file" value="D:\wessel\Desktop\test2.arff"/>
            <parameter key="label_attribute" value="class"/>
        </operator>
        <operator name="target to source" class="ModelApplier">
            <list key="application_parameters">
            </list>
        </operator>
        <operator name="new source" class="ChangeAttributeRole">
            <parameter key="name" value="prediction(class)"/>
        </operator>
        <operator name="source writer" class="ArffExampleSetWriter">
            <parameter key="example_set_file" value="D:\wessel\Desktop\test2.arff"/>
        </operator>
        <operator name="W-J48 on source" class="W-J48">
            <parameter key="L" value="true"/>
        </operator>
        <operator name="target (2)" class="ArffExampleSource">
            <parameter key="data_file" value="D:\wessel\Desktop\test1.arff"/>
            <parameter key="label_attribute" value="class"/>
        </operator>
        <operator name="new source to target" class="ModelApplier">
            <list key="application_parameters">
            </list>
        </operator>
        <operator name="new target" class="ChangeAttributeRole">
            <parameter key="name" value="prediction(class)"/>
        </operator>
        <operator name="ArffExampleSetWriter" class="ArffExampleSetWriter">
            <parameter key="example_set_file" value="D:\wessel\Desktop\test1.arff"/>
        </operator>
    </operator>
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    The name prediction(class) seems to be correct, since the label attribute is called class. However, you are setting that attribute's role to regular. To learn on the predictions, you have to set its role to "label".
    Please keep in mind that each role and each name can be held by only one attribute at a time. If you assign an attribute a name or role that is already in use, it will replace the old attribute. To avoid this, change the role or the name of the old attribute first. It is also possible to set arbitrary roles that do not correspond to the predefined ones, so "label_1" is a possible value, but such roles will be ignored by RapidMiner's operators.

    Greetings,
      Sebastian
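The advice above (move the old label out of the way first, then make the prediction the label) can be illustrated with a small Python sketch. It is only an analogy using plain dictionaries, not RapidMiner's role mechanism; the function name, the `class_original` column name, and the toy rows are assumptions for the example.

```python
def promote_prediction_to_label(rows, label="class",
                                pred="prediction(class)",
                                keep_as="class_original"):
    """Rename the old label first, then let the prediction take over the label name."""
    promoted = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k not in (label, pred)}
        new_row[keep_as] = r[label]  # old label survives under a new name
        new_row[label] = r[pred]     # prediction becomes the label to learn on
        promoted.append(new_row)
    return promoted

# Hypothetical source rows after a model trained on the target was applied.
source = [
    {"x1": 0.0, "class": "A", "prediction(class)": "C"},
    {"x1": 1.0, "class": "B", "prediction(class)": "D"},
]
new_source = promote_prediction_to_label(source)
```

Writing the prediction straight into `class` without the rename would overwrite the original A/B labels, which is the same kind of replacement described above for attributes that share a name or role.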