MetaData from port vs. new ExampleSetMetaData

HeikoPaulheim Member Posts: 13 Contributor II
edited November 2018 in Help
Hi,

I have an operator which runs in a loop, and inside the loop different sets of attributes are added (essentially, we want to evaluate how an algorithm behaves when we add 5, 10, 20, or 50 random attributes to the dataset).

Now, I have found that the data and the meta data are not always in sync when the operator is called. Inside doWork(), the method

inputPort.getMetaData()
sometimes delivers meta data from the previous loop, while

new ExampleSetMetaData(inputPort.getData(ExampleSet.class))
delivers the correct meta data. Something similar happens in the operator's metadata transformer.

However, although it seems to work, I find that solution a bit scary, since I am not sure which side effects it might trigger. What would be the clean way to get correct meta data?

Looking forward to your help,
Heiko
Answers

  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,993 RM Engineering
    Hi,

    MetaData is only meant to be used during process design time. It is not used at all when executing an operator. When you are inside your own doWork() method, you have the actual data at your disposal, so why not check the actual data directly?

    Regards,
    Marco
  • HeikoPaulheim Member Posts: 13 Contributor II
    Hi Marco,

    OK, this may sound a bit weird, but: my operator essentially calls a nested subprocess with different manipulations of the meta data. More specifically, I set each attribute as a label attribute one after another, and call a subprocess that tries to predict that label from all other attributes, i.e., one subprocess for each attribute.

    In order to achieve this, I need to read and manipulate metadata in a loop inside the doWork() method. Simply carrying out all metadata operations in transformMetadata() wouldn't do the trick, as the meta data is dynamically transformed inside the loop in doWork().

    If there's any way to achieve this *without* using metadata in the doWork() method, I am happy to hear about it.

    Best,
    Heiko
  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,993 RM Engineering
    Hi,

    I assume you are creating ExecutionUnits for each of your subprocesses? If so, you can just call

    operator.addValue(new Value(key, value));
    on the operator you pass into the ExecutionUnit. Your own operator can then read said value by calling

    operator.getValue(key);
    and for example set the label to the attribute named as the value.

    Regards,
    Marco
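
The loop Heiko describes (each attribute becomes the label once, and all other attributes are used to predict it) can be sketched in plain Java without touching port meta data. This is only an illustration under stated assumptions: LabelLoopSketch, Round and rounds are hypothetical names and not part of the RapidMiner API, and the attribute names are assumed to have been read from the actual data inside doWork(), as Marco suggests, rather than from inputPort.getMetaData().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in classes; nothing here is RapidMiner API.
class LabelLoopSketch {

    /** One loop iteration: the attribute used as label plus the remaining regular attributes. */
    static final class Round {
        final String label;
        final List<String> regular;

        Round(String label, List<String> regular) {
            this.label = label;
            this.regular = regular;
        }
    }

    /** Builds one round per attribute from the names actually present in the data. */
    static List<Round> rounds(List<String> attributeNames) {
        List<Round> result = new ArrayList<>();
        for (String candidate : attributeNames) {
            // Copy the list so every round starts from the full set of attributes.
            List<String> regular = new ArrayList<>(attributeNames);
            regular.remove(candidate);
            result.add(new Round(candidate, regular));
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed example names; in an operator these would come from the live data.
        for (Round r : rounds(Arrays.asList("a", "b", "c"))) {
            System.out.println("label=" + r.label + " regular=" + r.regular);
        }
    }
}
```

In an actual operator, each round's label name could then be handed to the subprocess, for example via the value mechanism Marco mentions, so that the loop never depends on design-time meta data being up to date.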