RapidMiner

Multicriteria Optimization based on different models

Contributor II

Hi,

Is it possible to apply three different models with three different target variables (A,B,C) on one dataset simultaneously?

I already built three different models based on the same dataset. One predicts target A, the second target B and the third target C.

My objective is to conduct a multicriteria optimization, so that I receive certain values for A, B and C.

Thank you very much in advance!

 

18 REPLIES
Community Manager

Re: Multicriteria Optimization based on different models

If I understand you correctly, you have already trained three different models on the same data set and now want to pass a scoring data set to all three of them? If so, just use a Multiply operator to make three copies of the scoring data and then use three Apply Model operators to apply your three models.

Regards,
Thomas - Community Manager
LinkedIn: Thomas Ott
Contributor II

Re: Multicriteria Optimization based on different models

Thank you for your quick reply!

Unfortunately, I already tried this solution. The problem is that I need only one result table, meaning one table including the predictions of target A, B and C.

The labels are polynominal; each has three categories: small, medium and large. My objective is to get, for each example in the data set, a predicted value for A, B and C.

 

Moderator

Re: Multicriteria Optimization based on different models

Hi,

 

Can't you just chain three Apply Model operators one after another, with a Set Role after each (to avoid collisions between the prediction roles)?
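
A minimal sketch of such a chain for the first two models, written as a fragment to place inside a `<process>` element. It assumes the first model was trained on a label named A with the classes small, medium and large, so that Apply Model produces `prediction(A)` plus `confidence(small)`, `confidence(medium)` and `confidence(large)`. The operator names, the new attribute names and the custom role names are made up for illustration, and the model inputs are left out. The Rename step is a precaution: the second model would produce confidence attributes with the same names, which may otherwise replace the first model's.

```xml
<!-- Sketch only: attribute, role and operator names below are assumptions. -->
<operator activated="true" class="apply_model" compatibility="7.2.002" expanded="true" name="Apply Model A">
  <list key="application_parameters"/>
</operator>
<!-- Give model A's confidence attributes unique names before model B adds its own. -->
<operator activated="true" class="rename" compatibility="7.2.002" expanded="true" name="Rename A">
  <parameter key="old_name" value="confidence(small)"/>
  <parameter key="new_name" value="A_confidence(small)"/>
  <list key="rename_additional_attributes">
    <parameter key="confidence(medium)" value="A_confidence(medium)"/>
    <parameter key="confidence(large)" value="A_confidence(large)"/>
  </list>
</operator>
<!-- Move model A's outputs to custom roles so the next Apply Model does not overwrite them. -->
<operator activated="true" class="set_role" compatibility="7.2.002" expanded="true" name="Set Role A">
  <parameter key="attribute_name" value="prediction(A)"/>
  <parameter key="target_role" value="prediction_A"/>
  <list key="set_additional_roles">
    <parameter key="A_confidence(small)" value="confidence_A_small"/>
    <parameter key="A_confidence(medium)" value="confidence_A_medium"/>
    <parameter key="A_confidence(large)" value="confidence_A_large"/>
  </list>
</operator>
<operator activated="true" class="apply_model" compatibility="7.2.002" expanded="true" name="Apply Model B">
  <list key="application_parameters"/>
</operator>
<connect from_op="Apply Model A" from_port="labelled data" to_op="Rename A" to_port="example set input"/>
<connect from_op="Rename A" from_port="example set output" to_op="Set Role A" to_port="example set input"/>
<connect from_op="Set Role A" from_port="example set output" to_op="Apply Model B" to_port="unlabelled data"/>
```

Repeating the Rename and Set Role pair after Apply Model B, before a third Apply Model, should leave a single example set that carries all three predictions.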

 

Best,

Martin

--------------------------------------------------------------------------
Head of Data Science Services at RapidMiner
Contributor II

Re: Multicriteria Optimization based on different models

Yes, but then I have three example sets. Is there no way to get only one?

Moderator

Re: Multicriteria Optimization based on different models

Hi,

I think the answer is a double no.

First of all, Multiply does not generate an in-memory copy of an example set. We work with a view concept, so the data will most likely not be copied.

Furthermore, you can simply chain the Apply Model operators. You just need to make sure to change the roles of the prediction and confidence attributes, because roles need to be unique. Have a look at the attached process. The only manual step is setting the roles correctly. That might be doable with a tiny script if it annoys you too much.

 

Best,

Martin

<?xml version="1.0" encoding="UTF-8"?><process version="7.5.003">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.5.003" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="7.5.003" expanded="true" height="68" name="Retrieve Sonar" width="90" x="45" y="187">
        <parameter key="repository_entry" value="//Samples/data/Sonar"/>
      </operator>
      <operator activated="true" class="multiply" compatibility="7.5.003" expanded="true" height="103" name="Multiply" width="90" x="179" y="187"/>
      <operator activated="true" class="set_role" compatibility="7.5.003" expanded="true" height="82" name="Set Role (2)" width="90" x="380" y="187">
        <parameter key="attribute_name" value="attribute_10"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="concurrency:cross_validation" compatibility="7.5.003" expanded="true" height="145" name="Validation (2)" width="90" x="514" y="187">
        <parameter key="sampling_type" value="shuffled sampling"/>
        <process expanded="true">
          <operator activated="true" class="h2o:generalized_linear_model" compatibility="7.5.000" expanded="true" height="103" name="Generalized Linear Model" width="90" x="179" y="34">
            <list key="beta_constraints"/>
            <list key="expert_parameters"/>
          </operator>
          <connect from_port="training set" to_op="Generalized Linear Model" to_port="training set"/>
          <connect from_op="Generalized Linear Model" from_port="model" to_port="model"/>
          <portSpacing port="source_training set" spacing="0"/>
          <portSpacing port="sink_model" spacing="0"/>
          <portSpacing port="sink_through 1" spacing="0"/>
        </process>
        <process expanded="true">
          <operator activated="true" class="apply_model" compatibility="7.5.003" expanded="true" height="82" name="Apply Model (2)" width="90" x="45" y="34">
            <list key="application_parameters"/>
          </operator>
          <operator activated="true" class="performance" compatibility="7.5.003" expanded="true" height="82" name="Performance (2)" width="90" x="179" y="34"/>
          <connect from_port="model" to_op="Apply Model (2)" to_port="model"/>
          <connect from_port="test set" to_op="Apply Model (2)" to_port="unlabelled data"/>
          <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Performance (2)" to_port="labelled data"/>
          <connect from_op="Performance (2)" from_port="performance" to_port="performance 1"/>
          <connect from_op="Performance (2)" from_port="example set" to_port="test set results"/>
          <portSpacing port="source_model" spacing="0"/>
          <portSpacing port="source_test set" spacing="0"/>
          <portSpacing port="source_through 1" spacing="0"/>
          <portSpacing port="sink_test set results" spacing="0"/>
          <portSpacing port="sink_performance 1" spacing="0"/>
          <portSpacing port="sink_performance 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="set_role" compatibility="7.5.003" expanded="true" height="82" name="Set Role" width="90" x="380" y="34">
        <parameter key="attribute_name" value="attribute_1"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="concurrency:cross_validation" compatibility="7.5.003" expanded="true" height="145" name="Validation" width="90" x="514" y="34">
        <parameter key="sampling_type" value="shuffled sampling"/>
        <process expanded="true">
          <operator activated="true" class="h2o:generalized_linear_model" compatibility="7.5.000" expanded="true" height="103" name="Generalized Linear Model (2)" width="90" x="313" y="34">
            <list key="beta_constraints"/>
            <list key="expert_parameters"/>
          </operator>
          <connect from_port="training set" to_op="Generalized Linear Model (2)" to_port="training set"/>
          <connect from_op="Generalized Linear Model (2)" from_port="model" to_port="model"/>
          <connect from_op="Generalized Linear Model (2)" from_port="weights" to_port="through 1"/>
          <portSpacing port="source_training set" spacing="0"/>
          <portSpacing port="sink_model" spacing="0"/>
          <portSpacing port="sink_through 1" spacing="0"/>
          <portSpacing port="sink_through 2" spacing="0"/>
        </process>
        <process expanded="true">
          <operator activated="true" class="apply_model" compatibility="7.5.003" expanded="true" height="82" name="Apply Model" width="90" x="45" y="34">
            <list key="application_parameters"/>
          </operator>
          <operator activated="true" class="performance" compatibility="7.5.003" expanded="true" height="82" name="Performance" width="90" x="179" y="34"/>
          <connect from_port="model" to_op="Apply Model" to_port="model"/>
          <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
          <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
          <connect from_op="Performance" from_port="performance" to_port="performance 1"/>
          <connect from_op="Performance" from_port="example set" to_port="test set results"/>
          <portSpacing port="source_model" spacing="0"/>
          <portSpacing port="source_test set" spacing="0"/>
          <portSpacing port="source_through 1" spacing="0"/>
          <portSpacing port="source_through 2" spacing="0"/>
          <portSpacing port="sink_test set results" spacing="0"/>
          <portSpacing port="sink_performance 1" spacing="0"/>
          <portSpacing port="sink_performance 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="retrieve" compatibility="7.5.003" expanded="true" height="68" name="Retrieve Sonar (2)" width="90" x="514" y="391">
        <parameter key="repository_entry" value="//Samples/data/Sonar"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="7.5.003" expanded="true" height="82" name="Apply Model (3)" width="90" x="715" y="289">
        <list key="application_parameters"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="7.5.003" expanded="true" height="82" name="Set Role (3)" width="90" x="849" y="289">
        <parameter key="attribute_name" value="prediction(attribute_10)"/>
        <parameter key="target_role" value="pra"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="7.5.003" expanded="true" height="82" name="Apply Model (4)" width="90" x="983" y="187">
        <list key="application_parameters"/>
      </operator>
      <connect from_op="Retrieve Sonar" from_port="output" to_op="Multiply" to_port="input"/>
      <connect from_op="Multiply" from_port="output 1" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Multiply" from_port="output 2" to_op="Set Role (2)" to_port="example set input"/>
      <connect from_op="Set Role (2)" from_port="example set output" to_op="Validation (2)" to_port="example set"/>
      <connect from_op="Validation (2)" from_port="model" to_op="Apply Model (3)" to_port="model"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Validation" to_port="example set"/>
      <connect from_op="Validation" from_port="model" to_op="Apply Model (4)" to_port="model"/>
      <connect from_op="Retrieve Sonar (2)" from_port="output" to_op="Apply Model (3)" to_port="unlabelled data"/>
      <connect from_op="Apply Model (3)" from_port="labelled data" to_op="Set Role (3)" to_port="example set input"/>
      <connect from_op="Set Role (3)" from_port="example set output" to_op="Apply Model (4)" to_port="unlabelled data"/>
      <connect from_op="Apply Model (4)" from_port="labelled data" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
RMStaff

Re: Multicriteria Optimization based on different models

You have to consider the correlation between the variables before you can apply this approach. If the features used to predict A are independent of the features used to predict B (and so forth for C, D, etc.), then it is safe to apply separate models for each of them. If that is not the case, you will have biased predictions.

 

Take a look at this Stack Exchange post:

 

https://stats.stackexchange.com/questions/18151/methods-to-predict-multiple-dependent-variables

 

EDIT: Actually, you can apply the different models and then look for correlations between the residuals. If you observe no correlation, then the approach is sound.
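
To make the residual check concrete, here is one way to write it down, assuming numeric targets. For nominal labels such as small, medium and large, a residual would first have to be defined, for example through an ordinal encoding. The symbols are introduced here for illustration only: $y_{A,i}$ is the true value of target A for example $i$, $\hat{y}_{A,i}$ is the model's prediction, and $n$ is the number of examples.

```latex
e_{A,i} = y_{A,i} - \hat{y}_{A,i}, \qquad e_{B,i} = y_{B,i} - \hat{y}_{B,i}

r_{AB} = \frac{\sum_{i=1}^{n} (e_{A,i} - \bar{e}_A)(e_{B,i} - \bar{e}_B)}
              {\sqrt{\sum_{i=1}^{n} (e_{A,i} - \bar{e}_A)^2}\,\sqrt{\sum_{i=1}^{n} (e_{B,i} - \bar{e}_B)^2}}
```

Values of $r_{AB}$, $r_{AC}$ and $r_{BC}$ close to zero would be the "no correlation" case mentioned above.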

 

 

Best,

Sebastian

Contributor II

Re: Multicriteria Optimization based on different models

Thank you! I think I understood what you suggested.

However, I am still struggling with the solution. I tried to import your process but it does not work with my version (7.2.002).

By the three example sets, I meant that I received three data sets with predicted values, so this had nothing to do with the "multiply" operator.

Furthermore, I want to apply my already built models, so I have three operators named "retrieve model". When I chain the "apply model" operators together with the "set role" operators, I only receive the example set with the last label. This is not surprising, as the data set passed into the "apply model" operator is usually unlabeled.

 

Do you have another idea how to solve this problem?

 

Moderator

Re: Multicriteria Optimization based on different models

Hi,

 

Any chance you can post a dummy process (or two)?

 

Best,

Martin

Contributor II

Re: Multicriteria Optimization based on different models

Does this one help?

<?xml version="1.0" encoding="UTF-8"?><process version="7.2.002">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.2.002" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="7.2.002" expanded="true" height="68" name="Retrieve dataset xy" width="90" x="45" y="85">
        <parameter key="repository_entry" value="//dataset xy"/>
      </operator>
      <operator activated="true" class="retrieve" compatibility="7.2.002" expanded="true" height="68" name="Retrieve Model_small" width="90" x="246" y="340">
        <parameter key="repository_entry" value="//Model_small"/>
      </operator>
      <operator activated="true" class="retrieve" compatibility="7.2.002" expanded="true" height="68" name="Retrieve Model_large" width="90" x="45" y="187">
        <parameter key="repository_entry" value="//Model_large"/>
      </operator>
      <operator activated="true" class="retrieve" compatibility="7.2.002" expanded="true" height="68" name="Retrieve Model_medium" width="90" x="112" y="289">
        <parameter key="repository_entry" value="//Model_Medium"/>
      </operator>
      <operator activated="true" class="select_attributes" compatibility="7.2.002" expanded="true" height="82" name="Select Attributes" width="90" x="112" y="34">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attributes" value="........."/>
      </operator>
      <operator activated="true" class="set_role" compatibility="7.2.002" expanded="true" height="82" name="Set Role" width="90" x="246" y="34">
        <parameter key="attribute_name" value="Class_Large"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="7.2.002" expanded="true" height="82" name="Apply Model (2)" width="90" x="380" y="34">
        <list key="application_parameters"/>
      </operator>
      <operator activated="true" class="select_attributes" compatibility="7.2.002" expanded="true" height="82" name="Select Attributes (2)" width="90" x="447" y="34">
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="7.2.002" expanded="true" height="82" name="Set Role (2)" width="90" x="581" y="34">
        <parameter key="attribute_name" value="Class_Medium"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="7.2.002" expanded="true" height="82" name="Apply Model" width="90" x="313" y="187">
        <list key="application_parameters"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="7.2.002" expanded="true" height="82" name="Set Role (3)" width="90" x="447" y="187">
        <parameter key="attribute_name" value="Class_Small"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="7.2.002" expanded="true" height="82" name="Apply Model (3)" width="90" x="715" y="136">
        <list key="application_parameters"/>
      </operator>
      <connect from_op="Retrieve dataset xy" from_port="output" to_op="Select Attributes" to_port="example set input"/>
      <connect from_op="Retrieve Model_small" from_port="output" to_op="Apply Model (3)" to_port="model"/>
      <connect from_op="Retrieve Model_large" from_port="output" to_op="Apply Model (2)" to_port="model"/>
      <connect from_op="Retrieve Model_medium" from_port="output" to_op="Apply Model" to_port="model"/>
      <connect from_op="Select Attributes" from_port="example set output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Apply Model (2)" to_port="unlabelled data"/>
      <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Select Attributes (2)" to_port="example set input"/>
      <connect from_op="Select Attributes (2)" from_port="original" to_op="Set Role (2)" to_port="example set input"/>
      <connect from_op="Set Role (2)" from_port="example set output" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Apply Model" from_port="labelled data" to_op="Set Role (3)" to_port="example set input"/>
      <connect from_op="Set Role (3)" from_port="example set output" to_op="Apply Model (3)" to_port="unlabelled data"/>
      <connect from_op="Apply Model (3)" from_port="labelled data" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>