Multiple Labels for Binary Classification Problems in one model

rodienne_zammit · February 2019

Hello,

sgenzer I read them to the end!

Current approach: Separate models for different binary labels
I have prepared a decision tree model which correctly predicts a binary label for product A when product A is used as the label.

For product B, I re-run the process to train a similar model with B as the label.

Then I would need to train another model to predict product C, using C as the label. This continues, since in reality I have more products.

Desired approach: One model to predict different binary labels
Is there a way I can combine this into one model so that the model can tell me the binary predictions (true/false) for Product A, B and C in one go? This would be ideal when applying the model on new data so that I don't need to run all separate product models.

I tried to use "loop label", but this loops over the labels to create different models, and I did not find a way to apply the resulting models to new data. I could not find a way to loop over the labels on new data and apply each model (a "loop model" operator doesn't exist).

Maybe I could achieve this by combining the different binary classification values into one value?
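
One way to picture this idea, as a minimal sketch rather than a RapidMiner process: the true/false values of all products are joined into a single class value, a single multi-class model is trained on that value, and the prediction is split back into one value per product. The product names and the helper functions `combine_labels` and `split_labels` are hypothetical, used only for illustration.

```python
from itertools import product

PRODUCTS = ["A", "B", "C"]  # hypothetical product label columns


def combine_labels(row):
    """Encode the per-product booleans of one example as a single class string."""
    return "".join("1" if row[p] else "0" for p in PRODUCTS)


def split_labels(combined):
    """Decode a combined class string back into one boolean per product."""
    return {p: bit == "1" for p, bit in zip(PRODUCTS, combined)}


# Every possible combined class value: 2 ** len(PRODUCTS) of them.
all_classes = ["".join(bits) for bits in product("01", repeat=len(PRODUCTS))]

example = {"A": True, "B": False, "C": True}
combined = combine_labels(example)
print(combined)                # 101
print(split_labels(combined))  # {'A': True, 'B': False, 'C': True}
print(len(all_classes))        # 8
```

Note that the number of combined classes doubles with every additional product, so with many products some combinations may have very few training examples.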

Appreciate feedback on how it is best to implement this problem.

Thank you!

rodienne_zammit · March 2019

Thanks a lot @mschmitz for putting me on the right track. I looked into Polynominal by Binominal classification but I didn't manage to get what I want with it.

There might be other ways of doing this, but...

I got the desired approach by looping over the product attributes using "Loop Attributes", which exposes the name of the current attribute as a macro. Inside the loop I set the field %{loop_attribute} as the label and saved the model with the product name in the file name of the output, for example "C:\Documents\%{loop_attribute}.mod". I also used the "Annotate" operator on the performance and model outputs so that I can refer to the annotation in the results and know which product each performance relates to.

To read and apply the models on new data, I looped over the product attributes again. Inside the loop I set the role of the current product to label and read the model from file using the macro value (%{loop_attribute} in the sample below). Again, applying Annotate to the performance output helps me recognise which performance I am looking at.

XML sample for reading below:

<?xml version="1.0" encoding="UTF-8"?><process version="9.2.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.2.000" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="9.2.000" expanded="true" height="68" name="Retrieve Products" width="90" x="112" y="34">
        <parameter key="repository_entry" value="//Samples/data/Products"/>
      </operator>
      <operator activated="true" class="concurrency:loop_attributes" compatibility="9.2.000" expanded="true" height="103" name="Loop Attributes" width="90" x="313" y="34">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attribute" value=""/>
        <parameter key="attributes" value="Product ID"/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="attribute_value"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="time"/>
        <parameter key="block_type" value="attribute_block"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="value_matrix_row_start"/>
        <parameter key="invert_selection" value="false"/>
        <parameter key="include_special_attributes" value="false"/>
        <parameter key="attribute_name_macro" value="loop_attribute"/>
        <parameter key="reuse_results" value="false"/>
        <parameter key="enable_parallel_execution" value="true"/>
        <process expanded="true">
          <operator activated="true" class="set_role" compatibility="9.2.000" expanded="true" height="82" name="Set Role" width="90" x="45" y="34">
            <parameter key="attribute_name" value="%{loop_attribute}"/>
            <parameter key="target_role" value="label"/>
            <list key="set_additional_roles"/>
          </operator>
          <operator activated="true" class="legacy:read_model" compatibility="9.2.000" expanded="true" height="68" name="Read Model" width="90" x="45" y="136">
            <parameter key="model_file" value="%{loop_attribute}_NewFeatures.mod"/>
          </operator>
          <operator activated="true" class="apply_model" compatibility="9.2.000" expanded="true" height="82" name="Apply Model (2)" width="90" x="179" y="85">
            <list key="application_parameters"/>
            <parameter key="create_view" value="false"/>
          </operator>
          <operator activated="true" class="annotate" compatibility="9.2.000" expanded="true" height="68" name="Annotate" width="90" x="313" y="85">
            <list key="annotations">
              <parameter key="Product" value="%{loop_attribute}"/>
            </list>
            <parameter key="duplicate_annotations" value="overwrite"/>
          </operator>
          <operator activated="true" class="performance_binominal_classification" compatibility="9.2.000" expanded="true" height="82" name="Performance (Test Set)" width="90" x="447" y="85">
            <parameter key="main_criterion" value="first"/>
            <parameter key="accuracy" value="true"/>
            <parameter key="classification_error" value="false"/>
            <parameter key="kappa" value="false"/>
            <parameter key="AUC (optimistic)" value="false"/>
            <parameter key="AUC" value="false"/>
            <parameter key="AUC (pessimistic)" value="false"/>
            <parameter key="precision" value="false"/>
            <parameter key="recall" value="false"/>
            <parameter key="lift" value="false"/>
            <parameter key="fallout" value="false"/>
            <parameter key="f_measure" value="false"/>
            <parameter key="false_positive" value="false"/>
            <parameter key="false_negative" value="false"/>
            <parameter key="true_positive" value="false"/>
            <parameter key="true_negative" value="false"/>
            <parameter key="sensitivity" value="false"/>
            <parameter key="specificity" value="false"/>
            <parameter key="youden" value="false"/>
            <parameter key="positive_predictive_value" value="false"/>
            <parameter key="negative_predictive_value" value="false"/>
            <parameter key="psep" value="false"/>
            <parameter key="skip_undefined_labels" value="true"/>
            <parameter key="use_example_weights" value="true"/>
          </operator>
          <operator activated="true" class="annotate" compatibility="9.2.000" expanded="true" height="68" name="Annotate (2)" width="90" x="581" y="34">
            <list key="annotations">
              <parameter key="Product" value="%{loop_attribute}"/>
            </list>
            <parameter key="duplicate_annotations" value="overwrite"/>
          </operator>
          <connect from_port="input 1" to_op="Set Role" to_port="example set input"/>
          <connect from_op="Set Role" from_port="example set output" to_op="Apply Model (2)" to_port="unlabelled data"/>
          <connect from_op="Read Model" from_port="output" to_op="Apply Model (2)" to_port="model"/>
          <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Annotate" to_port="input"/>
          <connect from_op="Annotate" from_port="output" to_op="Performance (Test Set)" to_port="labelled data"/>
          <connect from_op="Performance (Test Set)" from_port="performance" to_op="Annotate (2)" to_port="input"/>
          <connect from_op="Performance (Test Set)" from_port="example set" to_port="output 2"/>
          <connect from_op="Annotate (2)" from_port="output" to_port="output 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_output 1" spacing="0"/>
          <portSpacing port="sink_output 2" spacing="0"/>
          <portSpacing port="sink_output 3" spacing="0"/>
        </process>
        <description align="center" color="transparent" colored="false" width="126">I looped on attribute because all my products were in a different column</description>
      </operator>
      <connect from_op="Retrieve Products" from_port="output" to_op="Loop Attributes" to_port="input 1"/>
      <connect from_op="Loop Attributes" from_port="output 1" to_port="result 1"/>
      <connect from_op="Loop Attributes" from_port="output 2" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
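
For readers who want to see the same pattern outside RapidMiner, here is a rough Python sketch of the two loops: train and save one model per product label, then read each model back by name and apply it to new data. Everything in it is an assumption made for illustration: the feature name `spend`, the tiny threshold "stump" standing in for the decision tree, and `pickle` files standing in for the saved .mod files. It is not a translation of the process above.

```python
import pickle
import tempfile
from pathlib import Path

LABELS = ["A", "B", "C"]  # hypothetical product label columns
FEATURE = "spend"         # hypothetical numeric feature


def train_stump(rows, label):
    """Pick the threshold on FEATURE that classifies most rows correctly.

    Stand-in for a real learner: it only predicts True when FEATURE >= threshold.
    """
    best_correct, best_threshold = -1, None
    for t in sorted({r[FEATURE] for r in rows}):
        correct = sum((r[FEATURE] >= t) == r[label] for r in rows)
        if correct > best_correct:
            best_correct, best_threshold = correct, t
    return {"label": label, "threshold": best_threshold}


def predict(model, row):
    return row[FEATURE] >= model["threshold"]


def train_and_save(rows, model_dir):
    # One model per label, with the label name in the file name.
    for label in LABELS:
        with open(Path(model_dir) / f"{label}.mod", "wb") as f:
            pickle.dump(train_stump(rows, label), f)


def load_and_apply(rows, model_dir):
    # Read each saved model by label name and add one prediction column per label.
    scored = [dict(r) for r in rows]
    for label in LABELS:
        with open(Path(model_dir) / f"{label}.mod", "rb") as f:
            model = pickle.load(f)
        for r in scored:
            r[f"prediction({label})"] = predict(model, r)
    return scored


train = [
    {"spend": 10, "A": False, "B": False, "C": True},
    {"spend": 20, "A": False, "B": True, "C": True},
    {"spend": 30, "A": True, "B": True, "C": True},
]

with tempfile.TemporaryDirectory() as model_dir:
    train_and_save(train, model_dir)
    scored = load_and_apply([{"spend": 25}], model_dir)

print(scored)
```

The result is a single table with one prediction column per product, which is the "all products in one go" output asked for in the original question, even though there are still separate models behind it.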

MartinLiebig · March 2019

Hi,

Did you have a look at Polynominal by Binominal Classification? Otherwise, you can build something with Loop Values.

BR,

Martin

sgenzer · March 2019

@rodienne_zammit

SGolbert · March 2019

Hi @rodienne_zammit

I'm happy that you could find an answer. I have some questions about the use case: Why do you need individual classifiers for each product? Is it possible for a sample to be a member of more than one product category?

Regards,

Sebastian
