Multiple Labels for Binary Classification Problems in one model

rodienne_zammit Member Posts: 3 Contributor I
edited February 2019 in Help
Hello,

@sgenzer, I read them to the end!

Current approach: Separate models for different binary labels
I have prepared a decision tree model which correctly predicts a binary label for product A when product A is used as the label.

For product B, I re-run the process to train a similar model with B as the label.

Then I would need to train another model to predict product C, using C as the label. This goes on, because in reality I have more products.

Desired approach: One model to predict different binary labels
Is there a way to combine this into one model, so that the model gives me the binary predictions (true/false) for products A, B and C in one go? This would be ideal when applying the model to new data, because I would not need to run each product model separately.

I tried to use "loop label", but this loops over the labels to create a separate model for each, and I did not find a way to apply the resulting models to new data. I could not find a way to loop over the labels on new data and apply the models in the same manner (a "loop model" operator doesn't exist).

Maybe I could achieve this by combining the different binary classification values into one value? 
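
For illustration, here is a minimal Python sketch of what combining the binary values into one value could look like (this is plain Python, not a RapidMiner process). The column names A, B and C and the helper functions are assumptions made up for this example. Each combination of true/false flags becomes one class value, and the flags can be recovered from that value after prediction.

```python
PRODUCTS = ["A", "B", "C"]  # hypothetical product label columns

def combine_labels(row, products=PRODUCTS):
    """Join the per-product true/false flags into one class value, e.g. 'A+C'."""
    bought = [p for p in products if row[p]]
    return "+".join(bought) if bought else "none"

def split_labels(combined, products=PRODUCTS):
    """Recover the per-product true/false flags from a combined class value."""
    bought = set() if combined == "none" else set(combined.split("+"))
    return {p: p in bought for p in products}

rows = [
    {"A": True, "B": False, "C": True},
    {"A": False, "B": False, "C": False},
    {"A": True, "B": True, "C": True},
]

combined = [combine_labels(r) for r in rows]
print(combined)                # ['A+C', 'none', 'A+B+C']
print(split_labels("A+C"))     # {'A': True, 'B': False, 'C': True}

# The number of possible class values doubles with every extra product.
print(2 ** len(PRODUCTS))      # 8
```

The catch with this encoding is that the number of possible classes grows as 2 to the power of the number of products, so with many products most combinations would have few or no training examples.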

I would appreciate feedback on how best to approach this problem.

Thank you!

Best Answer

  • rodienne_zammit Member Posts: 3 Contributor I
    edited March 2019 Solution Accepted
    Thanks a lot @mschmitz for putting me on the right track. I looked into Polynominal by Binominal Classification, but I didn't manage to get what I wanted with it.

    There might be other ways of doing this, but this is what worked for me.

    I got the desired result by looping over the product attributes with "Loop Attributes", which exposes the current attribute name as a macro. Inside the loop I set %{loop_attribute} as the label, trained the model, and saved it with the product name in the file name, for example "C:\Documents\%{loop_attribute}.mod". I also applied the "Annotate" operator to the performance and model outputs, so that the annotation in the results tells me which product each performance relates to.

    To read the models and apply them to new data, I again used "Loop Attributes", set the role of the product attribute inside the loop, and read the model from the file using the macro value %{loop_attribute}. Again, applying Annotate to the performance output helps me recognise which performance I am looking at.
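
The same pattern (one model per label, saved under the label's name, then read back by name when scoring) can be sketched outside RapidMiner in plain Python. This is only an illustration under stated assumptions: the feature name "spend", the label columns A, B and C, the toy data, the JSON file format and the single-threshold "stump" standing in for a decision tree are all made up for the example and are not part of the process described above.

```python
import json
import tempfile
from pathlib import Path

LABELS = ["A", "B", "C"]  # hypothetical product label columns
FEATURE = "spend"         # hypothetical single numeric feature

def fit_stump(xs, ys):
    """Stand-in for a decision tree: pick the most accurate single threshold."""
    best = None
    for threshold in sorted(set(xs)):
        for above in (True, False):
            preds = [(x >= threshold) == above for x in xs]
            accuracy = sum(p == y for p, y in zip(preds, ys)) / len(ys)
            if best is None or accuracy > best[0]:
                best = (accuracy, threshold, above)
    return {"threshold": best[1], "above": best[2]}

def predict_stump(model, xs):
    """Predict True when x lies on the model's positive side of the threshold."""
    return [(x >= model["threshold"]) == model["above"] for x in xs]

train = [
    {"spend": 10, "A": False, "B": True, "C": False},
    {"spend": 20, "A": False, "B": True, "C": False},
    {"spend": 30, "A": True, "B": False, "C": False},
    {"spend": 40, "A": True, "B": False, "C": True},
]

model_dir = Path(tempfile.mkdtemp())  # temporary folder for the saved models

# Training loop: one model per label, saved under the label's name.
for label in LABELS:
    model = fit_stump([r[FEATURE] for r in train], [r[label] for r in train])
    (model_dir / f"{label}.json").write_text(json.dumps(model))

# Scoring loop: read each model by label name and add one prediction column.
new_data = [{FEATURE: 15}, {FEATURE: 45}]
for label in LABELS:
    model = json.loads((model_dir / f"{label}.json").read_text())
    preds = predict_stump(model, [r[FEATURE] for r in new_data])
    for row, pred in zip(new_data, preds):
        row[f"prediction({label})"] = pred

print(new_data)
```

After the scoring loop, every row of the new data carries one prediction column per product, which is the "all products in one go" view asked for in the question, even though there is still one model per product underneath.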

    XML sample for reading below:

    <?xml version="1.0" encoding="UTF-8"?><process version="9.2.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.2.000" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="9.2.000" expanded="true" height="68" name="Retrieve Products" width="90" x="112" y="34">
            <parameter key="repository_entry" value="//Samples/data/Products"/>
          </operator>
          <operator activated="true" class="concurrency:loop_attributes" compatibility="9.2.000" expanded="true" height="103" name="Loop Attributes" width="90" x="313" y="34">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attribute" value=""/>
            <parameter key="attributes" value="Product ID"/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="attribute_value"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="time"/>
            <parameter key="block_type" value="attribute_block"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_matrix_row_start"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
            <parameter key="attribute_name_macro" value="loop_attribute"/>
            <parameter key="reuse_results" value="false"/>
            <parameter key="enable_parallel_execution" value="true"/>
            <process expanded="true">
              <operator activated="true" class="set_role" compatibility="9.2.000" expanded="true" height="82" name="Set Role" width="90" x="45" y="34">
                <parameter key="attribute_name" value="%{loop_attribute}"/>
                <parameter key="target_role" value="label"/>
                <list key="set_additional_roles"/>
              </operator>
              <operator activated="true" class="legacy:read_model" compatibility="9.2.000" expanded="true" height="68" name="Read Model" width="90" x="45" y="136">
                <parameter key="model_file" value="%{loop_attribute}_NewFeatures.mod"/>
              </operator>
              <operator activated="true" class="apply_model" compatibility="9.2.000" expanded="true" height="82" name="Apply Model (2)" width="90" x="179" y="85">
                <list key="application_parameters"/>
                <parameter key="create_view" value="false"/>
              </operator>
              <operator activated="true" class="annotate" compatibility="9.2.000" expanded="true" height="68" name="Annotate" width="90" x="313" y="85">
                <list key="annotations">
                  <parameter key="Product" value="%{loop_attribute}"/>
                </list>
                <parameter key="duplicate_annotations" value="overwrite"/>
              </operator>
              <operator activated="true" class="performance_binominal_classification" compatibility="9.2.000" expanded="true" height="82" name="Performance (Test Set)" width="90" x="447" y="85">
                <parameter key="main_criterion" value="first"/>
                <parameter key="accuracy" value="true"/>
                <parameter key="classification_error" value="false"/>
                <parameter key="kappa" value="false"/>
                <parameter key="AUC (optimistic)" value="false"/>
                <parameter key="AUC" value="false"/>
                <parameter key="AUC (pessimistic)" value="false"/>
                <parameter key="precision" value="false"/>
                <parameter key="recall" value="false"/>
                <parameter key="lift" value="false"/>
                <parameter key="fallout" value="false"/>
                <parameter key="f_measure" value="false"/>
                <parameter key="false_positive" value="false"/>
                <parameter key="false_negative" value="false"/>
                <parameter key="true_positive" value="false"/>
                <parameter key="true_negative" value="false"/>
                <parameter key="sensitivity" value="false"/>
                <parameter key="specificity" value="false"/>
                <parameter key="youden" value="false"/>
                <parameter key="positive_predictive_value" value="false"/>
                <parameter key="negative_predictive_value" value="false"/>
                <parameter key="psep" value="false"/>
                <parameter key="skip_undefined_labels" value="true"/>
                <parameter key="use_example_weights" value="true"/>
              </operator>
              <operator activated="true" class="annotate" compatibility="9.2.000" expanded="true" height="68" name="Annotate (2)" width="90" x="581" y="34">
                <list key="annotations">
                  <parameter key="Product" value="%{loop_attribute}"/>
                </list>
                <parameter key="duplicate_annotations" value="overwrite"/>
              </operator>
              <connect from_port="input 1" to_op="Set Role" to_port="example set input"/>
              <connect from_op="Set Role" from_port="example set output" to_op="Apply Model (2)" to_port="unlabelled data"/>
              <connect from_op="Read Model" from_port="output" to_op="Apply Model (2)" to_port="model"/>
              <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Annotate" to_port="input"/>
              <connect from_op="Annotate" from_port="output" to_op="Performance (Test Set)" to_port="labelled data"/>
              <connect from_op="Performance (Test Set)" from_port="performance" to_op="Annotate (2)" to_port="input"/>
              <connect from_op="Performance (Test Set)" from_port="example set" to_port="output 2"/>
              <connect from_op="Annotate (2)" from_port="output" to_port="output 1"/>
              <portSpacing port="source_input 1" spacing="0"/>
              <portSpacing port="source_input 2" spacing="0"/>
              <portSpacing port="sink_output 1" spacing="0"/>
              <portSpacing port="sink_output 2" spacing="0"/>
              <portSpacing port="sink_output 3" spacing="0"/>
            </process>
            <description align="center" color="transparent" colored="false" width="126">I looped on attribute because all my products were in a different column</description>
          </operator>
          <connect from_op="Retrieve Products" from_port="output" to_op="Loop Attributes" to_port="input 1"/>
          <connect from_op="Loop Attributes" from_port="output 1" to_port="result 1"/>
          <connect from_op="Loop Attributes" from_port="output 2" to_port="result 2"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
        </process>
      </operator>
    </process>

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist
    Hi,
    did you have a look at Polynominal by Binominal Classification? Otherwise you can build something with Loop Values.

    BR,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager
  • SGolbert RapidMiner Certified Analyst, Member Posts: 344 Unicorn

    I'm happy that you could find an answer. I have some questions about the use case: Why do you need individual classifiers for each product? Is it possible for a sample to be a member of more than one product category?

    Regards,
    Sebastian
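
The question of whether a sample can belong to more than one product category can be checked directly on the data. A minimal Python sketch, assuming hypothetical true/false columns A, B and C and made-up rows: count how many product labels are true in each row. If no row has more than one, the products are mutually exclusive and a single multi-class label could be enough. If some rows have several, separate binary labels (or a combined encoding) are needed.

```python
LABELS = ["A", "B", "C"]  # hypothetical product label columns

rows = [
    {"A": True, "B": False, "C": False},
    {"A": True, "B": True, "C": False},
    {"A": False, "B": False, "C": False},
]

def labels_per_row(rows, labels=LABELS):
    """Return, for each row, how many product labels are true."""
    return [sum(bool(row[label]) for label in labels) for row in rows]

counts = labels_per_row(rows)
multi_label = any(count > 1 for count in counts)

print(counts)       # [1, 2, 0]
print(multi_label)  # True
```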
