Model that combines multiple binary classes into a single class

woxepe Member Posts: 1 Newbie
I have data that is classified into 4 classes. For example, let's say they're the following:

ab
a
b
nonab

As you can see, these four classes are just the Cartesian product of the binary classes of a/non-a and b/non-b.
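To make the decomposition concrete, here is a minimal Python sketch of the label mapping in both directions. The yes/no values mirror the "Map a" and "Map b" operators in the process file below; the names `SPLIT`, `COMBINE`, `split_label`, and `combine_labels` are made up for this illustration and are not part of RapidMiner.

```python
# Each combined class corresponds to one (is_a, is_b) pair.
# The yes/no values follow the "Map a" and "Map b" value mappings.
SPLIT = {
    "ab":    ("yes", "yes"),
    "a":     ("yes", "no"),
    "b":     ("no",  "yes"),
    "nonab": ("no",  "no"),
}

# Inverse lookup: (is_a, is_b) pair back to the combined class.
COMBINE = {pair: label for label, pair in SPLIT.items()}


def split_label(label):
    """Return the (is_a, is_b) pair for one of the four combined classes."""
    return SPLIT[label]


def combine_labels(is_a, is_b):
    """Return the combined class for a pair of binary answers."""
    return COMBINE[(is_a, is_b)]


if __name__ == "__main__":
    for label in SPLIT:
        is_a, is_b = split_label(label)
        print(label, "->", (is_a, is_b), "->", combine_labels(is_a, is_b))
```

Because the mapping is one-to-one, the recombination step is a plain lookup and needs no training of its own.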

What I want to do is use a pair of SVMs to predict a and b separately and then recombine the results into the classes above. I have figured out how to relabel the data and train an SVM for each binary label, but I can't figure out how to recombine the results into a single model.

I looked at applying both models to the input data set, joining the predictions, and then feeding them to a Decision Tree trained on a simple 4-row data set that maps each combination to one of the classes above. However, I couldn't see how to turn the result back into an actual model once I had applied it, which is required to make use of Cross Validation.
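Outside RapidMiner, the idea of "one model made of two classifiers plus a fixed combining step" can be sketched as a small wrapper object. This is only an illustration in Python, not a RapidMiner operator: `CombinedABModel` and `ThresholdStub` are hypothetical names, and the sketch assumes the two inner learners expose `fit(X, y)` and `predict(X)` methods. The stub stands in for a real SVM so the example runs with the standard library alone.

```python
class CombinedABModel:
    """Wraps two binary learners behind a single fit/predict interface."""

    # Combined class -> (is_a, is_b), matching the "Map a" / "Map b" mappings.
    SPLIT = {
        "ab":    ("yes", "yes"),
        "a":     ("yes", "no"),
        "b":     ("no",  "yes"),
        "nonab": ("no",  "no"),
    }

    def __init__(self, learner_a, learner_b):
        self.learner_a = learner_a
        self.learner_b = learner_b
        # Deterministic combining step: a lookup from the pair to the class.
        self._combine = {pair: label for label, pair in self.SPLIT.items()}

    def fit(self, X, y):
        # Relabel the 4-class target into two binary targets, then train both.
        self.learner_a.fit(X, [self.SPLIT[label][0] for label in y])
        self.learner_b.fit(X, [self.SPLIT[label][1] for label in y])
        return self

    def predict(self, X):
        pred_a = self.learner_a.predict(X)
        pred_b = self.learner_b.predict(X)
        return [self._combine[(pa, pb)] for pa, pb in zip(pred_a, pred_b)]


class ThresholdStub:
    """Hypothetical stand-in for an SVM: "yes" when one feature is positive."""

    def __init__(self, feature_index):
        self.feature_index = feature_index

    def fit(self, X, y):
        return self  # nothing to learn in this stub

    def predict(self, X):
        return ["yes" if row[self.feature_index] > 0 else "no" for row in X]


if __name__ == "__main__":
    X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
    y = ["ab", "a", "b", "nonab"]
    model = CombinedABModel(ThresholdStub(0), ThresholdStub(1)).fit(X, y)
    print(model.predict(X))
```

Since the wrapper trains both learners inside `fit` and only ever sees the original 4-class label, any cross-validation loop can treat it as a single model: relabelling, training, and recombination all happen inside each fold.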

Example process file:

<?xml version="1.0" encoding="UTF-8"?><process version="9.4.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.4.001" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="UTF-8"/>
    <process expanded="true">
      <operator activated="true" class="utility:create_exampleset" compatibility="9.4.001" expanded="true" height="68" name="Create ExampleSet" width="90" x="45" y="34">
        <parameter key="generator_type" value="attribute functions"/>
        <parameter key="number_of_examples" value="1000"/>
        <parameter key="use_stepsize" value="false"/>
        <list key="function_descriptions">
          <parameter key="class" value="if(rand() &lt; 0.25, &quot;ab&quot;, if(rand() &lt; 1/3, &quot;nonab&quot;, if(rand() &lt; 0.5, &quot;a&quot;, &quot;b&quot;)))"/>
          <parameter key="f1" value="rand()"/>
          <parameter key="f2" value="2*rand() - 1"/>
          <parameter key="f3" value="rand()"/>
          <parameter key="f4" value="rand()"/>
        </list>
        <parameter key="add_id_attribute" value="true"/>
        <list key="numeric_series_configuration"/>
        <list key="date_series_configuration"/>
        <list key="date_series_configuration (interval)"/>
        <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/>
        <parameter key="time_zone" value="SYSTEM"/>
        <parameter key="column_separator" value=","/>
        <parameter key="parse_all_as_nominal" value="false"/>
        <parameter key="decimal_point_character" value="."/>
        <parameter key="trim_attribute_names" value="true"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="9.4.001" expanded="true" height="82" name="Set Role" width="90" x="179" y="34">
        <parameter key="attribute_name" value="id"/>
        <parameter key="target_role" value="id"/>
        <list key="set_additional_roles">
          <parameter key="class" value="label"/>
        </list>
      </operator>
      <operator activated="true" class="multiply" compatibility="9.4.001" expanded="true" height="103" name="Multiply" width="90" x="313" y="34"/>
      <operator activated="true" class="map" compatibility="9.4.001" expanded="true" height="82" name="Map b" width="90" x="447" y="238">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="class"/>
        <parameter key="attributes" value=""/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="attribute_value"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="time"/>
        <parameter key="block_type" value="attribute_block"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="value_matrix_row_start"/>
        <parameter key="invert_selection" value="false"/>
        <parameter key="include_special_attributes" value="true"/>
        <list key="value_mappings">
          <parameter key="ab" value="yes"/>
          <parameter key="a" value="no"/>
          <parameter key="b" value="yes"/>
          <parameter key="nonab" value="no"/>
        </list>
        <parameter key="consider_regular_expressions" value="false"/>
        <parameter key="add_default_mapping" value="false"/>
      </operator>
      <operator activated="true" class="support_vector_machine" compatibility="9.4.001" expanded="true" height="124" name="SVM b" width="90" x="648" y="238">
        <parameter key="kernel_type" value="dot"/>
        <parameter key="kernel_gamma" value="1.0"/>
        <parameter key="kernel_sigma1" value="1.0"/>
        <parameter key="kernel_sigma2" value="0.0"/>
        <parameter key="kernel_sigma3" value="2.0"/>
        <parameter key="kernel_shift" value="1.0"/>
        <parameter key="kernel_degree" value="2.0"/>
        <parameter key="kernel_a" value="1.0"/>
        <parameter key="kernel_b" value="0.0"/>
        <parameter key="kernel_cache" value="200"/>
        <parameter key="C" value="0.0"/>
        <parameter key="convergence_epsilon" value="0.001"/>
        <parameter key="max_iterations" value="100000"/>
        <parameter key="scale" value="true"/>
        <parameter key="calculate_weights" value="true"/>
        <parameter key="return_optimization_performance" value="true"/>
        <parameter key="L_pos" value="1.0"/>
        <parameter key="L_neg" value="1.0"/>
        <parameter key="epsilon" value="0.0"/>
        <parameter key="epsilon_plus" value="0.0"/>
        <parameter key="epsilon_minus" value="0.0"/>
        <parameter key="balance_cost" value="false"/>
        <parameter key="quadratic_loss_pos" value="false"/>
        <parameter key="quadratic_loss_neg" value="false"/>
        <parameter key="estimate_performance" value="false"/>
      </operator>
      <operator activated="true" class="map" compatibility="9.4.001" expanded="true" height="82" name="Map a" width="90" x="447" y="34">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="class"/>
        <parameter key="attributes" value=""/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="attribute_value"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="time"/>
        <parameter key="block_type" value="attribute_block"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="value_matrix_row_start"/>
        <parameter key="invert_selection" value="false"/>
        <parameter key="include_special_attributes" value="true"/>
        <list key="value_mappings">
          <parameter key="ab" value="yes"/>
          <parameter key="a" value="yes"/>
          <parameter key="b" value="no"/>
          <parameter key="nonab" value="no"/>
        </list>
        <parameter key="consider_regular_expressions" value="false"/>
        <parameter key="add_default_mapping" value="false"/>
      </operator>
      <operator activated="true" class="support_vector_machine" compatibility="9.4.001" expanded="true" height="124" name="SVM a" width="90" x="648" y="34">
        <parameter key="kernel_type" value="dot"/>
        <parameter key="kernel_gamma" value="1.0"/>
        <parameter key="kernel_sigma1" value="1.0"/>
        <parameter key="kernel_sigma2" value="0.0"/>
        <parameter key="kernel_sigma3" value="2.0"/>
        <parameter key="kernel_shift" value="1.0"/>
        <parameter key="kernel_degree" value="2.0"/>
        <parameter key="kernel_a" value="1.0"/>
        <parameter key="kernel_b" value="0.0"/>
        <parameter key="kernel_cache" value="200"/>
        <parameter key="C" value="0.0"/>
        <parameter key="convergence_epsilon" value="0.001"/>
        <parameter key="max_iterations" value="100000"/>
        <parameter key="scale" value="true"/>
        <parameter key="calculate_weights" value="true"/>
        <parameter key="return_optimization_performance" value="true"/>
        <parameter key="L_pos" value="1.0"/>
        <parameter key="L_neg" value="1.0"/>
        <parameter key="epsilon" value="0.0"/>
        <parameter key="epsilon_plus" value="0.0"/>
        <parameter key="epsilon_minus" value="0.0"/>
        <parameter key="balance_cost" value="false"/>
        <parameter key="quadratic_loss_pos" value="false"/>
        <parameter key="quadratic_loss_neg" value="false"/>
        <parameter key="estimate_performance" value="false"/>
      </operator>
      <connect from_op="Create ExampleSet" from_port="output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Multiply" to_port="input"/>
      <connect from_op="Multiply" from_port="output 1" to_op="Map a" to_port="example set input"/>
      <connect from_op="Multiply" from_port="output 2" to_op="Map b" to_port="example set input"/>
      <connect from_op="Map b" from_port="example set output" to_op="SVM b" to_port="training set"/>
      <connect from_op="SVM b" from_port="model" to_port="result 3"/>
      <connect from_op="SVM b" from_port="exampleSet" to_port="result 4"/>
      <connect from_op="Map a" from_port="example set output" to_op="SVM a" to_port="training set"/>
      <connect from_op="SVM a" from_port="model" to_port="result 1"/>
      <connect from_op="SVM a" from_port="exampleSet" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
      <portSpacing port="sink_result 4" spacing="0"/>
      <portSpacing port="sink_result 5" spacing="0"/>
    </process>
  </operator>
</process>

How can I recombine these two classifiers back into a single model with a deterministic combining step?

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,510 RM Data Scientist
    Hi,
    I think one way of doing this is to use Hierarchical Classification and define a and b as subclasses of ab.

    BR,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany