Random KNN (RKNN) Model

faisal63455 Member Posts: 2 Newbie
edited April 4 in Help
Hi All!

I am a newbie here and would appreciate it if some of you experts could help me with these two questions:

1) Can I create a random KNN (RKNN) model in RapidMiner? Here is the title of an interesting paper on how this model works: "Random KNN feature selection - a fast and stable alternative to Random Forests" (I cannot post links in my posts yet, but if you paste this title into Google you should find the paper). I believe the model is essentially a Random Forest with KNN predictors. If so, can I build this model in RapidMiner?

2) Is there a way or an operator to build a KD Tree model?


Appreciate your support!

Thanks,
Faisal


Best Answer

  • yyhuang Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 305  RM Data Scientist
    edited April 5 Solution Accepted
    Hi @faisal63455,

    1) We do not have an operator for random KNN, but you can easily build one and reuse it as a custom operator. The implementation combines random selection of attributes with an ensemble of models. As the base learner you can use the simple version of deep learning from H2O or any other neural network framework. My sample process is below.
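Outside RapidMiner, the idea of "random attribute subsets plus an ensemble" can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions, not the method from the paper and not what the operators do internally: the function names, the toy data, and the choice of 4 base models with 2 of 4 attributes each are all made up for the example (the seed 1992 simply mirrors the local_random_seed value in the process). Unlike the sample process, which uses Deep Learning as the base learner, this sketch uses plain k-NN as the base learner, as asked in the question.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k, features):
    """Classify x with plain k-NN, looking only at the given feature indices."""
    # Squared Euclidean distance restricted to the chosen feature subset.
    dists = [
        (sum((row[f] - x[f]) ** 2 for f in features), label)
        for row, label in zip(train_X, train_y)
    ]
    dists.sort(key=lambda d: d[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def fit_random_knn(n_features, n_models, subset_size, seed=1992):
    """'Training' only draws one random feature subset per base k-NN model."""
    rng = random.Random(seed)
    return [rng.sample(range(n_features), subset_size) for _ in range(n_models)]


def random_knn_predict(subsets, train_X, train_y, x, k=3):
    """Majority vote over the base models, one vote per feature subset."""
    votes = Counter(knn_predict(train_X, train_y, x, k, s) for s in subsets)
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration: class "a" sits near 0, class "b" near 10.
train_X = [
    [0.0, 1.0, 0.5, 0.2],
    [1.0, 0.0, 0.3, 0.9],
    [0.5, 0.5, 1.0, 0.0],
    [9.0, 10.0, 9.5, 9.1],
    [10.0, 9.0, 9.2, 10.0],
    [9.5, 9.5, 10.0, 9.4],
]
train_y = ["a", "a", "a", "b", "b", "b"]

# Four base models, each seeing 2 of the 4 attributes (hypothetical settings).
subsets = fit_random_knn(n_features=4, n_models=4, subset_size=2)

print(random_knn_predict(subsets, train_X, train_y, [0.4, 0.6, 0.2, 0.5]))
print(random_knn_predict(subsets, train_X, train_y, [9.6, 9.4, 9.8, 9.7]))
```

In the process XML that follows, Select by Random plays the role of drawing a random attribute subset for each branch, and Vote plays the role of the majority vote over the base models.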
    <?xml version="1.0" encoding="UTF-8"?><process version="9.9.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.4.000" expanded="true" name="Process" origin="GENERATED_TUTORIAL">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="9.9.000" expanded="true" height="68" name="Sonar" origin="GENERATED_TUTORIAL" width="90" x="45" y="34">
            <parameter key="repository_entry" value="//Samples/data/Sonar"/>
          </operator>
          <operator activated="true" class="multiply" compatibility="9.9.000" expanded="true" height="103" name="Multiply" width="90" x="179" y="34"/>
          <operator activated="true" class="split_validation" compatibility="9.9.000" expanded="true" height="124" name="Validation" origin="GENERATED_TUTORIAL" width="90" x="313" y="34">
            <parameter key="create_complete_model" value="false"/>
            <parameter key="split" value="relative"/>
            <parameter key="split_ratio" value="0.7"/>
            <parameter key="training_set_size" value="100"/>
            <parameter key="test_set_size" value="-1"/>
            <parameter key="sampling_type" value="automatic"/>
            <parameter key="use_local_random_seed" value="false"/>
            <parameter key="local_random_seed" value="1992"/>
            <process expanded="true">
              <operator activated="true" class="vote" compatibility="9.9.000" expanded="true" height="68" name="Vote" origin="GENERATED_TUTORIAL" width="90" x="112" y="34">
                <process expanded="true">
                  <operator activated="true" class="select_by_random" compatibility="9.9.000" expanded="true" height="82" name="Select by Random" width="90" x="112" y="34">
                    <parameter key="use_fixed_number_of_attributes" value="true"/>
                    <parameter key="number_of_attributes" value="35"/>
                    <parameter key="use_local_random_seed" value="false"/>
                    <parameter key="local_random_seed" value="1992"/>
                  </operator>
                  <operator activated="true" class="h2o:deep_learning" compatibility="9.9.000" expanded="true" height="103" name="Deep Learning" width="90" x="246" y="34">
                    <parameter key="activation" value="Rectifier"/>
                    <enumeration key="hidden_layer_sizes">
                      <parameter key="hidden_layer_sizes" value="50"/>
                      <parameter key="hidden_layer_sizes" value="50"/>
                    </enumeration>
                    <enumeration key="hidden_dropout_ratios"/>
                    <parameter key="reproducible_(uses_1_thread)" value="false"/>
                    <parameter key="use_local_random_seed" value="false"/>
                    <parameter key="local_random_seed" value="1992"/>
                    <parameter key="epochs" value="10.0"/>
                    <parameter key="compute_variable_importances" value="false"/>
                    <parameter key="train_samples_per_iteration" value="-2"/>
                    <parameter key="adaptive_rate" value="true"/>
                    <parameter key="epsilon" value="1.0E-8"/>
                    <parameter key="rho" value="0.99"/>
                    <parameter key="learning_rate" value="0.005"/>
                    <parameter key="learning_rate_annealing" value="1.0E-6"/>
                    <parameter key="learning_rate_decay" value="1.0"/>
                    <parameter key="momentum_start" value="0.0"/>
                    <parameter key="momentum_ramp" value="1000000.0"/>
                    <parameter key="momentum_stable" value="0.0"/>
                    <parameter key="nesterov_accelerated_gradient" value="true"/>
                    <parameter key="standardize" value="true"/>
                    <parameter key="L1" value="1.0E-5"/>
                    <parameter key="L2" value="0.0"/>
                    <parameter key="max_w2" value="10.0"/>
                    <parameter key="loss_function" value="Automatic"/>
                    <parameter key="distribution_function" value="AUTO"/>
                    <parameter key="early_stopping" value="false"/>
                    <parameter key="stopping_rounds" value="1"/>
                    <parameter key="stopping_metric" value="AUTO"/>
                    <parameter key="stopping_tolerance" value="0.001"/>
                    <parameter key="missing_values_handling" value="MeanImputation"/>
                    <parameter key="max_runtime_seconds" value="0"/>
                    <list key="expert_parameters"/>
                    <list key="expert_parameters_"/>
                  </operator>
                  <operator activated="true" class="select_by_random" compatibility="9.9.000" expanded="true" height="82" name="Select by Random (2)" width="90" x="112" y="187">
                    <parameter key="use_fixed_number_of_attributes" value="true"/>
                    <parameter key="number_of_attributes" value="35"/>
                    <parameter key="use_local_random_seed" value="false"/>
                    <parameter key="local_random_seed" value="1992"/>
                  </operator>
                  <operator activated="true" class="h2o:deep_learning" compatibility="9.9.000" expanded="true" height="103" name="Deep Learning (2)" width="90" x="246" y="187">
                    <parameter key="activation" value="Rectifier"/>
                    <enumeration key="hidden_layer_sizes">
                      <parameter key="hidden_layer_sizes" value="50"/>
                      <parameter key="hidden_layer_sizes" value="50"/>
                    </enumeration>
                    <enumeration key="hidden_dropout_ratios"/>
                    <parameter key="reproducible_(uses_1_thread)" value="false"/>
                    <parameter key="use_local_random_seed" value="false"/>
                    <parameter key="local_random_seed" value="1992"/>
                    <parameter key="epochs" value="10.0"/>
                    <parameter key="compute_variable_importances" value="false"/>
                    <parameter key="train_samples_per_iteration" value="-2"/>
                    <parameter key="adaptive_rate" value="true"/>
                    <parameter key="epsilon" value="1.0E-8"/>
                    <parameter key="rho" value="0.99"/>
                    <parameter key="learning_rate" value="0.005"/>
                    <parameter key="learning_rate_annealing" value="1.0E-6"/>
                    <parameter key="learning_rate_decay" value="1.0"/>
                    <parameter key="momentum_start" value="0.0"/>
                    <parameter key="momentum_ramp" value="1000000.0"/>
                    <parameter key="momentum_stable" value="0.0"/>
                    <parameter key="nesterov_accelerated_gradient" value="true"/>
                    <parameter key="standardize" value="true"/>
                    <parameter key="L1" value="1.0E-5"/>
                    <parameter key="L2" value="0.0"/>
                    <parameter key="max_w2" value="10.0"/>
                    <parameter key="loss_function" value="Automatic"/>
                    <parameter key="distribution_function" value="AUTO"/>
                    <parameter key="early_stopping" value="false"/>
                    <parameter key="stopping_rounds" value="1"/>
                    <parameter key="stopping_metric" value="AUTO"/>
                    <parameter key="stopping_tolerance" value="0.001"/>
                    <parameter key="missing_values_handling" value="MeanImputation"/>
                    <parameter key="max_runtime_seconds" value="0"/>
                    <list key="expert_parameters"/>
                    <list key="expert_parameters_"/>
                  </operator>
                  <operator activated="true" class="select_by_random" compatibility="9.9.000" expanded="true" height="82" name="Select by Random (3)" width="90" x="112" y="340">
                    <parameter key="use_fixed_number_of_attributes" value="true"/>
                    <parameter key="number_of_attributes" value="35"/>
                    <parameter key="use_local_random_seed" value="false"/>
                    <parameter key="local_random_seed" value="1992"/>
                  </operator>
                  <operator activated="true" class="h2o:deep_learning" compatibility="9.9.000" expanded="true" height="103" name="Deep Learning (3)" width="90" x="246" y="340">
                    <parameter key="activation" value="Rectifier"/>
                    <enumeration key="hidden_layer_sizes">
                      <parameter key="hidden_layer_sizes" value="50"/>
                      <parameter key="hidden_layer_sizes" value="50"/>
                    </enumeration>
                    <enumeration key="hidden_dropout_ratios"/>
                    <parameter key="reproducible_(uses_1_thread)" value="false"/>
                    <parameter key="use_local_random_seed" value="false"/>
                    <parameter key="local_random_seed" value="1992"/>
                    <parameter key="epochs" value="10.0"/>
                    <parameter key="compute_variable_importances" value="false"/>
                    <parameter key="train_samples_per_iteration" value="-2"/>
                    <parameter key="adaptive_rate" value="true"/>
                    <parameter key="epsilon" value="1.0E-8"/>
                    <parameter key="rho" value="0.99"/>
                    <parameter key="learning_rate" value="0.005"/>
                    <parameter key="learning_rate_annealing" value="1.0E-6"/>
                    <parameter key="learning_rate_decay" value="1.0"/>
                    <parameter key="momentum_start" value="0.0"/>
                    <parameter key="momentum_ramp" value="1000000.0"/>
                    <parameter key="momentum_stable" value="0.0"/>
                    <parameter key="nesterov_accelerated_gradient" value="true"/>
                    <parameter key="standardize" value="true"/>
                    <parameter key="L1" value="1.0E-5"/>
                    <parameter key="L2" value="0.0"/>
                    <parameter key="max_w2" value="10.0"/>
                    <parameter key="loss_function" value="Automatic"/>
                    <parameter key="distribution_function" value="AUTO"/>
                    <parameter key="early_stopping" value="false"/>
                    <parameter key="stopping_rounds" value="1"/>
                    <parameter key="stopping_metric" value="AUTO"/>
                    <parameter key="stopping_tolerance" value="0.001"/>
                    <parameter key="missing_values_handling" value="MeanImputation"/>
                    <parameter key="max_runtime_seconds" value="0"/>
                    <list key="expert_parameters"/>
                    <list key="expert_parameters_"/>
                  </operator>
                  <operator activated="true" class="select_by_random" compatibility="9.9.000" expanded="true" height="82" name="Select by Random (4)" width="90" x="112" y="493">
                    <parameter key="use_fixed_number_of_attributes" value="true"/>
                    <parameter key="number_of_attributes" value="35"/>
                    <parameter key="use_local_random_seed" value="false"/>
                    <parameter key="local_random_seed" value="1992"/>
                  </operator>
                  <operator activated="true" class="h2o:deep_learning" compatibility="9.9.000" expanded="true" height="103" name="Deep Learning (4)" width="90" x="246" y="493">
                    <parameter key="activation" value="Rectifier"/>
                    <enumeration key="hidden_layer_sizes">
                      <parameter key="hidden_layer_sizes" value="50"/>
                      <parameter key="hidden_layer_sizes" value="50"/>
                    </enumeration>
                    <enumeration key="hidden_dropout_ratios"/>
                    <parameter key="reproducible_(uses_1_thread)" value="false"/>
                    <parameter key="use_local_random_seed" value="false"/>
                    <parameter key="local_random_seed" value="1992"/>
                    <parameter key="epochs" value="10.0"/>
                    <parameter key="compute_variable_importances" value="false"/>
                    <parameter key="train_samples_per_iteration" value="-2"/>
                    <parameter key="adaptive_rate" value="true"/>
                    <parameter key="epsilon" value="1.0E-8"/>
                    <parameter key="rho" value="0.99"/>
                    <parameter key="learning_rate" value="0.005"/>
                    <parameter key="learning_rate_annealing" value="1.0E-6"/>
                    <parameter key="learning_rate_decay" value="1.0"/>
                    <parameter key="momentum_start" value="0.0"/>
                    <parameter key="momentum_ramp" value="1000000.0"/>
                    <parameter key="momentum_stable" value="0.0"/>
                    <parameter key="nesterov_accelerated_gradient" value="true"/>
                    <parameter key="standardize" value="true"/>
                    <parameter key="L1" value="1.0E-5"/>
                    <parameter key="L2" value="0.0"/>
                    <parameter key="max_w2" value="10.0"/>
                    <parameter key="loss_function" value="Automatic"/>
                    <parameter key="distribution_function" value="AUTO"/>
                    <parameter key="early_stopping" value="false"/>
                    <parameter key="stopping_rounds" value="1"/>
                    <parameter key="stopping_metric" value="AUTO"/>
                    <parameter key="stopping_tolerance" value="0.001"/>
                    <parameter key="missing_values_handling" value="MeanImputation"/>
                    <parameter key="max_runtime_seconds" value="0"/>
                    <list key="expert_parameters"/>
                    <list key="expert_parameters_"/>
                  </operator>
                  <connect from_port="training set 1" to_op="Select by Random" to_port="example set input"/>
                  <connect from_port="training set 2" to_op="Select by Random (2)" to_port="example set input"/>
                  <connect from_port="training set 3" to_op="Select by Random (3)" to_port="example set input"/>
                  <connect from_port="training set 4" to_op="Select by Random (4)" to_port="example set input"/>
                  <connect from_op="Select by Random" from_port="example set output" to_op="Deep Learning" to_port="training set"/>
                  <connect from_op="Deep Learning" from_port="model" to_port="base model 1"/>
                  <connect from_op="Select by Random (2)" from_port="example set output" to_op="Deep Learning (2)" to_port="training set"/>
                  <connect from_op="Deep Learning (2)" from_port="model" to_port="base model 2"/>
                  <connect from_op="Select by Random (3)" from_port="example set output" to_op="Deep Learning (3)" to_port="training set"/>
                  <connect from_op="Deep Learning (3)" from_port="model" to_port="base model 3"/>
                  <connect from_op="Select by Random (4)" from_port="example set output" to_op="Deep Learning (4)" to_port="training set"/>
                  <connect from_op="Deep Learning (4)" from_port="model" to_port="base model 4"/>
                  <portSpacing port="source_training set 1" spacing="0"/>
                  <portSpacing port="source_training set 2" spacing="72"/>
                  <portSpacing port="source_training set 3" spacing="0"/>
                  <portSpacing port="source_training set 4" spacing="0"/>
                  <portSpacing port="source_training set 5" spacing="0"/>
                  <portSpacing port="sink_base model 1" spacing="72"/>
                  <portSpacing port="sink_base model 2" spacing="72"/>
                  <portSpacing port="sink_base model 3" spacing="0"/>
                  <portSpacing port="sink_base model 4" spacing="0"/>
                  <portSpacing port="sink_base model 5" spacing="0"/>
                </process>
              </operator>
              <connect from_port="training" to_op="Vote" to_port="training set"/>
              <connect from_op="Vote" from_port="model" to_port="model"/>
              <portSpacing port="source_training" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true">
              <operator activated="true" class="apply_model" compatibility="9.9.000" expanded="true" height="82" name="Apply Model" origin="GENERATED_TUTORIAL" width="90" x="45" y="34">
                <list key="application_parameters"/>
                <parameter key="create_view" value="false"/>
              </operator>
              <operator activated="true" class="performance" compatibility="9.9.000" expanded="true" height="82" name="Performance" origin="GENERATED_TUTORIAL" width="90" x="179" y="34">
                <parameter key="use_example_weights" value="true"/>
              </operator>
              <connect from_port="model" to_op="Apply Model" to_port="model"/>
              <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
              <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
              <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_averagable 1" spacing="0"/>
              <portSpacing port="sink_averagable 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="split_validation" compatibility="9.9.000" expanded="true" height="124" name="Validation (2)" origin="GENERATED_TUTORIAL" width="90" x="313" y="289">
            <parameter key="create_complete_model" value="false"/>
            <parameter key="split" value="relative"/>
            <parameter key="split_ratio" value="0.7"/>
            <parameter key="training_set_size" value="100"/>
            <parameter key="test_set_size" value="-1"/>
            <parameter key="sampling_type" value="automatic"/>
            <parameter key="use_local_random_seed" value="false"/>
            <parameter key="local_random_seed" value="1992"/>
            <process expanded="true">
              <operator activated="true" class="stacking" compatibility="9.9.000" expanded="true" height="68" name="Stacking" width="90" x="112" y="34">
                <parameter key="keep_all_attributes" value="true"/>
                <parameter key="keep_confidences" value="false"/>
                <process expanded="true">
                  <operator activated="true" class="select_by_random" compatibility="9.9.000" expanded="true" height="82" name="Select by Random (5)" width="90" x="112" y="34">
                    <parameter key="use_fixed_number_of_attributes" value="true"/>
                    <parameter key="number_of_attributes" value="35"/>
                    <parameter key="use_local_random_seed" value="false"/>
                    <parameter key="local_random_seed" value="1992"/>
                  </operator>
                  <operator activated="true" class="h2o:deep_learning" compatibility="9.9.000" expanded="true" height="103" name="Deep Learning (5)" width="90" x="246" y="34">
                    <parameter key="activation" value="Rectifier"/>
                    <enumeration key="hidden_layer_sizes">
                      <parameter key="hidden_layer_sizes" value="50"/>
                      <parameter key="hidden_layer_sizes" value="50"/>
                    </enumeration>
                    <enumeration key="hidden_dropout_ratios"/>
                    <parameter key="reproducible_(uses_1_thread)" value="false"/>
                    <parameter key="use_local_random_seed" value="false"/>
                    <parameter key="local_random_seed" value="1992"/>
                    <parameter key="epochs" value="10.0"/>
                    <parameter key="compute_variable_importances" value="false"/>
                    <parameter key="train_samples_per_iteration" value="-2"/>
                    <parameter key="adaptive_rate" value="true"/>
                    <parameter key="epsilon" value="1.0E-8"/>
                    <parameter key="rho" value="0.99"/>
                    <parameter key="learning_rate" value="0.005"/>
                    <parameter key="learning_rate_annealing" value="1.0E-6"/>
                    <parameter key="learning_rate_decay" value="1.0"/>
                    <parameter key="momentum_start" value="0.0"/>
                    <parameter key="momentum_ramp" value="1000000.0"/>
                    <parameter key="momentum_stable" value="0.0"/>
                    <parameter key="nesterov_accelerated_gradient" value="true"/>
                    <parameter key="standardize" value="true"/>
                    <parameter key="L1" value="1.0E-5"/>
                    <parameter key="L2" value="0.0"/>
                    <parameter key="max_w2" value="10.0"/>
                    <parameter key="loss_function" value="Automatic"/>
                    <parameter key="distribution_function" value="AUTO"/>
                    <parameter key="early_stopping" value="false"/>
                    <parameter key="stopping_rounds" value="1"/>
                    <parameter key="stopping_metric" value="AUTO"/>
                    <parameter key="stopping_tolerance" value="0.001"/>
                    <parameter key="missing_values_handling" value="MeanImputation"/>
                    <parameter key="max_runtime_seconds" value="0"/>
                    <list key="expert_parameters"/>
                    <list key="expert_parameters_"/>
                  </operator>
                  <operator activated="true" class="select_by_random" compatibility="9.9.000" expanded="true" height="82" name="Select by Random (6)" width="90" x="112" y="187">
                    <parameter key="use_fixed_number_of_attributes" value="true"/>
                    <parameter key="number_of_attributes" value="35"/>
                    <parameter key="use_local_random_seed" value="false"/>
                    <parameter key="local_random_seed" value="1992"/>
                  </operator>
                  <operator activated="true" class="h2o:deep_learning" compatibility="9.9.000" expanded="true" height="103" name="Deep Learning (6)" width="90" x="246" y="187">
                    <parameter key="activation" value="Rectifier"/>
                    <enumeration key="hidden_layer_sizes">
                      <parameter key="hidden_layer_sizes" value="50"/>
                      <parameter key="hidden_layer_sizes" value="50"/>
                    </enumeration>
                    <enumeration key="hidden_dropout_ratios"/>
                    <parameter key="reproducible_(uses_1_thread)" value="false"/>
                    <parameter key="use_local_random_seed" value="false"/>
                    <parameter key="local_random_seed" value="1992"/>
                    <parameter key="epochs" value="10.0"/>
                    <parameter key="compute_variable_importances" value="false"/>
                    <parameter key="train_samples_per_iteration" value="-2"/>
                    <parameter key="adaptive_rate" value="true"/>
                    <parameter key="epsilon" value="1.0E-8"/>
                    <parameter key="rho" value="0.99"/>
                    <parameter key="learning_rate" value="0.005"/>
                    <parameter key="learning_rate_annealing" value="1.0E-6"/>
                    <parameter key="learning_rate_decay" value="1.0"/>
                    <parameter key="momentum_start" value="0.0"/>
                    <parameter key="momentum_ramp" value="1000000.0"/>
                    <parameter key="momentum_stable" value="0.0"/>
                    <parameter key="nesterov_accelerated_gradient" value="true"/>
                    <parameter key="standardize" value="true"/>
                    <parameter key="L1" value="1.0E-5"/>
                    <parameter key="L2" value="0.0"/>
                    <parameter key="max_w2" value="10.0"/>
                    <parameter key="loss_function" value="Automatic"/>
                    <parameter key="distribution_function" value="AUTO"/>
                    <parameter key="early_stopping" value="false"/>
                    <parameter key="stopping_rounds" value="1"/>
                    <parameter key="stopping_metric" value="AUTO"/>
                    <parameter key="stopping_tolerance" value="0.001"/>
                    <parameter key="missing_values_handling" value="MeanImputation"/>
                    <parameter key="max_runtime_seconds" value="0"/>
                    <list key="expert_parameters"/>
                    <list key="expert_parameters_"/>
                  </operator>
                  <operator activated="true" class="select_by_random" compatibility="9.9.000" expanded="true" height="82" name="Select by Random (7)" width="90" x="112" y="340">
                    <parameter key="use_fixed_number_of_attributes" value="true"/>
                    <parameter key="number_of_attributes" value="35"/>
                    <parameter key="use_local_random_seed" value="false"/>
                    <parameter key="local_random_seed" value="1992"/>
                  </operator>
                  <operator activated="true" class="h2o:deep_learning" compatibility="9.9.000" expanded="true" height="103" name="Deep Learning (7)" width="90" x="246" y="340">
                    <parameter key="activation" value="Rectifier"/>
                    <enumeration key="hidden_layer_sizes">
                      <parameter key="hidden_layer_sizes" value="50"/>
                      <parameter key="hidden_layer_sizes" value="50"/>
                    </enumeration>
                    <enumeration key="hidden_dropout_ratios"/>
                    <parameter key="reproducible_(uses_1_thread)" value="false"/>
                    <parameter key="use_local_random_seed" value="false"/>
                    <parameter key="local_random_seed" value="1992"/>
                    <parameter key="epochs" value="10.0"/>
                    <parameter key="compute_variable_importances" value="false"/>
                    <parameter key="train_samples_per_iteration" value="-2"/>
                    <parameter key="adaptive_rate" value="true"/>
                    <parameter key="epsilon" value="1.0E-8"/>
                    <parameter key="rho" value="0.99"/>
                    <parameter key="learning_rate" value="0.005"/>
                    <parameter key="learning_rate_annealing" value="1.0E-6"/>
                    <parameter key="learning_rate_decay" value="1.0"/>
                    <parameter key="momentum_start" value="0.0"/>
                    <parameter key="momentum_ramp" value="1000000.0"/>
                    <parameter key="momentum_stable" value="0.0"/>
                    <parameter key="nesterov_accelerated_gradient" value="true"/>
                    <parameter key="standardize" value="true"/>
                    <parameter key="L1" value="1.0E-5"/>
                    <parameter key="L2" value="0.0"/>
                    <parameter key="max_w2" value="10.0"/>
                    <parameter key="loss_function" value="Automatic"/>
                    <parameter key="distribution_function" value="AUTO"/>
                    <parameter key="early_stopping" value="false"/>
                    <parameter key="stopping_rounds" value="1"/>
                    <parameter key="stopping_metric" value="AUTO"/>
                    <parameter key="stopping_tolerance" value="0.001"/>
                    <parameter key="missing_values_handling" value="MeanImputation"/>
                    <parameter key="max_runtime_seconds" value="0"/>
                    <list key="expert_parameters"/>
                    <list key="expert_parameters_"/>
                  </operator>
                  <operator activated="true" class="select_by_random" compatibility="9.9.000" expanded="true" height="82" name="Select by Random (8)" width="90" x="112" y="493">
                    <parameter key="use_fixed_number_of_attributes" value="true"/>
                    <parameter key="number_of_attributes" value="35"/>
                    <parameter key="use_local_random_seed" value="false"/>
                    <parameter key="local_random_seed" value="1992"/>
                  </operator>
                  <operator activated="true" class="h2o:deep_learning" compatibility="9.9.000" expanded="true" height="103" name="Deep Learning (8)" width="90" x="246" y="493">
                    <parameter key="activation" value="Rectifier"/>
                    <enumeration key="hidden_layer_sizes">
                      <parameter key="hidden_layer_sizes" value="50"/>
                      <parameter key="hidden_layer_sizes" value="50"/>
                    </enumeration>
                    <enumeration key="hidden_dropout_ratios"/>
                    <parameter key="reproducible_(uses_1_thread)" value="false"/>
                    <parameter key="use_local_random_seed" value="false"/>
                    <parameter key="local_random_seed" value="1992"/>
                    <parameter key="epochs" value="10.0"/>
                    <parameter key="compute_variable_importances" value="false"/>
                    <parameter key="train_samples_per_iteration" value="-2"/>
                    <parameter key="adaptive_rate" value="true"/>
                    <parameter key="epsilon" value="1.0E-8"/>
                    <parameter key="rho" value="0.99"/>
                    <parameter key="learning_rate" value="0.005"/>
                    <parameter key="learning_rate_annealing" value="1.0E-6"/>
                    <parameter key="learning_rate_decay" value="1.0"/>
                    <parameter key="momentum_start" value="0.0"/>
                    <parameter key="momentum_ramp" value="1000000.0"/>
                    <parameter key="momentum_stable" value="0.0"/>
                    <parameter key="nesterov_accelerated_gradient" value="true"/>
                    <parameter key="standardize" value="true"/>
                    <parameter key="L1" value="1.0E-5"/>
                    <parameter key="L2" value="0.0"/>
                    <parameter key="max_w2" value="10.0"/>
                    <parameter key="loss_function" value="Automatic"/>
                    <parameter key="distribution_function" value="AUTO"/>
                    <parameter key="early_stopping" value="false"/>
                    <parameter key="stopping_rounds" value="1"/>
                    <parameter key="stopping_metric" value="AUTO"/>
                    <parameter key="stopping_tolerance" value="0.001"/>
                    <parameter key="missing_values_handling" value="MeanImputation"/>
                    <parameter key="max_runtime_seconds" value="0"/>
                    <list key="expert_parameters"/>
                    <list key="expert_parameters_"/>
                  </operator>
                  <connect from_port="training set 1" to_op="Select by Random (5)" to_port="example set input"/>
                  <connect from_port="training set 2" to_op="Select by Random (6)" to_port="example set input"/>
                  <connect from_port="training set 3" to_op="Select by Random (7)" to_port="example set input"/>
                  <connect from_port="training set 4" to_op="Select by Random (8)" to_port="example set input"/>
                  <connect from_op="Select by Random (5)" from_port="example set output" to_op="Deep Learning (5)" to_port="training set"/>
                  <connect from_op="Deep Learning (5)" from_port="model" to_port="base model 1"/>
                  <connect from_op="Select by Random (6)" from_port="example set output" to_op="Deep Learning (6)" to_port="training set"/>
                  <connect from_op="Deep Learning (6)" from_port="model" to_port="base model 2"/>
                  <connect from_op="Select by Random (7)" from_port="example set output" to_op="Deep Learning (7)" to_port="training set"/>
                  <connect from_op="Deep Learning (7)" from_port="model" to_port="base model 3"/>
                  <connect from_op="Select by Random (8)" from_port="example set output" to_op="Deep Learning (8)" to_port="training set"/>
                  <connect from_op="Deep Learning (8)" from_port="model" to_port="base model 4"/>
                  <portSpacing port="source_training set 1" spacing="0"/>
                  <portSpacing port="source_training set 2" spacing="0"/>
                  <portSpacing port="source_training set 3" spacing="0"/>
                  <portSpacing port="source_training set 4" spacing="0"/>
                  <portSpacing port="source_training set 5" spacing="0"/>
                  <portSpacing port="sink_base model 1" spacing="0"/>
                  <portSpacing port="sink_base model 2" spacing="0"/>
                  <portSpacing port="sink_base model 3" spacing="0"/>
                  <portSpacing port="sink_base model 4" spacing="0"/>
                  <portSpacing port="sink_base model 5" spacing="0"/>
                </process>
                <process expanded="true">
                  <operator activated="true" class="h2o:generalized_linear_model" compatibility="9.9.000" expanded="true" height="124" name="Generalized Linear Model" width="90" x="179" y="34">
                    <parameter key="family" value="AUTO"/>
                    <parameter key="link" value="family_default"/>
                    <parameter key="solver" value="AUTO"/>
                    <parameter key="reproducible" value="false"/>
                    <parameter key="maximum_number_of_threads" value="4"/>
                    <parameter key="use_regularization" value="true"/>
                    <parameter key="lambda_search" value="false"/>
                    <parameter key="number_of_lambdas" value="0"/>
                    <parameter key="lambda_min_ratio" value="0.0"/>
                    <parameter key="early_stopping" value="true"/>
                    <parameter key="stopping_rounds" value="3"/>
                    <parameter key="stopping_tolerance" value="0.001"/>
                    <parameter key="standardize" value="true"/>
                    <parameter key="non-negative_coefficients" value="false"/>
                    <parameter key="add_intercept" value="true"/>
                    <parameter key="compute_p-values" value="false"/>
                    <parameter key="remove_collinear_columns" value="false"/>
                    <parameter key="missing_values_handling" value="MeanImputation"/>
                    <parameter key="max_iterations" value="0"/>
                    <parameter key="specify_beta_constraints" value="false"/>
                    <list key="beta_constraints"/>
                    <parameter key="max_runtime_seconds" value="0"/>
                    <list key="expert_parameters"/>
                  </operator>
                  <connect from_port="stacking examples" to_op="Generalized Linear Model" to_port="training set"/>
                  <connect from_op="Generalized Linear Model" from_port="model" to_port="stacking model"/>
                  <portSpacing port="source_stacking examples" spacing="0"/>
                  <portSpacing port="sink_stacking model" spacing="0"/>
                </process>
              </operator>
              <connect from_port="training" to_op="Stacking" to_port="training set"/>
              <connect from_op="Stacking" from_port="model" to_port="model"/>
              <portSpacing port="source_training" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true">
              <operator activated="true" class="apply_model" compatibility="9.9.000" expanded="true" height="82" name="Apply Model (2)" origin="GENERATED_TUTORIAL" width="90" x="45" y="34">
                <list key="application_parameters"/>
                <parameter key="create_view" value="false"/>
              </operator>
              <operator activated="true" class="performance" compatibility="9.9.000" expanded="true" height="82" name="Performance (2)" origin="GENERATED_TUTORIAL" width="90" x="179" y="34">
                <parameter key="use_example_weights" value="true"/>
              </operator>
              <connect from_port="model" to_op="Apply Model (2)" to_port="model"/>
              <connect from_port="test set" to_op="Apply Model (2)" to_port="unlabelled data"/>
              <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Performance (2)" to_port="labelled data"/>
              <connect from_op="Performance (2)" from_port="performance" to_port="averagable 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_averagable 1" spacing="0"/>
              <portSpacing port="sink_averagable 2" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Sonar" from_port="output" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Validation" to_port="training"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Validation (2)" to_port="training"/>
          <connect from_op="Validation" from_port="model" to_port="result 1"/>
          <connect from_op="Validation" from_port="averagable 1" to_port="result 2"/>
          <connect from_op="Validation (2)" from_port="model" to_port="result 3"/>
          <connect from_op="Validation (2)" from_port="averagable 1" to_port="result 4"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="42"/>
          <portSpacing port="sink_result 3" spacing="66"/>
          <portSpacing port="sink_result 4" spacing="0"/>
          <portSpacing port="sink_result 5" spacing="0"/>
        </process>
      </operator>
    </process>
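For readers who prefer code, the idea behind the process above (each base learner is trained on its own random subset of attributes, then the base learners are combined) can be sketched in plain Python. This is only an illustration under stated assumptions, not a translation of the process: it uses k-NN base learners in place of the Deep Learning operators, a simple majority vote in place of the stacked Generalized Linear Model, and the function names `fit_random_knn` and `predict_random_knn` are hypothetical.

```python
import random
from collections import Counter
from math import dist


def project(row, columns):
    """Keep only the selected attribute columns of one example."""
    return tuple(row[c] for c in columns)


def knn_predict(train_X, train_y, x, k):
    """Return the majority label among the k training rows closest to x."""
    order = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], x))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]


def fit_random_knn(X, y, n_models=4, n_attributes=2, seed=1992):
    """Draw one random attribute subset per base learner.

    k-NN has no real training step, so each base learner simply stores
    its column subset and the training data projected onto it.
    """
    rng = random.Random(seed)
    n_cols = len(X[0])
    ensemble = []
    for _ in range(n_models):
        cols = sorted(rng.sample(range(n_cols), n_attributes))
        ensemble.append((cols, [project(row, cols) for row in X]))
    return ensemble, list(y)


def predict_random_knn(model, x, k=3):
    """Let every base learner vote and return the most common label."""
    ensemble, y = model
    votes = [
        knn_predict(proj_X, y, project(x, cols), k)
        for cols, proj_X in ensemble
    ]
    return Counter(votes).most_common(1)[0][0]


# Toy data (made up for illustration): two well separated groups.
X = [(0, 0, 0, 0), (1, 1, 1, 1), (0, 1, 0, 1),
     (10, 10, 10, 10), (11, 11, 11, 11), (10, 11, 10, 11)]
y = ["a", "a", "a", "b", "b", "b"]

model = fit_random_knn(X, y)
label_low = predict_random_knn(model, (0.5, 0.5, 0.5, 0.5))
label_high = predict_random_knn(model, (10.5, 10.5, 10.5, 10.5))
```

Majority voting keeps the sketch short; replacing the vote with a learner trained on the base predictions would be closer to the Stacking operator used in the process.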



    2) A k-d tree (k-dimensional tree) is currently not supported as a code-free option, but you can integrate Python snippets into a RapidMiner workflow, for example with the Execute Python operator from the Python Scripting extension.
    Credit to https://en.wikipedia.org/wiki/K-d_tree
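As a rough illustration of the kind of snippet meant here, the following minimal sketch builds a k-d tree by splitting on the median along a cycling axis, the construction described in that Wikipedia article, and adds a nearest-neighbor lookup. The names `Node`, `build_kdtree` and `nearest` are chosen for this example only, and wiring the code into a RapidMiner process is left out.

```python
from collections import namedtuple
from math import dist, inf

# One tree node: the stored point, the axis it splits on, and two subtrees.
Node = namedtuple("Node", "point axis left right")


def build_kdtree(points, depth=0):
    """Recursively build a k-d tree, splitting on the median of a cycling axis."""
    if not points:
        return None
    k = len(points[0])          # dimensionality, assumed equal for all points
    axis = depth % k            # cycle through the axes level by level
    points = sorted(points, key=lambda p: p[axis])
    median = len(points) // 2
    return Node(
        point=points[median],
        axis=axis,
        left=build_kdtree(points[:median], depth + 1),
        right=build_kdtree(points[median + 1:], depth + 1),
    )


def nearest(node, target, best=None, best_dist=inf):
    """Return (point, distance) for the stored point closest to target."""
    if node is None:
        return best, best_dist
    d = dist(node.point, target)
    if d < best_dist:
        best, best_dist = node.point, d
    # Descend first into the side of the splitting plane that holds the target.
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best, best_dist = nearest(near, target, best, best_dist)
    # Search the other side only if the plane is closer than the best match so far.
    if abs(diff) < best_dist:
        best, best_dist = nearest(far, target, best, best_dist)
    return best, best_dist


points = [(7, 2), (5, 4), (9, 6), (4, 7), (8, 1), (2, 3)]
tree = build_kdtree(points)
closest_point, closest_distance = nearest(tree, (9, 2))
```

Sorting at every level keeps the code short at the cost of build time; for large example sets a library implementation is likely the better choice.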




    HTH!

    YY

