"min_rows error; Gradient Boosted Tree"

mansour_ebrahim Member Posts: 22 Contributor II
edited June 2019 in Help
Dear All
I am trying to run a Gradient Boosted Trees operator on my data, using an attribute with 3 categories as the label. It keeps giving me the following error:
The 'min_rows' parameter must be smaller than or equal to the (size of the input data) *2.
I have tried different min_rows values from 0 to 100, but none of them worked.
Appreciate your helpful thoughts.
Regards.
Mansour

Answers

  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    Hello @mansour_ebrahim

    Can you post your XML code (View --> Show Panel --> XML) here, along with your data if possible? We can then check it.

    Thanks and regards,
    Varun
    https://www.varunmandalapu.com/

    Be Safe. Follow precautions and Maintain Social Distancing

  • mansour_ebrahim Member Posts: 22 Contributor II
    Hi Varun
    Here is the XML code:
    <?xml version="1.0" encoding="UTF-8"?><process version="9.2.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.2.000" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="9.2.000" expanded="true" height="68" name="Retrieve Hot original dataset" width="90" x="45" y="34">
            <parameter key="repository_entry" value="//MEQ/lamb_2019-03-08/uncalibrated/unsmoothed/unscaled/raw/median/hot/Hot original dataset"/>
          </operator>
          <operator activated="true" class="select_attributes" compatibility="9.2.000" expanded="true" height="82" name="Select Attributes" width="90" x="246" y="34">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attribute" value=""/>
            <parameter key="attributes" value="2 categories|ABATTOIR|carcass_weight|dam_breed|DATE|grade_market|imf|integration_time|ph|seasonality|sex|shear_force|sire_breed|UniqueID"/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="attribute_value"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="time"/>
            <parameter key="block_type" value="attribute_block"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_matrix_row_start"/>
            <parameter key="invert_selection" value="true"/>
            <parameter key="include_special_attributes" value="false"/>
          </operator>
          <operator activated="true" class="set_role" compatibility="9.2.000" expanded="true" height="82" name="Set Role" width="90" x="447" y="34">
            <parameter key="attribute_name" value="3 categories"/>
            <parameter key="target_role" value="label"/>
            <list key="set_additional_roles"/>
          </operator>
          <operator activated="true" class="concurrency:cross_validation" compatibility="8.2.000" expanded="true" height="166" name="Gradient Boosted Trees (2)" width="90" x="648" y="34">
            <parameter key="split_on_batch_attribute" value="false"/>
            <parameter key="leave_one_out" value="false"/>
            <parameter key="number_of_folds" value="10"/>
            <parameter key="sampling_type" value="automatic"/>
            <parameter key="use_local_random_seed" value="false"/>
            <parameter key="local_random_seed" value="1992"/>
            <parameter key="enable_parallel_execution" value="true"/>
            <process expanded="true">
              <operator activated="true" class="h2o:gradient_boosted_trees" compatibility="9.2.000" expanded="true" height="103" name="Gradient Boosted Trees" width="90" x="179" y="34">
                <parameter key="number_of_trees" value="100"/>
                <parameter key="reproducible" value="false"/>
                <parameter key="maximum_number_of_threads" value="4"/>
                <parameter key="use_local_random_seed" value="false"/>
                <parameter key="local_random_seed" value="1992"/>
                <parameter key="maximal_depth" value="10"/>
                <parameter key="min_rows" value="1.0"/>
                <parameter key="min_split_improvement" value="0.0"/>
                <parameter key="number_of_bins" value="20"/>
                <parameter key="learning_rate" value="0.01"/>
                <parameter key="sample_rate" value="1.0"/>
                <parameter key="distribution" value="AUTO"/>
                <parameter key="early_stopping" value="false"/>
                <parameter key="stopping_rounds" value="1"/>
                <parameter key="stopping_metric" value="AUTO"/>
                <parameter key="stopping_tolerance" value="0.001"/>
                <parameter key="max_runtime_seconds" value="0"/>
                <list key="expert_parameters"/>
              </operator>
              <connect from_port="training set" to_op="Gradient Boosted Trees" to_port="training set"/>
              <connect from_op="Gradient Boosted Trees" from_port="model" to_port="model"/>
              <portSpacing port="source_training set" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true">
              <operator activated="true" class="apply_model" compatibility="7.1.001" expanded="true" height="82" name="Apply Model (7)" width="90" x="45" y="30">
                <list key="application_parameters"/>
                <parameter key="create_view" value="false"/>
              </operator>
              <operator activated="true" class="multiply" compatibility="9.2.000" expanded="true" height="103" name="Multiply" width="90" x="45" y="136"/>
              <operator activated="true" class="performance" compatibility="9.2.000" expanded="true" height="82" name="Performance (7)" width="90" x="313" y="34">
                <parameter key="use_example_weights" value="true"/>
              </operator>
              <operator activated="true" class="performance_classification" compatibility="9.2.000" expanded="true" height="82" name="Performance" width="90" x="246" y="187">
                <parameter key="main_criterion" value="first"/>
                <parameter key="accuracy" value="true"/>
                <parameter key="classification_error" value="true"/>
                <parameter key="kappa" value="true"/>
                <parameter key="weighted_mean_recall" value="true"/>
                <parameter key="weighted_mean_precision" value="true"/>
                <parameter key="spearman_rho" value="true"/>
                <parameter key="kendall_tau" value="true"/>
                <parameter key="absolute_error" value="true"/>
                <parameter key="relative_error" value="true"/>
                <parameter key="relative_error_lenient" value="true"/>
                <parameter key="relative_error_strict" value="false"/>
                <parameter key="normalized_absolute_error" value="true"/>
                <parameter key="root_mean_squared_error" value="true"/>
                <parameter key="root_relative_squared_error" value="true"/>
                <parameter key="squared_error" value="true"/>
                <parameter key="correlation" value="true"/>
                <parameter key="squared_correlation" value="true"/>
                <parameter key="cross-entropy" value="true"/>
                <parameter key="margin" value="true"/>
                <parameter key="soft_margin_loss" value="true"/>
                <parameter key="logistic_loss" value="true"/>
                <parameter key="skip_undefined_labels" value="true"/>
                <parameter key="use_example_weights" value="true"/>
                <list key="class_weights"/>
              </operator>
              <connect from_port="model" to_op="Apply Model (7)" to_port="model"/>
              <connect from_port="test set" to_op="Apply Model (7)" to_port="unlabelled data"/>
              <connect from_op="Apply Model (7)" from_port="labelled data" to_op="Multiply" to_port="input"/>
              <connect from_op="Multiply" from_port="output 1" to_op="Performance (7)" to_port="labelled data"/>
              <connect from_op="Multiply" from_port="output 2" to_op="Performance" to_port="labelled data"/>
              <connect from_op="Performance (7)" from_port="performance" to_port="performance 1"/>
              <connect from_op="Performance" from_port="performance" to_port="performance 2"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_test set results" spacing="0"/>
              <portSpacing port="sink_performance 1" spacing="0"/>
              <portSpacing port="sink_performance 2" spacing="0"/>
              <portSpacing port="sink_performance 3" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Retrieve Hot original dataset" from_port="output" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="Set Role" to_port="example set input"/>
          <connect from_op="Set Role" from_port="example set output" to_op="Gradient Boosted Trees (2)" to_port="example set"/>
          <connect from_op="Gradient Boosted Trees (2)" from_port="model" to_port="result 1"/>
          <connect from_op="Gradient Boosted Trees (2)" from_port="performance 1" to_port="result 2"/>
          <connect from_op="Gradient Boosted Trees (2)" from_port="performance 2" to_port="result 3"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
          <portSpacing port="sink_result 4" spacing="0"/>
        </process>
      </operator>
    </process>

  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    Hello @mansour_ebrahim

    I tried to recreate the error by running your process on the Iris dataset, but it worked fine. Can you share your data here or in a private message? If we can reproduce the error, we can work out what is causing it.
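
    In the meantime, the arithmetic in the error message can be checked offline. The Python sketch below is only an illustration, not RapidMiner or H2O code. The helper names are made up, the bound is copied literally from the message quoted in the question (min_rows <= rows * 2), and it assumes plain k-fold splitting where each training set is the full data minus one fold. The real check inside the operator may be worded or computed differently.

```python
import math


def training_rows_per_fold(total_rows: int, folds: int) -> int:
    """Smallest training-set size in plain k-fold cross validation.

    Assumption: each fold's training set is all rows minus one test fold,
    and the largest test fold holds ceil(total_rows / folds) rows.
    """
    largest_test_fold = math.ceil(total_rows / folds)
    return total_rows - largest_test_fold


def min_rows_ok(min_rows: float, rows: int) -> bool:
    """Bound exactly as worded in the quoted error message.

    Assumption: 'size of the input data' means the number of rows the
    learner actually receives.
    """
    return min_rows <= rows * 2


if __name__ == "__main__":
    # Hypothetical row counts, with number_of_folds = 10 and min_rows = 1.0
    # as set in the posted process.
    for total in (0, 5, 150):
        rows = training_rows_per_fold(total, 10)
        print(total, rows, min_rows_ok(1.0, rows))
```

    Read literally, the bound with min_rows = 1.0 can only fail when the learner receives no rows at all. It may therefore be worth confirming that the retrieved example set is not empty and that the "3 categories" label is not missing for every row. This is a guess from the message alone, not a confirmed diagnosis.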

    Thank you

    Regards,
    Varun
    https://www.varunmandalapu.com/

