Prediction for a model

indu Member Posts: 9 Learner I
I have two datasets, X and Y; both have a "Class" attribute with the labels "0" and "1". I want to combine these two datasets and make predictions with a single model. Can someone share the steps for this task?

Best Answer

  • indu Member Posts: 9 Learner I
    Solution Accepted
    OK, I got the results. Thank you very much for the reply.

Answers

  • lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Hi @indu,

    Hmm... I don't understand what you want to do.
    So that we can help you, can you describe what you have and what you want to obtain?
    Please share your data.

    Regards,

    Lionel
  • indu Member Posts: 9 Learner I
    I want to combine these two datasets and make predictions as a single model. Please advise.
     
  • lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    @indu,
    It's very easy:

    Use an Append operator to combine the two datasets.

    Here is a sample process that combines your two datasets and runs a cross-validation with a Decision Tree:

    <?xml version="1.0" encoding="UTF-8"?><process version="9.5.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.5.000" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="read_csv" compatibility="9.5.000" expanded="true" height="68" name="Read CSV" width="90" x="112" y="85">
            <parameter key="csv_file" value="C:\Users\Lionel\Downloads\aa_counts_dna.csv"/>
            <parameter key="column_separators" value=","/>
            <parameter key="trim_lines" value="false"/>
            <parameter key="use_quotes" value="true"/>
            <parameter key="quotes_character" value="&quot;"/>
            <parameter key="escape_character" value="\"/>
            <parameter key="skip_comments" value="true"/>
            <parameter key="comment_characters" value="#"/>
            <parameter key="starting_row" value="1"/>
            <parameter key="parse_numbers" value="true"/>
            <parameter key="decimal_character" value="."/>
            <parameter key="grouped_digits" value="false"/>
            <parameter key="grouping_character" value=","/>
            <parameter key="infinity_representation" value=""/>
            <parameter key="date_format" value=""/>
            <parameter key="first_row_as_names" value="true"/>
            <list key="annotations"/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="locale" value="English (United States)"/>
            <parameter key="encoding" value="windows-1252"/>
            <parameter key="read_all_values_as_polynominal" value="false"/>
            <list key="data_set_meta_data_information">
              <parameter key="0" value="G.true.real.attribute"/>
              <parameter key="1" value="P.true.real.attribute"/>
              <parameter key="2" value="A.true.real.attribute"/>
              <parameter key="3" value="V.true.real.attribute"/>
              <parameter key="4" value="L.true.real.attribute"/>
              <parameter key="5" value="I.true.real.attribute"/>
              <parameter key="6" value="M.true.real.attribute"/>
              <parameter key="7" value="C.true.real.attribute"/>
              <parameter key="8" value="F.true.real.attribute"/>
              <parameter key="9" value="Y.true.real.attribute"/>
              <parameter key="10" value="W.true.real.attribute"/>
              <parameter key="11" value="H.true.real.attribute"/>
              <parameter key="12" value="K.true.real.attribute"/>
              <parameter key="13" value="R.true.real.attribute"/>
              <parameter key="14" value="Q.true.real.attribute"/>
              <parameter key="15" value="N.true.real.attribute"/>
              <parameter key="16" value="E.true.real.attribute"/>
              <parameter key="17" value="D.true.real.attribute"/>
              <parameter key="18" value="S.true.real.attribute"/>
              <parameter key="19" value="T.true.real.attribute"/>
              <parameter key="20" value="Class.true.integer.attribute"/>
            </list>
            <parameter key="read_not_matching_values_as_missings" value="false"/>
            <parameter key="datamanagement" value="double_array"/>
            <parameter key="data_management" value="auto"/>
          </operator>
          <operator activated="true" class="read_csv" compatibility="9.5.000" expanded="true" height="68" name="Read CSV (2)" width="90" x="112" y="238">
            <parameter key="csv_file" value="C:\Users\Lionel\Downloads\aa_counts_rna.csv"/>
            <parameter key="column_separators" value=","/>
            <parameter key="trim_lines" value="false"/>
            <parameter key="use_quotes" value="true"/>
            <parameter key="quotes_character" value="&quot;"/>
            <parameter key="escape_character" value="\"/>
            <parameter key="skip_comments" value="true"/>
            <parameter key="comment_characters" value="#"/>
            <parameter key="starting_row" value="1"/>
            <parameter key="parse_numbers" value="true"/>
            <parameter key="decimal_character" value="."/>
            <parameter key="grouped_digits" value="false"/>
            <parameter key="grouping_character" value=","/>
            <parameter key="infinity_representation" value=""/>
            <parameter key="date_format" value=""/>
            <parameter key="first_row_as_names" value="true"/>
            <list key="annotations"/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="locale" value="English (United States)"/>
            <parameter key="encoding" value="windows-1252"/>
            <parameter key="read_all_values_as_polynominal" value="false"/>
            <list key="data_set_meta_data_information">
              <parameter key="0" value="G.true.real.attribute"/>
              <parameter key="1" value="P.true.real.attribute"/>
              <parameter key="2" value="A.true.real.attribute"/>
              <parameter key="3" value="V.true.real.attribute"/>
              <parameter key="4" value="L.true.real.attribute"/>
              <parameter key="5" value="I.true.real.attribute"/>
              <parameter key="6" value="M.true.real.attribute"/>
              <parameter key="7" value="C.true.real.attribute"/>
              <parameter key="8" value="F.true.real.attribute"/>
              <parameter key="9" value="Y.true.real.attribute"/>
              <parameter key="10" value="W.true.real.attribute"/>
              <parameter key="11" value="H.true.real.attribute"/>
              <parameter key="12" value="K.true.real.attribute"/>
              <parameter key="13" value="R.true.real.attribute"/>
              <parameter key="14" value="Q.true.real.attribute"/>
              <parameter key="15" value="N.true.real.attribute"/>
              <parameter key="16" value="E.true.real.attribute"/>
              <parameter key="17" value="D.true.real.attribute"/>
              <parameter key="18" value="S.true.real.attribute"/>
              <parameter key="19" value="T.true.real.attribute"/>
              <parameter key="20" value="Class.true.integer.attribute"/>
            </list>
            <parameter key="read_not_matching_values_as_missings" value="false"/>
            <parameter key="datamanagement" value="double_array"/>
            <parameter key="data_management" value="auto"/>
          </operator>
          <operator activated="true" class="append" compatibility="9.5.000" expanded="true" height="103" name="Append" width="90" x="313" y="85">
            <parameter key="datamanagement" value="double_array"/>
            <parameter key="data_management" value="auto"/>
            <parameter key="merge_type" value="all"/>
          </operator>
          <operator activated="true" class="set_role" compatibility="9.5.000" expanded="true" height="82" name="Set Role" width="90" x="447" y="85">
            <parameter key="attribute_name" value="Class"/>
            <parameter key="target_role" value="label"/>
            <list key="set_additional_roles"/>
          </operator>
          <operator activated="true" class="numerical_to_polynominal" compatibility="9.5.000" expanded="true" height="82" name="Numerical to Polynominal" width="90" x="581" y="85">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="Class"/>
            <parameter key="attributes" value=""/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="numeric"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="real"/>
            <parameter key="block_type" value="value_series"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_series_end"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="true"/>
          </operator>
          <operator activated="true" class="concurrency:cross_validation" compatibility="9.5.000" expanded="true" height="145" name="Cross Validation" width="90" x="715" y="85">
            <parameter key="split_on_batch_attribute" value="false"/>
            <parameter key="leave_one_out" value="false"/>
            <parameter key="number_of_folds" value="10"/>
            <parameter key="sampling_type" value="automatic"/>
            <parameter key="use_local_random_seed" value="false"/>
            <parameter key="local_random_seed" value="1992"/>
            <parameter key="enable_parallel_execution" value="true"/>
            <process expanded="true">
              <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="9.5.000" expanded="true" height="103" name="Decision Tree" width="90" x="112" y="34">
                <parameter key="criterion" value="gain_ratio"/>
                <parameter key="maximal_depth" value="10"/>
                <parameter key="apply_pruning" value="true"/>
                <parameter key="confidence" value="0.1"/>
                <parameter key="apply_prepruning" value="true"/>
                <parameter key="minimal_gain" value="0.01"/>
                <parameter key="minimal_leaf_size" value="2"/>
                <parameter key="minimal_size_for_split" value="4"/>
                <parameter key="number_of_prepruning_alternatives" value="3"/>
              </operator>
              <connect from_port="training set" to_op="Decision Tree" to_port="training set"/>
              <connect from_op="Decision Tree" from_port="model" to_port="model"/>
              <portSpacing port="source_training set" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true">
              <operator activated="true" class="apply_model" compatibility="9.5.000" expanded="true" height="82" name="Apply Model" width="90" x="45" y="34">
                <list key="application_parameters"/>
                <parameter key="create_view" value="false"/>
              </operator>
              <operator activated="true" class="performance_binominal_classification" compatibility="9.5.000" expanded="true" height="82" name="Performance" width="90" x="179" y="34">
                <parameter key="manually_set_positive_class" value="false"/>
                <parameter key="main_criterion" value="first"/>
                <parameter key="accuracy" value="true"/>
                <parameter key="classification_error" value="false"/>
                <parameter key="kappa" value="false"/>
                <parameter key="AUC (optimistic)" value="false"/>
                <parameter key="AUC" value="false"/>
                <parameter key="AUC (pessimistic)" value="false"/>
                <parameter key="precision" value="false"/>
                <parameter key="recall" value="false"/>
                <parameter key="lift" value="false"/>
                <parameter key="fallout" value="false"/>
                <parameter key="f_measure" value="false"/>
                <parameter key="false_positive" value="false"/>
                <parameter key="false_negative" value="false"/>
                <parameter key="true_positive" value="false"/>
                <parameter key="true_negative" value="false"/>
                <parameter key="sensitivity" value="false"/>
                <parameter key="specificity" value="false"/>
                <parameter key="youden" value="false"/>
                <parameter key="positive_predictive_value" value="false"/>
                <parameter key="negative_predictive_value" value="false"/>
                <parameter key="psep" value="false"/>
                <parameter key="skip_undefined_labels" value="true"/>
                <parameter key="use_example_weights" value="true"/>
              </operator>
              <connect from_port="model" to_op="Apply Model" to_port="model"/>
              <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
              <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
              <connect from_op="Performance" from_port="performance" to_port="performance 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_test set results" spacing="0"/>
              <portSpacing port="sink_performance 1" spacing="0"/>
              <portSpacing port="sink_performance 2" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Read CSV" from_port="output" to_op="Append" to_port="example set 1"/>
          <connect from_op="Read CSV (2)" from_port="output" to_op="Append" to_port="example set 2"/>
          <connect from_op="Append" from_port="merged set" to_op="Set Role" to_port="example set input"/>
          <connect from_op="Set Role" from_port="example set output" to_op="Numerical to Polynominal" to_port="example set input"/>
          <connect from_op="Numerical to Polynominal" from_port="example set output" to_op="Cross Validation" to_port="example set"/>
          <connect from_op="Cross Validation" from_port="example set" to_port="result 1"/>
          <connect from_op="Cross Validation" from_port="performance 1" to_port="result 2"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
        </process>
      </operator>
    </process>
    
    Hope this helps,

    Regards,

    Lionel



  • indu Member Posts: 9 Learner I
    When I split the data, I get some errors. Please use the process attached below.
  • lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    @indu

    You didn't set the parameters of the Split Data operators; that's why RapidMiner raises an error.
    I must admit that I don't understand what you want to do with these Split Data operators.

    Please use the working process I shared in my last post.
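
    If a Split Data operator is kept in the process anyway, its partitions parameter has to list the ratios explicitly. A minimal sketch of such an operator, assuming a 70/30 split (the ratios, operator name, and canvas coordinates are illustrative assumptions, not values taken from the attached process), could look like this:

    ```xml
    <!-- Hypothetical fragment: Split Data with its partitions set. -->
    <!-- The 0.7 / 0.3 ratios are assumed for illustration only. -->
    <operator activated="true" class="split_data" compatibility="9.5.000" expanded="true" height="103" name="Split Data" width="90" x="447" y="238">
      <enumeration key="partitions">
        <parameter key="ratio" value="0.7"/>
        <parameter key="ratio" value="0.3"/>
      </enumeration>
      <parameter key="sampling_type" value="automatic"/>
      <parameter key="use_local_random_seed" value="false"/>
      <parameter key="local_random_seed" value="1992"/>
    </operator>
    ```

    Each ratio entry should produce one output port (partition 1, partition 2, and so on). Note that the Cross Validation operator in the shared process already splits the data into training and test folds, so a separate Split Data step is likely unnecessary there.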

    Regards,

    Lionel