Compare predicted results from deep learning to actual in the validation set

bsegal Member Posts: 7 Contributor II
edited November 2018 in Help

I am a beginner so I apologize in advance if this is obvious, but the online chat folks suggested I post here!

 

I am trying to train a deep neural network to make a binary prediction ("hard" vs. "easy") from a set of real-valued parameters and a couple of nominal parameters. I read the labelled training set from Excel and added a Set Role block to mark the "answer" column, called "class", as the label. Then I passed the data to the Deep Learning block. I took the trained model and fed it to an Apply Model block, with an unlabelled validation set as the data input, and wired both outputs to the results on the far right. What I get is the assigned predictions in a new column ("Prediction(class)", where "class" was the label).

What I need to do now is see how well it did by comparing the actual values to the predictions. Because the validation set is unlabelled, the actual values are not present in that Excel sheet. I have them, of course, in the original data, but I removed them to make the validation set unlabelled. So basically I want to evaluate the performance of the prediction.
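For reference, the comparison described above boils down to lining up the held-back labels with the "Prediction(class)" column row by row and counting agreements. Here is a minimal sketch in plain Python (outside RapidMiner), assuming both columns are in the same row order. The `compare` helper and the example values are hypothetical and are not taken from the attached spreadsheet.

```python
from collections import Counter

def compare(actual, predicted):
    """Return (accuracy, confusion) for two equal-length label lists."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Count each (actual, predicted) pair; this is the confusion matrix.
    confusion = Counter(zip(actual, predicted))
    correct = sum(n for (a, p), n in confusion.items() if a == p)
    accuracy = correct / len(actual) if actual else 0.0
    return accuracy, confusion

# Hypothetical values, only to illustrate the row-by-row comparison.
actual = ["hard", "easy", "easy", "hard", "easy"]
predicted = ["hard", "easy", "hard", "hard", "easy"]

accuracy, confusion = compare(actual, predicted)
print(f"accuracy = {accuracy:.2f}")
for (a, p), n in sorted(confusion.items()):
    print(f"actual={a:<4} predicted={p:<4} count={n}")
```

The off-diagonal pairs (actual "easy" predicted "hard", and the reverse) are the ones to look at first, since accuracy alone hides which kind of mistake the model makes.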

 

My wiring and output data are appended.

Thanks so much!

Best Answer

  • bsegal Posts: 7 Contributor II
    Solution Accepted

    OK thanks, I will run these for now. We do have a lot more data, though it's not "enriched" in difficult (vs. easy) cases like these original sets, which were derived after the fact to yield exactly 50/50. The new data set is prospective and has only ~10% difficult cases, but it does have several hundred rows and is growing. I'll likely be back for help with the DL!

Answers

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member Posts: 1,968  Community Manager

    Hello @bsegal, welcome to the community! I'd recommend posting your XML process here (see "Read Before Posting" on the right when you reply) and attaching your dataset. This way we can replicate what you're doing and help you better.

    Scott

  • bsegal Member Posts: 7 Contributor II

    Thanks. Enclosed are the XML and the Excel file with the data. The labelled training set is tab 2, and the unlabelled validation set is tab 3. All of the data together is on tab 1.

    <?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
    <operator activated="true" class="read_excel" compatibility="8.0.001" expanded="true" height="68" name="Read Excel" width="90" x="380" y="340">
    <parameter key="excel_file" value="/Users/scottsegalmd/Documents/AW computer study/Deep learning/openface embeds with demographics.xlsx"/>
    <parameter key="sheet_number" value="3"/>
    <parameter key="imported_cell_range" value="A1:ED41"/>
    <parameter key="encoding" value="SYSTEM"/>
    <parameter key="first_row_as_names" value="false"/>
    <list key="annotations">
    <parameter key="0" value="Name"/>
    </list>
    <parameter key="date_format" value=""/>
    <parameter key="time_zone" value="SYSTEM"/>
    <parameter key="locale" value="English (United States)"/>
    <list key="data_set_meta_data_information">
    <parameter key="0" value="Embed 1.true.real.attribute"/>
    <parameter key="1" value="Embed 2.true.real.attribute"/>
    <parameter key="2" value="Embed 3.true.real.attribute"/>
    <parameter key="3" value="Embed 4.true.real.attribute"/>
    <parameter key="4" value="Embed 5.true.real.attribute"/>
    <parameter key="5" value="Embed 6.true.real.attribute"/>
    <parameter key="6" value="Embed 7.true.real.attribute"/>
    <parameter key="7" value="Embed 8.true.real.attribute"/>
    <parameter key="8" value="Embed 9.true.real.attribute"/>
    <parameter key="9" value="Embed 10.true.real.attribute"/>
    <parameter key="10" value="Embed 11.true.real.attribute"/>
    <parameter key="11" value="Embed 12.true.real.attribute"/>
    <parameter key="12" value="Embed 13.true.real.attribute"/>
    <parameter key="13" value="Embed 14.true.real.attribute"/>
    <parameter key="14" value="Embed 15.true.real.attribute"/>
    <parameter key="15" value="Embed 16.true.real.attribute"/>
    <parameter key="16" value="Embed 17.true.real.attribute"/>
    <parameter key="17" value="Embed 18.true.real.attribute"/>
    <parameter key="18" value="Embed 19.true.real.attribute"/>
    <parameter key="19" value="Embed 20.true.real.attribute"/>
    <parameter key="20" value="Embed 21.true.real.attribute"/>
    <parameter key="21" value="Embed 22.true.real.attribute"/>
    <parameter key="22" value="Embed 23.true.real.attribute"/>
    <parameter key="23" value="Embed 24.true.real.attribute"/>
    <parameter key="24" value="Embed 25.true.real.attribute"/>
    <parameter key="25" value="Embed 26.true.real.attribute"/>
    <parameter key="26" value="Embed 27.true.real.attribute"/>
    <parameter key="27" value="Embed 28.true.real.attribute"/>
    <parameter key="28" value="Embed 29.true.real.attribute"/>
    <parameter key="29" value="Embed 30.true.real.attribute"/>
    <parameter key="30" value="Embed 31.true.real.attribute"/>
    <parameter key="31" value="Embed 32.true.real.attribute"/>
    <parameter key="32" value="Embed 33.true.real.attribute"/>
    <parameter key="33" value="Embed 34.true.real.attribute"/>
    <parameter key="34" value="Embed 35.true.real.attribute"/>
    <parameter key="35" value="Embed 36.true.real.attribute"/>
    <parameter key="36" value="Embed 37.true.real.attribute"/>
    <parameter key="37" value="Embed 38.true.real.attribute"/>
    <parameter key="38" value="Embed 39.true.real.attribute"/>
    <parameter key="39" value="Embed 40.true.real.attribute"/>
    <parameter key="40" value="Embed 41.true.real.attribute"/>
    <parameter key="41" value="Embed 42.true.real.attribute"/>
    <parameter key="42" value="Embed 43.true.real.attribute"/>
    <parameter key="43" value="Embed 44.true.real.attribute"/>
    <parameter key="44" value="Embed 45.true.real.attribute"/>
    <parameter key="45" value="Embed 46.true.real.attribute"/>
    <parameter key="46" value="Embed 47.true.real.attribute"/>
    <parameter key="47" value="Embed 48.true.real.attribute"/>
    <parameter key="48" value="Embed 49.true.real.attribute"/>
    <parameter key="49" value="Embed 50.true.real.attribute"/>
    <parameter key="50" value="Embed 51.true.real.attribute"/>
    <parameter key="51" value="Embed 52.true.real.attribute"/>
    <parameter key="52" value="Embed 53.true.real.attribute"/>
    <parameter key="53" value="Embed 54.true.real.attribute"/>
    <parameter key="54" value="Embed 55.true.real.attribute"/>
    <parameter key="55" value="Embed 56.true.real.attribute"/>
    <parameter key="56" value="Embed 57.true.real.attribute"/>
    <parameter key="57" value="Embed 58.true.real.attribute"/>
    <parameter key="58" value="Embed 59.true.real.attribute"/>
    <parameter key="59" value="Embed 60.true.real.attribute"/>
    <parameter key="60" value="Embed 61.true.real.attribute"/>
    <parameter key="61" value="Embed 62.true.real.attribute"/>
    <parameter key="62" value="Embed 63.true.real.attribute"/>
    <parameter key="63" value="Embed 64.true.real.attribute"/>
    <parameter key="64" value="Embed 65.true.real.attribute"/>
    <parameter key="65" value="Embed 66.true.real.attribute"/>
    <parameter key="66" value="Embed 67.true.real.attribute"/>
    <parameter key="67" value="Embed 68.true.real.attribute"/>
    <parameter key="68" value="Embed 69.true.real.attribute"/>
    <parameter key="69" value="Embed 70.true.real.attribute"/>
    <parameter key="70" value="Embed 71.true.real.attribute"/>
    <parameter key="71" value="Embed 72.true.real.attribute"/>
    <parameter key="72" value="Embed 73.true.real.attribute"/>
    <parameter key="73" value="Embed 74.true.real.attribute"/>
    <parameter key="74" value="Embed 75.true.real.attribute"/>
    <parameter key="75" value="Embed 76.true.real.attribute"/>
    <parameter key="76" value="Embed 77.true.real.attribute"/>
    <parameter key="77" value="Embed 78.true.real.attribute"/>
    <parameter key="78" value="Embed 79.true.real.attribute"/>
    <parameter key="79" value="Embed 80.true.real.attribute"/>
    <parameter key="80" value="Embed 81.true.real.attribute"/>
    <parameter key="81" value="Embed 82.true.real.attribute"/>
    <parameter key="82" value="Embed 83.true.real.attribute"/>
    <parameter key="83" value="Embed 84.true.real.attribute"/>
    <parameter key="84" value="Embed 85.true.real.attribute"/>
    <parameter key="85" value="Embed 86.true.real.attribute"/>
    <parameter key="86" value="Embed 87.true.real.attribute"/>
    <parameter key="87" value="Embed 88.true.real.attribute"/>
    <parameter key="88" value="Embed 89.true.real.attribute"/>
    <parameter key="89" value="Embed 90.true.real.attribute"/>
    <parameter key="90" value="Embed 91.true.real.attribute"/>
    <parameter key="91" value="Embed 92.true.real.attribute"/>
    <parameter key="92" value="Embed 93.true.real.attribute"/>
    <parameter key="93" value="Embed 94.true.real.attribute"/>
    <parameter key="94" value="Embed 95.true.real.attribute"/>
    <parameter key="95" value="Embed 96.true.real.attribute"/>
    <parameter key="96" value="Embed 97.true.real.attribute"/>
    <parameter key="97" value="Embed 98.true.real.attribute"/>
    <parameter key="98" value="Embed 99.true.real.attribute"/>
    <parameter key="99" value="Embed 100.true.real.attribute"/>
    <parameter key="100" value="Embed 101.true.real.attribute"/>
    <parameter key="101" value="Embed 102.true.real.attribute"/>
    <parameter key="102" value="Embed 103.true.real.attribute"/>
    <parameter key="103" value="Embed 104.true.real.attribute"/>
    <parameter key="104" value="Embed 105.true.real.attribute"/>
    <parameter key="105" value="Embed 106.true.real.attribute"/>
    <parameter key="106" value="Embed 107.true.real.attribute"/>
    <parameter key="107" value="Embed 108.true.real.attribute"/>
    <parameter key="108" value="Embed 109.true.real.attribute"/>
    <parameter key="109" value="Embed 110.true.real.attribute"/>
    <parameter key="110" value="Embed 111.true.real.attribute"/>
    <parameter key="111" value="Embed 112.true.real.attribute"/>
    <parameter key="112" value="Embed 113.true.real.attribute"/>
    <parameter key="113" value="Embed 114.true.real.attribute"/>
    <parameter key="114" value="Embed 115.true.real.attribute"/>
    <parameter key="115" value="Embed 116.true.real.attribute"/>
    <parameter key="116" value="Embed 117.true.real.attribute"/>
    <parameter key="117" value="Embed 118.true.real.attribute"/>
    <parameter key="118" value="Embed 119.true.real.attribute"/>
    <parameter key="119" value="Embed 120.true.real.attribute"/>
    <parameter key="120" value="Embed 121.true.real.attribute"/>
    <parameter key="121" value="Embed 122.true.real.attribute"/>
    <parameter key="122" value="Embed 123.true.real.attribute"/>
    <parameter key="123" value="Embed 124.true.real.attribute"/>
    <parameter key="124" value="Embed 125.true.real.attribute"/>
    <parameter key="125" value="Embed 126.true.real.attribute"/>
    <parameter key="126" value="Embed 127.true.real.attribute"/>
    <parameter key="127" value="Embed 128.true.real.attribute"/>
    <parameter key="128" value="Age.true.integer.attribute"/>
    <parameter key="129" value="Height.true.integer.attribute"/>
    <parameter key="130" value="Weight.true.integer.attribute"/>
    <parameter key="131" value="BMI.true.numeric.attribute"/>
    <parameter key="132" value="MP.true.nominal.attribute"/>
    <parameter key="133" value="TMD.true.numeric.attribute"/>
    </list>
    <parameter key="read_not_matching_values_as_missings" value="true"/>
    <parameter key="datamanagement" value="double_array"/>
    <parameter key="data_management" value="auto"/>
    </operator>
    <operator activated="true" class="read_excel" compatibility="8.0.001" expanded="true" height="68" name="Read Excel (2)" width="90" x="45" y="136">
    <parameter key="excel_file" value="/Users/scottsegalmd/Documents/AW computer study/Deep learning/openface embeds with demographics.xlsx"/>
    <parameter key="sheet_number" value="2"/>
    <parameter key="imported_cell_range" value="A1:EE41"/>
    <parameter key="encoding" value="SYSTEM"/>
    <parameter key="first_row_as_names" value="false"/>
    <list key="annotations">
    <parameter key="0" value="Name"/>
    </list>
    <parameter key="date_format" value=""/>
    <parameter key="time_zone" value="SYSTEM"/>
    <parameter key="locale" value="English (United States)"/>
    <list key="data_set_meta_data_information">
    <parameter key="0" value="Embed 1.true.real.attribute"/>
    <parameter key="1" value="Embed 2.true.real.attribute"/>
    <parameter key="2" value="Embed 3.true.real.attribute"/>
    <parameter key="3" value="Embed 4.true.real.attribute"/>
    <parameter key="4" value="Embed 5.true.real.attribute"/>
    <parameter key="5" value="Embed 6.true.real.attribute"/>
    <parameter key="6" value="Embed 7.true.real.attribute"/>
    <parameter key="7" value="Embed 8.true.real.attribute"/>
    <parameter key="8" value="Embed 9.true.real.attribute"/>
    <parameter key="9" value="Embed 10.true.real.attribute"/>
    <parameter key="10" value="Embed 11.true.real.attribute"/>
    <parameter key="11" value="Embed 12.true.real.attribute"/>
    <parameter key="12" value="Embed 13.true.real.attribute"/>
    <parameter key="13" value="Embed 14.true.real.attribute"/>
    <parameter key="14" value="Embed 15.true.real.attribute"/>
    <parameter key="15" value="Embed 16.true.real.attribute"/>
    <parameter key="16" value="Embed 17.true.real.attribute"/>
    <parameter key="17" value="Embed 18.true.real.attribute"/>
    <parameter key="18" value="Embed 19.true.real.attribute"/>
    <parameter key="19" value="Embed 20.true.real.attribute"/>
    <parameter key="20" value="Embed 21.true.real.attribute"/>
    <parameter key="21" value="Embed 22.true.real.attribute"/>
    <parameter key="22" value="Embed 23.true.real.attribute"/>
    <parameter key="23" value="Embed 24.true.real.attribute"/>
    <parameter key="24" value="Embed 25.true.real.attribute"/>
    <parameter key="25" value="Embed 26.true.real.attribute"/>
    <parameter key="26" value="Embed 27.true.real.attribute"/>
    <parameter key="27" value="Embed 28.true.real.attribute"/>
    <parameter key="28" value="Embed 29.true.real.attribute"/>
    <parameter key="29" value="Embed 30.true.real.attribute"/>
    <parameter key="30" value="Embed 31.true.real.attribute"/>
    <parameter key="31" value="Embed 32.true.real.attribute"/>
    <parameter key="32" value="Embed 33.true.real.attribute"/>
    <parameter key="33" value="Embed 34.true.real.attribute"/>
    <parameter key="34" value="Embed 35.true.real.attribute"/>
    <parameter key="35" value="Embed 36.true.real.attribute"/>
    <parameter key="36" value="Embed 37.true.real.attribute"/>
    <parameter key="37" value="Embed 38.true.real.attribute"/>
    <parameter key="38" value="Embed 39.true.real.attribute"/>
    <parameter key="39" value="Embed 40.true.real.attribute"/>
    <parameter key="40" value="Embed 41.true.real.attribute"/>
    <parameter key="41" value="Embed 42.true.real.attribute"/>
    <parameter key="42" value="Embed 43.true.real.attribute"/>
    <parameter key="43" value="Embed 44.true.real.attribute"/>
    <parameter key="44" value="Embed 45.true.real.attribute"/>
    <parameter key="45" value="Embed 46.true.real.attribute"/>
    <parameter key="46" value="Embed 47.true.real.attribute"/>
    <parameter key="47" value="Embed 48.true.real.attribute"/>
    <parameter key="48" value="Embed 49.true.real.attribute"/>
    <parameter key="49" value="Embed 50.true.real.attribute"/>
    <parameter key="50" value="Embed 51.true.real.attribute"/>
    <parameter key="51" value="Embed 52.true.real.attribute"/>
    <parameter key="52" value="Embed 53.true.real.attribute"/>
    <parameter key="53" value="Embed 54.true.real.attribute"/>
    <parameter key="54" value="Embed 55.true.real.attribute"/>
    <parameter key="55" value="Embed 56.true.real.attribute"/>
    <parameter key="56" value="Embed 57.true.real.attribute"/>
    <parameter key="57" value="Embed 58.true.real.attribute"/>
    <parameter key="58" value="Embed 59.true.real.attribute"/>
    <parameter key="59" value="Embed 60.true.real.attribute"/>
    <parameter key="60" value="Embed 61.true.real.attribute"/>
    <parameter key="61" value="Embed 62.true.real.attribute"/>
    <parameter key="62" value="Embed 63.true.real.attribute"/>
    <parameter key="63" value="Embed 64.true.real.attribute"/>
    <parameter key="64" value="Embed 65.true.real.attribute"/>
    <parameter key="65" value="Embed 66.true.real.attribute"/>
    <parameter key="66" value="Embed 67.true.real.attribute"/>
    <parameter key="67" value="Embed 68.true.real.attribute"/>
    <parameter key="68" value="Embed 69.true.real.attribute"/>
    <parameter key="69" value="Embed 70.true.real.attribute"/>
    <parameter key="70" value="Embed 71.true.real.attribute"/>
    <parameter key="71" value="Embed 72.true.real.attribute"/>
    <parameter key="72" value="Embed 73.true.real.attribute"/>
    <parameter key="73" value="Embed 74.true.real.attribute"/>
    <parameter key="74" value="Embed 75.true.real.attribute"/>
    <parameter key="75" value="Embed 76.true.real.attribute"/>
    <parameter key="76" value="Embed 77.true.real.attribute"/>
    <parameter key="77" value="Embed 78.true.real.attribute"/>
    <parameter key="78" value="Embed 79.true.real.attribute"/>
    <parameter key="79" value="Embed 80.true.real.attribute"/>
    <parameter key="80" value="Embed 81.true.real.attribute"/>
    <parameter key="81" value="Embed 82.true.real.attribute"/>
    <parameter key="82" value="Embed 83.true.real.attribute"/>
    <parameter key="83" value="Embed 84.true.real.attribute"/>
    <parameter key="84" value="Embed 85.true.real.attribute"/>
    <parameter key="85" value="Embed 86.true.real.attribute"/>
    <parameter key="86" value="Embed 87.true.real.attribute"/>
    <parameter key="87" value="Embed 88.true.real.attribute"/>
    <parameter key="88" value="Embed 89.true.real.attribute"/>
    <parameter key="89" value="Embed 90.true.real.attribute"/>
    <parameter key="90" value="Embed 91.true.real.attribute"/>
    <parameter key="91" value="Embed 92.true.real.attribute"/>
    <parameter key="92" value="Embed 93.true.real.attribute"/>
    <parameter key="93" value="Embed 94.true.real.attribute"/>
    <parameter key="94" value="Embed 95.true.real.attribute"/>
    <parameter key="95" value="Embed 96.true.real.attribute"/>
    <parameter key="96" value="Embed 97.true.real.attribute"/>
    <parameter key="97" value="Embed 98.true.real.attribute"/>
    <parameter key="98" value="Embed 99.true.real.attribute"/>
    <parameter key="99" value="Embed 100.true.real.attribute"/>
    <parameter key="100" value="Embed 101.true.real.attribute"/>
    <parameter key="101" value="Embed 102.true.real.attribute"/>
    <parameter key="102" value="Embed 103.true.real.attribute"/>
    <parameter key="103" value="Embed 104.true.real.attribute"/>
    <parameter key="104" value="Embed 105.true.real.attribute"/>
    <parameter key="105" value="Embed 106.true.real.attribute"/>
    <parameter key="106" value="Embed 107.true.real.attribute"/>
    <parameter key="107" value="Embed 108.true.real.attribute"/>
    <parameter key="108" value="Embed 109.true.real.attribute"/>
    <parameter key="109" value="Embed 110.true.real.attribute"/>
    <parameter key="110" value="Embed 111.true.real.attribute"/>
    <parameter key="111" value="Embed 112.true.real.attribute"/>
    <parameter key="112" value="Embed 113.true.real.attribute"/>
    <parameter key="113" value="Embed 114.true.real.attribute"/>
    <parameter key="114" value="Embed 115.true.real.attribute"/>
    <parameter key="115" value="Embed 116.true.real.attribute"/>
    <parameter key="116" value="Embed 117.true.real.attribute"/>
    <parameter key="117" value="Embed 118.true.real.attribute"/>
    <parameter key="118" value="Embed 119.true.real.attribute"/>
    <parameter key="119" value="Embed 120.true.real.attribute"/>
    <parameter key="120" value="Embed 121.true.real.attribute"/>
    <parameter key="121" value="Embed 122.true.real.attribute"/>
    <parameter key="122" value="Embed 123.true.real.attribute"/>
    <parameter key="123" value="Embed 124.true.real.attribute"/>
    <parameter key="124" value="Embed 125.true.real.attribute"/>
    <parameter key="125" value="Embed 126.true.real.attribute"/>
    <parameter key="126" value="Embed 127.true.real.attribute"/>
    <parameter key="127" value="Embed 128.true.real.attribute"/>
    <parameter key="128" value="Class.true.nominal.attribute"/>
    <parameter key="129" value="Age.true.integer.attribute"/>
    <parameter key="130" value="Height.true.integer.attribute"/>
    <parameter key="131" value="Weight.true.numeric.attribute"/>
    <parameter key="132" value="BMI.true.real.attribute"/>
    <parameter key="133" value="MP.true.nominal.attribute"/>
    <parameter key="134" value="TMD.true.numeric.attribute"/>
    </list>
    <parameter key="read_not_matching_values_as_missings" value="true"/>
    <parameter key="datamanagement" value="double_array"/>
    <parameter key="data_management" value="auto"/>
    </operator>
    <operator activated="true" class="set_role" compatibility="8.0.001" expanded="true" height="82" name="Set Role" width="90" x="179" y="136">
    <parameter key="attribute_name" value="Class"/>
    <parameter key="target_role" value="label"/>
    <list key="set_additional_roles"/>
    </operator>
    <operator activated="true" class="h2o:deep_learning" compatibility="7.6.001" expanded="true" height="82" name="Deep Learning" width="90" x="313" y="136">
    <parameter key="activation" value="Rectifier"/>
    <enumeration key="hidden_layer_sizes">
    <parameter key="hidden_layer_sizes" value="50"/>
    <parameter key="hidden_layer_sizes" value="50"/>
    </enumeration>
    <enumeration key="hidden_dropout_ratios"/>
    <parameter key="reproducible_(uses_1_thread)" value="false"/>
    <parameter key="use_local_random_seed" value="false"/>
    <parameter key="local_random_seed" value="1992"/>
    <parameter key="epochs" value="10.0"/>
    <parameter key="compute_variable_importances" value="false"/>
    <parameter key="train_samples_per_iteration" value="-2"/>
    <parameter key="adaptive_rate" value="true"/>
    <parameter key="epsilon" value="1.0E-8"/>
    <parameter key="rho" value="0.99"/>
    <parameter key="learning_rate" value="0.005"/>
    <parameter key="learning_rate_annealing" value="1.0E-6"/>
    <parameter key="learning_rate_decay" value="1.0"/>
    <parameter key="momentum_start" value="0.0"/>
    <parameter key="momentum_ramp" value="1000000.0"/>
    <parameter key="momentum_stable" value="0.0"/>
    <parameter key="nesterov_accelerated_gradient" value="true"/>
    <parameter key="standardize" value="true"/>
    <parameter key="L1" value="1.0E-5"/>
    <parameter key="L2" value="0.0"/>
    <parameter key="max_w2" value="10.0"/>
    <parameter key="loss_function" value="Automatic"/>
    <parameter key="distribution_function" value="AUTO"/>
    <parameter key="early_stopping" value="false"/>
    <parameter key="stopping_rounds" value="1"/>
    <parameter key="stopping_metric" value="AUTO"/>
    <parameter key="stopping_tolerance" value="0.001"/>
    <parameter key="missing_values_handling" value="MeanImputation"/>
    <parameter key="max_runtime_seconds" value="0"/>
    <list key="expert_parameters"/>
    <list key="expert_parameters_"/>
    </operator>
    <operator activated="true" class="apply_model" compatibility="8.0.001" expanded="true" height="82" name="Apply Model" width="90" x="447" y="85">
    <list key="application_parameters"/>
    <parameter key="create_view" value="false"/>
    </operator>
    <connect from_op="Read Excel" from_port="output" to_op="Apply Model" to_port="unlabelled data"/>
    <connect from_op="Read Excel (2)" from_port="output" to_op="Set Role" to_port="example set input"/>
    <connect from_op="Set Role" from_port="example set output" to_op="Deep Learning" to_port="training set"/>
    <connect from_op="Deep Learning" from_port="model" to_op="Apply Model" to_port="model"/>
    <connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
    <connect from_op="Apply Model" from_port="model" to_port="result 2"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    </process>
    </operator>
    </process>
  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member Posts: 1,968  Community Manager

    Hi @bsegal, OK, I think I understand. Normally we prefer to use cross-validation when building our models to guard against overfitting: the performance is measured on held-out folds of the labelled training data, and the trained model is then applied to the unlabelled data.
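To make the idea concrete, here is a rough sketch of k-fold cross-validation in plain Python. It is not how RapidMiner's own operator is implemented. It assumes contiguous, unshuffled folds with `k` no larger than the number of rows, and it uses a hypothetical majority-class learner (`fit_majority`) as a stand-in for the Deep Learning operator. All function names are made up for illustration.

```python
from collections import Counter

def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) for k contiguous folds over range(n)."""
    # Spread the remainder over the first n % k folds.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size

def fit_majority(labels):
    """Hypothetical stand-in learner: always predicts the most common label."""
    return Counter(labels).most_common(1)[0][0]

def cross_validate(labels, k):
    """Mean accuracy over k folds using the majority-class stand-in."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        # Train only on the rows outside the current fold.
        model = fit_majority([labels[i] for i in train_idx])
        # Score only on the held-out rows.
        hits = sum(1 for i in test_idx if labels[i] == model)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)

# Hypothetical 50/50 labels, mirroring the balanced training set described above.
labels = ["hard", "easy"] * 5
print(f"mean accuracy over 5 folds: {cross_validate(labels, 5):.2f}")
```

Every row is scored exactly once, by a model that never saw it during training, which is what makes the averaged figure a fairer estimate than accuracy on the training rows themselves. A real run would normally shuffle or stratify the folds first.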

     

    <?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="read_excel" compatibility="8.0.001" expanded="true" height="68" name="Read Excel" width="90" x="380" y="340">
    <parameter key="excel_file" value="/Users/GenzerConsulting/Desktop/openface embeds with demographics.xlsx"/>
    <parameter key="sheet_number" value="3"/>
    <parameter key="imported_cell_range" value="A1:ED41"/>
    <parameter key="first_row_as_names" value="false"/>
    <list key="annotations">
    <parameter key="0" value="Name"/>
    </list>
    <list key="data_set_meta_data_information">
    <parameter key="0" value="Embed 1.true.real.attribute"/>
    <parameter key="1" value="Embed 2.true.real.attribute"/>
    <parameter key="2" value="Embed 3.true.real.attribute"/>
    <parameter key="3" value="Embed 4.true.real.attribute"/>
    <parameter key="4" value="Embed 5.true.real.attribute"/>
    <parameter key="5" value="Embed 6.true.real.attribute"/>
    <parameter key="6" value="Embed 7.true.real.attribute"/>
    <parameter key="7" value="Embed 8.true.real.attribute"/>
    <parameter key="8" value="Embed 9.true.real.attribute"/>
    <parameter key="9" value="Embed 10.true.real.attribute"/>
    <parameter key="10" value="Embed 11.true.real.attribute"/>
    <parameter key="11" value="Embed 12.true.real.attribute"/>
    <parameter key="12" value="Embed 13.true.real.attribute"/>
    <parameter key="13" value="Embed 14.true.real.attribute"/>
    <parameter key="14" value="Embed 15.true.real.attribute"/>
    <parameter key="15" value="Embed 16.true.real.attribute"/>
    <parameter key="16" value="Embed 17.true.real.attribute"/>
    <parameter key="17" value="Embed 18.true.real.attribute"/>
    <parameter key="18" value="Embed 19.true.real.attribute"/>
    <parameter key="19" value="Embed 20.true.real.attribute"/>
    <parameter key="20" value="Embed 21.true.real.attribute"/>
    <parameter key="21" value="Embed 22.true.real.attribute"/>
    <parameter key="22" value="Embed 23.true.real.attribute"/>
    <parameter key="23" value="Embed 24.true.real.attribute"/>
    <parameter key="24" value="Embed 25.true.real.attribute"/>
    <parameter key="25" value="Embed 26.true.real.attribute"/>
    <parameter key="26" value="Embed 27.true.real.attribute"/>
    <parameter key="27" value="Embed 28.true.real.attribute"/>
    <parameter key="28" value="Embed 29.true.real.attribute"/>
    <parameter key="29" value="Embed 30.true.real.attribute"/>
    <parameter key="30" value="Embed 31.true.real.attribute"/>
    <parameter key="31" value="Embed 32.true.real.attribute"/>
    <parameter key="32" value="Embed 33.true.real.attribute"/>
    <parameter key="33" value="Embed 34.true.real.attribute"/>
    <parameter key="34" value="Embed 35.true.real.attribute"/>
    <parameter key="35" value="Embed 36.true.real.attribute"/>
    <parameter key="36" value="Embed 37.true.real.attribute"/>
    <parameter key="37" value="Embed 38.true.real.attribute"/>
    <parameter key="38" value="Embed 39.true.real.attribute"/>
    <parameter key="39" value="Embed 40.true.real.attribute"/>
    <parameter key="40" value="Embed 41.true.real.attribute"/>
    <parameter key="41" value="Embed 42.true.real.attribute"/>
    <parameter key="42" value="Embed 43.true.real.attribute"/>
    <parameter key="43" value="Embed 44.true.real.attribute"/>
    <parameter key="44" value="Embed 45.true.real.attribute"/>
    <parameter key="45" value="Embed 46.true.real.attribute"/>
    <parameter key="46" value="Embed 47.true.real.attribute"/>
    <parameter key="47" value="Embed 48.true.real.attribute"/>
    <parameter key="48" value="Embed 49.true.real.attribute"/>
    <parameter key="49" value="Embed 50.true.real.attribute"/>
    <parameter key="50" value="Embed 51.true.real.attribute"/>
    <parameter key="51" value="Embed 52.true.real.attribute"/>
    <parameter key="52" value="Embed 53.true.real.attribute"/>
    <parameter key="53" value="Embed 54.true.real.attribute"/>
    <parameter key="54" value="Embed 55.true.real.attribute"/>
    <parameter key="55" value="Embed 56.true.real.attribute"/>
    <parameter key="56" value="Embed 57.true.real.attribute"/>
    <parameter key="57" value="Embed 58.true.real.attribute"/>
    <parameter key="58" value="Embed 59.true.real.attribute"/>
    <parameter key="59" value="Embed 60.true.real.attribute"/>
    <parameter key="60" value="Embed 61.true.real.attribute"/>
    <parameter key="61" value="Embed 62.true.real.attribute"/>
    <parameter key="62" value="Embed 63.true.real.attribute"/>
    <parameter key="63" value="Embed 64.true.real.attribute"/>
    <parameter key="64" value="Embed 65.true.real.attribute"/>
    <parameter key="65" value="Embed 66.true.real.attribute"/>
    <parameter key="66" value="Embed 67.true.real.attribute"/>
    <parameter key="67" value="Embed 68.true.real.attribute"/>
    <parameter key="68" value="Embed 69.true.real.attribute"/>
    <parameter key="69" value="Embed 70.true.real.attribute"/>
    <parameter key="70" value="Embed 71.true.real.attribute"/>
    <parameter key="71" value="Embed 72.true.real.attribute"/>
    <parameter key="72" value="Embed 73.true.real.attribute"/>
    <parameter key="73" value="Embed 74.true.real.attribute"/>
    <parameter key="74" value="Embed 75.true.real.attribute"/>
    <parameter key="75" value="Embed 76.true.real.attribute"/>
    <parameter key="76" value="Embed 77.true.real.attribute"/>
    <parameter key="77" value="Embed 78.true.real.attribute"/>
    <parameter key="78" value="Embed 79.true.real.attribute"/>
    <parameter key="79" value="Embed 80.true.real.attribute"/>
    <parameter key="80" value="Embed 81.true.real.attribute"/>
    <parameter key="81" value="Embed 82.true.real.attribute"/>
    <parameter key="82" value="Embed 83.true.real.attribute"/>
    <parameter key="83" value="Embed 84.true.real.attribute"/>
    <parameter key="84" value="Embed 85.true.real.attribute"/>
    <parameter key="85" value="Embed 86.true.real.attribute"/>
    <parameter key="86" value="Embed 87.true.real.attribute"/>
    <parameter key="87" value="Embed 88.true.real.attribute"/>
    <parameter key="88" value="Embed 89.true.real.attribute"/>
    <parameter key="89" value="Embed 90.true.real.attribute"/>
    <parameter key="90" value="Embed 91.true.real.attribute"/>
    <parameter key="91" value="Embed 92.true.real.attribute"/>
    <parameter key="92" value="Embed 93.true.real.attribute"/>
    <parameter key="93" value="Embed 94.true.real.attribute"/>
    <parameter key="94" value="Embed 95.true.real.attribute"/>
    <parameter key="95" value="Embed 96.true.real.attribute"/>
    <parameter key="96" value="Embed 97.true.real.attribute"/>
    <parameter key="97" value="Embed 98.true.real.attribute"/>
    <parameter key="98" value="Embed 99.true.real.attribute"/>
    <parameter key="99" value="Embed 100.true.real.attribute"/>
    <parameter key="100" value="Embed 101.true.real.attribute"/>
    <parameter key="101" value="Embed 102.true.real.attribute"/>
    <parameter key="102" value="Embed 103.true.real.attribute"/>
    <parameter key="103" value="Embed 104.true.real.attribute"/>
    <parameter key="104" value="Embed 105.true.real.attribute"/>
    <parameter key="105" value="Embed 106.true.real.attribute"/>
    <parameter key="106" value="Embed 107.true.real.attribute"/>
    <parameter key="107" value="Embed 108.true.real.attribute"/>
    <parameter key="108" value="Embed 109.true.real.attribute"/>
    <parameter key="109" value="Embed 110.true.real.attribute"/>
    <parameter key="110" value="Embed 111.true.real.attribute"/>
    <parameter key="111" value="Embed 112.true.real.attribute"/>
    <parameter key="112" value="Embed 113.true.real.attribute"/>
    <parameter key="113" value="Embed 114.true.real.attribute"/>
    <parameter key="114" value="Embed 115.true.real.attribute"/>
    <parameter key="115" value="Embed 116.true.real.attribute"/>
    <parameter key="116" value="Embed 117.true.real.attribute"/>
    <parameter key="117" value="Embed 118.true.real.attribute"/>
    <parameter key="118" value="Embed 119.true.real.attribute"/>
    <parameter key="119" value="Embed 120.true.real.attribute"/>
    <parameter key="120" value="Embed 121.true.real.attribute"/>
    <parameter key="121" value="Embed 122.true.real.attribute"/>
    <parameter key="122" value="Embed 123.true.real.attribute"/>
    <parameter key="123" value="Embed 124.true.real.attribute"/>
    <parameter key="124" value="Embed 125.true.real.attribute"/>
    <parameter key="125" value="Embed 126.true.real.attribute"/>
    <parameter key="126" value="Embed 127.true.real.attribute"/>
    <parameter key="127" value="Embed 128.true.real.attribute"/>
    <parameter key="128" value="Age.true.integer.attribute"/>
    <parameter key="129" value="Height.true.integer.attribute"/>
    <parameter key="130" value="Weight.true.integer.attribute"/>
    <parameter key="131" value="BMI.true.numeric.attribute"/>
    <parameter key="132" value="MP.true.nominal.attribute"/>
    <parameter key="133" value="TMD.true.numeric.attribute"/>
    </list>
    </operator>
    <operator activated="true" class="read_excel" compatibility="8.0.001" expanded="true" height="68" name="Read Excel (2)" width="90" x="45" y="136">
    <parameter key="excel_file" value="/Users/GenzerConsulting/Desktop/openface embeds with demographics.xlsx"/>
    <parameter key="sheet_number" value="2"/>
    <parameter key="imported_cell_range" value="A1:EE41"/>
    <parameter key="first_row_as_names" value="false"/>
    <list key="annotations">
    <parameter key="0" value="Name"/>
    </list>
    <list key="data_set_meta_data_information">
    <parameter key="0" value="Embed 1.true.real.attribute"/>
    <parameter key="1" value="Embed 2.true.real.attribute"/>
    <parameter key="2" value="Embed 3.true.real.attribute"/>
    <parameter key="3" value="Embed 4.true.real.attribute"/>
    <parameter key="4" value="Embed 5.true.real.attribute"/>
    <parameter key="5" value="Embed 6.true.real.attribute"/>
    <parameter key="6" value="Embed 7.true.real.attribute"/>
    <parameter key="7" value="Embed 8.true.real.attribute"/>
    <parameter key="8" value="Embed 9.true.real.attribute"/>
    <parameter key="9" value="Embed 10.true.real.attribute"/>
    <parameter key="10" value="Embed 11.true.real.attribute"/>
    <parameter key="11" value="Embed 12.true.real.attribute"/>
    <parameter key="12" value="Embed 13.true.real.attribute"/>
    <parameter key="13" value="Embed 14.true.real.attribute"/>
    <parameter key="14" value="Embed 15.true.real.attribute"/>
    <parameter key="15" value="Embed 16.true.real.attribute"/>
    <parameter key="16" value="Embed 17.true.real.attribute"/>
    <parameter key="17" value="Embed 18.true.real.attribute"/>
    <parameter key="18" value="Embed 19.true.real.attribute"/>
    <parameter key="19" value="Embed 20.true.real.attribute"/>
    <parameter key="20" value="Embed 21.true.real.attribute"/>
    <parameter key="21" value="Embed 22.true.real.attribute"/>
    <parameter key="22" value="Embed 23.true.real.attribute"/>
    <parameter key="23" value="Embed 24.true.real.attribute"/>
    <parameter key="24" value="Embed 25.true.real.attribute"/>
    <parameter key="25" value="Embed 26.true.real.attribute"/>
    <parameter key="26" value="Embed 27.true.real.attribute"/>
    <parameter key="27" value="Embed 28.true.real.attribute"/>
    <parameter key="28" value="Embed 29.true.real.attribute"/>
    <parameter key="29" value="Embed 30.true.real.attribute"/>
    <parameter key="30" value="Embed 31.true.real.attribute"/>
    <parameter key="31" value="Embed 32.true.real.attribute"/>
    <parameter key="32" value="Embed 33.true.real.attribute"/>
    <parameter key="33" value="Embed 34.true.real.attribute"/>
    <parameter key="34" value="Embed 35.true.real.attribute"/>
    <parameter key="35" value="Embed 36.true.real.attribute"/>
    <parameter key="36" value="Embed 37.true.real.attribute"/>
    <parameter key="37" value="Embed 38.true.real.attribute"/>
    <parameter key="38" value="Embed 39.true.real.attribute"/>
    <parameter key="39" value="Embed 40.true.real.attribute"/>
    <parameter key="40" value="Embed 41.true.real.attribute"/>
    <parameter key="41" value="Embed 42.true.real.attribute"/>
    <parameter key="42" value="Embed 43.true.real.attribute"/>
    <parameter key="43" value="Embed 44.true.real.attribute"/>
    <parameter key="44" value="Embed 45.true.real.attribute"/>
    <parameter key="45" value="Embed 46.true.real.attribute"/>
    <parameter key="46" value="Embed 47.true.real.attribute"/>
    <parameter key="47" value="Embed 48.true.real.attribute"/>
    <parameter key="48" value="Embed 49.true.real.attribute"/>
    <parameter key="49" value="Embed 50.true.real.attribute"/>
    <parameter key="50" value="Embed 51.true.real.attribute"/>
    <parameter key="51" value="Embed 52.true.real.attribute"/>
    <parameter key="52" value="Embed 53.true.real.attribute"/>
    <parameter key="53" value="Embed 54.true.real.attribute"/>
    <parameter key="54" value="Embed 55.true.real.attribute"/>
    <parameter key="55" value="Embed 56.true.real.attribute"/>
    <parameter key="56" value="Embed 57.true.real.attribute"/>
    <parameter key="57" value="Embed 58.true.real.attribute"/>
    <parameter key="58" value="Embed 59.true.real.attribute"/>
    <parameter key="59" value="Embed 60.true.real.attribute"/>
    <parameter key="60" value="Embed 61.true.real.attribute"/>
    <parameter key="61" value="Embed 62.true.real.attribute"/>
    <parameter key="62" value="Embed 63.true.real.attribute"/>
    <parameter key="63" value="Embed 64.true.real.attribute"/>
    <parameter key="64" value="Embed 65.true.real.attribute"/>
    <parameter key="65" value="Embed 66.true.real.attribute"/>
    <parameter key="66" value="Embed 67.true.real.attribute"/>
    <parameter key="67" value="Embed 68.true.real.attribute"/>
    <parameter key="68" value="Embed 69.true.real.attribute"/>
    <parameter key="69" value="Embed 70.true.real.attribute"/>
    <parameter key="70" value="Embed 71.true.real.attribute"/>
    <parameter key="71" value="Embed 72.true.real.attribute"/>
    <parameter key="72" value="Embed 73.true.real.attribute"/>
    <parameter key="73" value="Embed 74.true.real.attribute"/>
    <parameter key="74" value="Embed 75.true.real.attribute"/>
    <parameter key="75" value="Embed 76.true.real.attribute"/>
    <parameter key="76" value="Embed 77.true.real.attribute"/>
    <parameter key="77" value="Embed 78.true.real.attribute"/>
    <parameter key="78" value="Embed 79.true.real.attribute"/>
    <parameter key="79" value="Embed 80.true.real.attribute"/>
    <parameter key="80" value="Embed 81.true.real.attribute"/>
    <parameter key="81" value="Embed 82.true.real.attribute"/>
    <parameter key="82" value="Embed 83.true.real.attribute"/>
    <parameter key="83" value="Embed 84.true.real.attribute"/>
    <parameter key="84" value="Embed 85.true.real.attribute"/>
    <parameter key="85" value="Embed 86.true.real.attribute"/>
    <parameter key="86" value="Embed 87.true.real.attribute"/>
    <parameter key="87" value="Embed 88.true.real.attribute"/>
    <parameter key="88" value="Embed 89.true.real.attribute"/>
    <parameter key="89" value="Embed 90.true.real.attribute"/>
    <parameter key="90" value="Embed 91.true.real.attribute"/>
    <parameter key="91" value="Embed 92.true.real.attribute"/>
    <parameter key="92" value="Embed 93.true.real.attribute"/>
    <parameter key="93" value="Embed 94.true.real.attribute"/>
    <parameter key="94" value="Embed 95.true.real.attribute"/>
    <parameter key="95" value="Embed 96.true.real.attribute"/>
    <parameter key="96" value="Embed 97.true.real.attribute"/>
    <parameter key="97" value="Embed 98.true.real.attribute"/>
    <parameter key="98" value="Embed 99.true.real.attribute"/>
    <parameter key="99" value="Embed 100.true.real.attribute"/>
    <parameter key="100" value="Embed 101.true.real.attribute"/>
    <parameter key="101" value="Embed 102.true.real.attribute"/>
    <parameter key="102" value="Embed 103.true.real.attribute"/>
    <parameter key="103" value="Embed 104.true.real.attribute"/>
    <parameter key="104" value="Embed 105.true.real.attribute"/>
    <parameter key="105" value="Embed 106.true.real.attribute"/>
    <parameter key="106" value="Embed 107.true.real.attribute"/>
    <parameter key="107" value="Embed 108.true.real.attribute"/>
    <parameter key="108" value="Embed 109.true.real.attribute"/>
    <parameter key="109" value="Embed 110.true.real.attribute"/>
    <parameter key="110" value="Embed 111.true.real.attribute"/>
    <parameter key="111" value="Embed 112.true.real.attribute"/>
    <parameter key="112" value="Embed 113.true.real.attribute"/>
    <parameter key="113" value="Embed 114.true.real.attribute"/>
    <parameter key="114" value="Embed 115.true.real.attribute"/>
    <parameter key="115" value="Embed 116.true.real.attribute"/>
    <parameter key="116" value="Embed 117.true.real.attribute"/>
    <parameter key="117" value="Embed 118.true.real.attribute"/>
    <parameter key="118" value="Embed 119.true.real.attribute"/>
    <parameter key="119" value="Embed 120.true.real.attribute"/>
    <parameter key="120" value="Embed 121.true.real.attribute"/>
    <parameter key="121" value="Embed 122.true.real.attribute"/>
    <parameter key="122" value="Embed 123.true.real.attribute"/>
    <parameter key="123" value="Embed 124.true.real.attribute"/>
    <parameter key="124" value="Embed 125.true.real.attribute"/>
    <parameter key="125" value="Embed 126.true.real.attribute"/>
    <parameter key="126" value="Embed 127.true.real.attribute"/>
    <parameter key="127" value="Embed 128.true.real.attribute"/>
    <parameter key="128" value="Class.true.nominal.attribute"/>
    <parameter key="129" value="Age.true.integer.attribute"/>
    <parameter key="130" value="Height.true.integer.attribute"/>
    <parameter key="131" value="Weight.true.numeric.attribute"/>
    <parameter key="132" value="BMI.true.real.attribute"/>
    <parameter key="133" value="MP.true.nominal.attribute"/>
    <parameter key="134" value="TMD.true.numeric.attribute"/>
    </list>
    </operator>
    <operator activated="true" class="set_role" compatibility="8.0.001" expanded="true" height="82" name="Set Role" width="90" x="179" y="136">
    <parameter key="attribute_name" value="Class"/>
    <parameter key="target_role" value="label"/>
    <list key="set_additional_roles"/>
    </operator>
    <operator activated="true" class="concurrency:cross_validation" compatibility="8.0.001" expanded="true" height="145" name="Cross Validation" width="90" x="313" y="136">
    <process expanded="true">
    <operator activated="true" class="h2o:deep_learning" compatibility="7.6.001" expanded="true" height="82" name="Deep Learning" width="90" x="45" y="34">
    <enumeration key="hidden_layer_sizes">
    <parameter key="hidden_layer_sizes" value="50"/>
    <parameter key="hidden_layer_sizes" value="50"/>
    </enumeration>
    <enumeration key="hidden_dropout_ratios"/>
    <list key="expert_parameters"/>
    <list key="expert_parameters_"/>
    </operator>
    <connect from_port="training set" to_op="Deep Learning" to_port="training set"/>
    <connect from_op="Deep Learning" from_port="model" to_port="model"/>
    <portSpacing port="source_training set" spacing="0"/>
    <portSpacing port="sink_model" spacing="0"/>
    <portSpacing port="sink_through 1" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="apply_model" compatibility="8.0.001" expanded="true" height="82" name="Apply Model (2)" width="90" x="112" y="34">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="performance_classification" compatibility="8.0.001" expanded="true" height="82" name="Performance" width="90" x="514" y="34">
    <list key="class_weights"/>
    </operator>
    <connect from_port="model" to_op="Apply Model (2)" to_port="model"/>
    <connect from_port="test set" to_op="Apply Model (2)" to_port="unlabelled data"/>
    <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
    <connect from_op="Performance" from_port="performance" to_port="performance 1"/>
    <portSpacing port="source_model" spacing="0"/>
    <portSpacing port="source_test set" spacing="0"/>
    <portSpacing port="source_through 1" spacing="0"/>
    <portSpacing port="sink_test set results" spacing="0"/>
    <portSpacing port="sink_performance 1" spacing="0"/>
    <portSpacing port="sink_performance 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="apply_model" compatibility="8.0.001" expanded="true" height="82" name="Apply Model" width="90" x="514" y="34">
    <list key="application_parameters"/>
    </operator>
    <connect from_op="Read Excel" from_port="output" to_op="Apply Model" to_port="unlabelled data"/>
    <connect from_op="Read Excel (2)" from_port="output" to_op="Set Role" to_port="example set input"/>
    <connect from_op="Set Role" from_port="example set output" to_op="Cross Validation" to_port="example set"/>
    <connect from_op="Cross Validation" from_port="model" to_op="Apply Model" to_port="model"/>
    <connect from_op="Cross Validation" from_port="performance 1" to_port="result 2"/>
    <connect from_op="Apply Model" from_port="model" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    </process>
    </operator>
    </process>

    Scott

     

  • bsegal Member Posts: 7 Contributor II

    Thank you again.  Sorry to persist, but I'm really trying to learn how to do this correctly.  When I look at the results under Performance Vector, they appear to show how well the model fit the training set in the cross validation step, rather than how it performed on the separate validation data (tab 3 in the Excel file).  What I am trying to do is train the model on one set of data and then validate it (instead of using an n-fold cross validation) on another set of data that was not used to train the model.

     

    A little background if this helps:  the first 128 columns are the output of an open source deep neural net that analyzes facial photographs and outputs these embeddings.  We are adding a few fields of data describing each subject, collected as part of a medical research study.  The outcome is difficulty in performing a medical procedure known as endotracheal intubation.  We're trying to predict difficulty based on facial appearance.  So I'm hoping to train the model on a bunch of easy and hard cases, and then test the model on a different set of data.
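For readers following along outside RapidMiner, the comparison being asked for here comes down to lining up the actual labels with the predicted ones and counting agreements. Below is a minimal Python sketch of that idea, not the RapidMiner implementation. The function names `confusion_counts` and `accuracy` and the six example labels are invented for illustration and do not come from the attached data.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return Counter(zip(actual, predicted))


def accuracy(actual, predicted):
    """Fraction of rows where the prediction matches the actual label."""
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)


# Hypothetical labels, made up for this example only.
actual = ["hard", "easy", "easy", "hard", "easy", "hard"]
predicted = ["hard", "easy", "hard", "hard", "easy", "easy"]

counts = confusion_counts(actual, predicted)
acc = accuracy(actual, predicted)

for (a, p), n in sorted(counts.items()):
    print(f"actual={a:<4} predicted={p:<4} count={n}")
print(f"accuracy = {acc:.2f}")
```

The key requirement is the same in any tool: the rows being scored must still carry their true labels, otherwise there is nothing to compare the predictions against.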

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member Posts: 1,968  Community Manager

    hi @bsegal - no problem at all.  Thanks for the background as it always helps to understand the use case. So the reason I'm not showing the performance on the validation set is exactly because it's unlabeled (hard/easy). How would we be able to measure the performance of a model if we don't know what we are comparing it against?

    So another way to skin this cat would be to split the training set (usually we use 80/20) and create a model with the larger piece (with cross-validation) and then test the performance with the remaining smaller piece. Then, once we're satisfied with the performance of the model, we can apply the model on unlabeled data to make informed, probabilistic predictions - the "validation set" tab in your case.
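As a rough illustration of the split described above, here is a minimal Python sketch of an 80/20 holdout split. It is not what RapidMiner's Split Data operator does internally. Splitting per class (stratifying) so that both pieces keep the same hard/easy mix is an assumption added here, and the function name `stratified_split`, the seed, and the 80 synthetic rows are hypothetical.

```python
import random


def stratified_split(rows, label_of, train_fraction=0.8, seed=42):
    """Split rows into train/test pieces, preserving the class mix in each."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)

    train, test = [], []
    for members in by_class.values():
        members = members[:]          # copy so the caller's data is untouched
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train.extend(members[:cut])   # larger piece: build the model here
        test.extend(members[cut:])    # smaller piece: held back for testing
    return train, test


# Synthetic stand-in data: 80 rows, half "hard" and half "easy".
rows = [(i, "hard" if i % 2 == 0 else "easy") for i in range(80)]
train, test = stratified_split(rows, label_of=lambda r: r[1])

print(len(train), "training rows,", len(test), "held-out rows")
```

The held-out piece keeps its labels, so the model's predictions on it can be scored directly.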

     

    Does this help at all?

     

    Scott

     

    [EDIT: on a side note, you have fewer than 100 rows of data in your training set, which is almost certainly too few to train any kind of decent model. Hopefully you have more data hiding somewhere?]

  • bsegal Member Posts: 7 Contributor II

    Scott, thanks for your patience.  So yes, of course, we do have the actual results for the validation step.  If you look at tab 1 of the Excel file, I have all 80 cases, but I had manually stripped out the actual labels before sending the data to the apply model block.  We arbitrarily divided the cases 40/40 in our original study.  In that study we used a supervised facial analysis model that required human intervention to jump start the fitting; here we are trying to skip the human intervention.

     

    So it sounds like you would not recommend what I'm doing (a 50/50 split of the cases, with one half for training and one for testing) but would rather combine all 80 cases and use a 10-fold cross validation step instead?

     

    But even with the risk of overfitting, is it possible to program RapidMiner to do what we were trying (the half and half split)?
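To make the two options concrete, here is a small Python sketch of how a 10-fold cross validation would carve up 80 cases: each case is used for testing exactly once and for training nine times. This is only an illustration of the general technique, not RapidMiner's Cross Validation operator. The function name `k_fold_indices` is hypothetical, and the folds are taken by simple striding without shuffling, which is a simplifying assumption.

```python
def k_fold_indices(n_rows, k=10):
    """Yield (train_indices, test_indices) for each of k folds."""
    # Fold i holds rows i, i + k, i + 2k, ... (no shuffling in this sketch).
    folds = [list(range(i, n_rows, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


splits = list(k_fold_indices(80, k=10))

for fold_number, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"fold {fold_number}: train on {len(train_idx)}, test on {len(test_idx)}")
```

With 80 cases, each fold trains on 72 and tests on 8, whereas the half-and-half split trains on 40 and tests on 40 a single time.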

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member Posts: 1,968  Community Manager

    ah ok! Silly me - should have looked at the first tab and read your query better.  My apologies.  Sometimes I jump before looking...

     

    So yes, my feeling is you're at risk of overfitting with so little data - particularly with an algorithm like Deep Learning, which is prone to overfitting in general. I like the selection of DL as a model when looking at sets like yours due to its inherent feature selection properties, but I think you're using a tool that does not fit your current data resources.  For initial model selection, I always recommend using Ingo's amazing mod.rapidminer.com. If I insert your information, I get Decision Tree and Naive Bayes as models that will most likely serve your purposes better.

     

    Here's what I would try:

     

    <?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" breakpoints="after" class="read_excel" compatibility="8.0.001" expanded="true" height="68" name="Read Excel (2)" width="90" x="45" y="136">
    <parameter key="excel_file" value="/Users/GenzerConsulting/Desktop/openface embeds with demographics.xlsx"/>
    <parameter key="imported_cell_range" value="A1:EE81"/>
    <parameter key="first_row_as_names" value="false"/>
    <list key="annotations">
    <parameter key="0" value="Name"/>
    </list>
    <list key="data_set_meta_data_information">
    <parameter key="0" value="MRN.true.integer.id"/>
    <parameter key="1" value="1\.000.true.real.attribute"/>
    <parameter key="2" value="2\.000.true.real.attribute"/>
    <parameter key="3" value="3\.000.true.real.attribute"/>
    <parameter key="4" value="4\.000.true.real.attribute"/>
    <parameter key="5" value="5\.000.true.real.attribute"/>
    <parameter key="6" value="6\.000.true.real.attribute"/>
    <parameter key="7" value="7\.000.true.real.attribute"/>
    <parameter key="8" value="8\.000.true.real.attribute"/>
    <parameter key="9" value="9\.000.true.real.attribute"/>
    <parameter key="10" value="10\.000.true.real.attribute"/>
    <parameter key="11" value="11\.000.true.real.attribute"/>
    <parameter key="12" value="12\.000.true.real.attribute"/>
    <parameter key="13" value="13\.000.true.real.attribute"/>
    <parameter key="14" value="14\.000.true.real.attribute"/>
    <parameter key="15" value="15\.000.true.real.attribute"/>
    <parameter key="16" value="16\.000.true.real.attribute"/>
    <parameter key="17" value="17\.000.true.real.attribute"/>
    <parameter key="18" value="18\.000.true.real.attribute"/>
    <parameter key="19" value="19\.000.true.real.attribute"/>
    <parameter key="20" value="20\.000.true.real.attribute"/>
    <parameter key="21" value="21\.000.true.real.attribute"/>
    <parameter key="22" value="22\.000.true.real.attribute"/>
    <parameter key="23" value="23\.000.true.real.attribute"/>
    <parameter key="24" value="24\.000.true.real.attribute"/>
    <parameter key="25" value="25\.000.true.real.attribute"/>
    <parameter key="26" value="26\.000.true.real.attribute"/>
    <parameter key="27" value="27\.000.true.real.attribute"/>
    <parameter key="28" value="28\.000.true.real.attribute"/>
    <parameter key="29" value="29\.000.true.real.attribute"/>
    <parameter key="30" value="30\.000.true.real.attribute"/>
    <parameter key="31" value="31\.000.true.real.attribute"/>
    <parameter key="32" value="32\.000.true.real.attribute"/>
    <parameter key="33" value="33\.000.true.real.attribute"/>
    <parameter key="34" value="34\.000.true.real.attribute"/>
    <parameter key="35" value="35\.000.true.real.attribute"/>
    <parameter key="36" value="36\.000.true.real.attribute"/>
    <parameter key="37" value="37\.000.true.real.attribute"/>
    <parameter key="38" value="38\.000.true.real.attribute"/>
    <parameter key="39" value="39\.000.true.real.attribute"/>
    <parameter key="40" value="40\.000.true.real.attribute"/>
    <parameter key="41" value="41\.000.true.real.attribute"/>
    <parameter key="42" value="42\.000.true.real.attribute"/>
    <parameter key="43" value="43\.000.true.real.attribute"/>
    <parameter key="44" value="44\.000.true.real.attribute"/>
    <parameter key="45" value="45\.000.true.real.attribute"/>
    <parameter key="46" value="46\.000.true.real.attribute"/>
    <parameter key="47" value="47\.000.true.real.attribute"/>
    <parameter key="48" value="48\.000.true.real.attribute"/>
    <parameter key="49" value="49\.000.true.real.attribute"/>
    <parameter key="50" value="50\.000.true.real.attribute"/>
    <parameter key="51" value="51\.000.true.real.attribute"/>
    <parameter key="52" value="52\.000.true.real.attribute"/>
    <parameter key="53" value="53\.000.true.real.attribute"/>
    <parameter key="54" value="54\.000.true.real.attribute"/>
    <parameter key="55" value="55\.000.true.real.attribute"/>
    <parameter key="56" value="56\.000.true.real.attribute"/>
    <parameter key="57" value="57\.000.true.real.attribute"/>
    <parameter key="58" value="58\.000.true.real.attribute"/>
    <parameter key="59" value="59\.000.true.real.attribute"/>
    <parameter key="60" value="60\.000.true.real.attribute"/>
    <parameter key="61" value="61\.000.true.real.attribute"/>
    <parameter key="62" value="62\.000.true.real.attribute"/>
    <parameter key="63" value="63\.000.true.real.attribute"/>
    <parameter key="64" value="64\.000.true.real.attribute"/>
    <parameter key="65" value="65\.000.true.real.attribute"/>
    <parameter key="66" value="66\.000.true.real.attribute"/>
    <parameter key="67" value="67\.000.true.real.attribute"/>
    <parameter key="68" value="68\.000.true.real.attribute"/>
    <parameter key="69" value="69\.000.true.real.attribute"/>
    <parameter key="70" value="70\.000.true.real.attribute"/>
    <parameter key="71" value="71\.000.true.real.attribute"/>
    <parameter key="72" value="72\.000.true.real.attribute"/>
    <parameter key="73" value="73\.000.true.real.attribute"/>
    <parameter key="74" value="74\.000.true.real.attribute"/>
    <parameter key="75" value="75\.000.true.real.attribute"/>
    <parameter key="76" value="76\.000.true.real.attribute"/>
    <parameter key="77" value="77\.000.true.real.attribute"/>
    <parameter key="78" value="78\.000.true.real.attribute"/>
    <parameter key="79" value="79\.000.true.real.attribute"/>
    <parameter key="80" value="80\.000.true.real.attribute"/>
    <parameter key="81" value="81\.000.true.real.attribute"/>
    <parameter key="82" value="82\.000.true.real.attribute"/>
    <parameter key="83" value="83\.000.true.real.attribute"/>
    <parameter key="84" value="84\.000.true.real.attribute"/>
    <parameter key="85" value="85\.000.true.real.attribute"/>
    <parameter key="86" value="86\.000.true.real.attribute"/>
    <parameter key="87" value="87\.000.true.real.attribute"/>
    <parameter key="88" value="88\.000.true.real.attribute"/>
    <parameter key="89" value="89\.000.true.real.attribute"/>
    <parameter key="90" value="90\.000.true.real.attribute"/>
    <parameter key="91" value="91\.000.true.real.attribute"/>
    <parameter key="92" value="92\.000.true.real.attribute"/>
    <parameter key="93" value="93\.000.true.real.attribute"/>
    <parameter key="94" value="94\.000.true.real.attribute"/>
    <parameter key="95" value="95\.000.true.real.attribute"/>
    <parameter key="96" value="96\.000.true.real.attribute"/>
    <parameter key="97" value="97\.000.true.real.attribute"/>
    <parameter key="98" value="98\.000.true.real.attribute"/>
    <parameter key="99" value="99\.000.true.real.attribute"/>
    <parameter key="100" value="100\.000.true.real.attribute"/>
    <parameter key="101" value="101\.000.true.real.attribute"/>
    <parameter key="102" value="102\.000.true.real.attribute"/>
    <parameter key="103" value="103\.000.true.real.attribute"/>
    <parameter key="104" value="104\.000.true.real.attribute"/>
    <parameter key="105" value="105\.000.true.real.attribute"/>
    <parameter key="106" value="106\.000.true.real.attribute"/>
    <parameter key="107" value="107\.000.true.real.attribute"/>
    <parameter key="108" value="108\.000.true.real.attribute"/>
    <parameter key="109" value="109\.000.true.real.attribute"/>
    <parameter key="110" value="110\.000.true.real.attribute"/>
    <parameter key="111" value="111\.000.true.real.attribute"/>
    <parameter key="112" value="112\.000.true.real.attribute"/>
    <parameter key="113" value="113\.000.true.real.attribute"/>
    <parameter key="114" value="114\.000.true.real.attribute"/>
    <parameter key="115" value="115\.000.true.real.attribute"/>
    <parameter key="116" value="116\.000.true.real.attribute"/>
    <parameter key="117" value="117\.000.true.real.attribute"/>
    <parameter key="118" value="118\.000.true.real.attribute"/>
    <parameter key="119" value="119\.000.true.real.attribute"/>
    <parameter key="120" value="120\.000.true.real.attribute"/>
    <parameter key="121" value="121\.000.true.real.attribute"/>
    <parameter key="122" value="122\.000.true.real.attribute"/>
    <parameter key="123" value="123\.000.true.real.attribute"/>
    <parameter key="124" value="124\.000.true.real.attribute"/>
    <parameter key="125" value="125\.000.true.real.attribute"/>
    <parameter key="126" value="126\.000.true.real.attribute"/>
    <parameter key="127" value="127\.000.true.real.attribute"/>
    <parameter key="128" value="128\.000.true.real.attribute"/>
    <parameter key="129" value="Class.true.binominal.label"/>
    <parameter key="130" value="Age.true.integer.attribute"/>
    <parameter key="131" value="Height.true.integer.attribute"/>
    <parameter key="132" value="Weight.true.numeric.attribute"/>
    <parameter key="133" value="BMI.true.real.attribute"/>
    <parameter key="134" value="MP.true.integer.attribute"/>
    </list>
    </operator>
    <operator activated="true" class="split_data" compatibility="8.0.001" expanded="true" height="103" name="Split Data" width="90" x="179" y="340">
    <enumeration key="partitions">
    <parameter key="ratio" value="0.5"/>
    <parameter key="ratio" value="0.5"/>
    </enumeration>
    </operator>
    <operator activated="true" class="concurrency:cross_validation" compatibility="8.0.001" expanded="true" height="145" name="Cross Validation" width="90" x="380" y="136">
    <process expanded="true">
    <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="8.0.001" expanded="true" height="103" name="Decision Tree" width="90" x="112" y="34"/>
    <connect from_port="training set" to_op="Decision Tree" to_port="training set"/>
    <connect from_op="Decision Tree" from_port="model" to_port="model"/>
    <portSpacing port="source_training set" spacing="0"/>
    <portSpacing port="sink_model" spacing="0"/>
    <portSpacing port="sink_through 1" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="apply_model" compatibility="8.0.001" expanded="true" height="82" name="Apply Model (2)" width="90" x="112" y="34">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="performance_classification" compatibility="8.0.001" expanded="true" height="82" name="Performance" width="90" x="514" y="34">
    <list key="class_weights"/>
    </operator>
    <connect from_port="model" to_op="Apply Model (2)" to_port="model"/>
    <connect from_port="test set" to_op="Apply Model (2)" to_port="unlabelled data"/>
    <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
    <connect from_op="Performance" from_port="performance" to_port="performance 1"/>
    <portSpacing port="source_model" spacing="0"/>
    <portSpacing port="source_test set" spacing="0"/>
    <portSpacing port="source_through 1" spacing="0"/>
    <portSpacing port="sink_test set results" spacing="0"/>
    <portSpacing port="sink_performance 1" spacing="0"/>
    <portSpacing port="sink_performance 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="apply_model" compatibility="8.0.001" expanded="true" height="82" name="Apply Model" width="90" x="514" y="340">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="performance_classification" compatibility="8.0.001" expanded="true" height="82" name="Performance (2)" width="90" x="648" y="289">
    <list key="class_weights"/>
    </operator>
    <connect from_op="Read Excel (2)" from_port="output" to_op="Split Data" to_port="example set"/>
    <connect from_op="Split Data" from_port="partition 1" to_op="Cross Validation" to_port="example set"/>
    <connect from_op="Split Data" from_port="partition 2" to_op="Apply Model" to_port="unlabelled data"/>
    <connect from_op="Cross Validation" from_port="model" to_op="Apply Model" to_port="model"/>
    <connect from_op="Apply Model" from_port="labelled data" to_op="Performance (2)" to_port="labelled data"/>
    <connect from_op="Apply Model" from_port="model" to_port="result 2"/>
    <connect from_op="Performance (2)" from_port="performance" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    </process>
    </operator>
    </process>

    Scott

     

  • sgenzersgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member Posts: 1,968  Community Manager

    EDIT - just to be clear, what I'm showing you is a process to use once you have more data. With only your 80 rows you can, yes, try putting everything into a 10-fold cross-validation, but again you're not going to get very good results. 49% accuracy with two balanced classes is worse than flipping a coin.
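
For a rough sense of how little 80 rows can tell you, here is a minimal Python sketch (outside RapidMiner, standard library only) that puts an approximate 95% interval around a measured accuracy using the normal approximation to the binomial. The 0.49 and 80 are the figures from this thread; the function name `accuracy_interval` is made up for illustration, and the approximation itself is an assumption that is only reasonable for accuracies well away from 0 or 1.

```python
import math

def accuracy_interval(accuracy, n_rows, z=1.96):
    """Approximate 95% interval for an accuracy measured on n_rows examples.

    Uses the normal approximation to the binomial; z=1.96 corresponds to 95%.
    """
    std_err = math.sqrt(accuracy * (1.0 - accuracy) / n_rows)
    return accuracy - z * std_err, accuracy + z * std_err

# Figures from this thread: 49% accuracy measured on 80 rows.
low, high = accuracy_interval(0.49, 80)
print(f"accuracy 0.49 on 80 rows: roughly {low:.2f} to {high:.2f}")
```

The interval comes out at roughly 0.38 to 0.60, so it comfortably contains 0.50: on this little data the model cannot be told apart from a coin flip.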

     

    <?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" breakpoints="after" class="read_excel" compatibility="8.0.001" expanded="true" height="68" name="Read Excel (2)" width="90" x="45" y="136">
    <parameter key="excel_file" value="/Users/GenzerConsulting/Desktop/openface embeds with demographics.xlsx"/>
    <parameter key="imported_cell_range" value="A1:EE81"/>
    <parameter key="first_row_as_names" value="false"/>
    <list key="annotations">
    <parameter key="0" value="Name"/>
    </list>
    <list key="data_set_meta_data_information">
    <parameter key="0" value="MRN.true.integer.id"/>
    <parameter key="1" value="1\.000.true.real.attribute"/>
    <parameter key="2" value="2\.000.true.real.attribute"/>
    <parameter key="3" value="3\.000.true.real.attribute"/>
    <parameter key="4" value="4\.000.true.real.attribute"/>
    <parameter key="5" value="5\.000.true.real.attribute"/>
    <parameter key="6" value="6\.000.true.real.attribute"/>
    <parameter key="7" value="7\.000.true.real.attribute"/>
    <parameter key="8" value="8\.000.true.real.attribute"/>
    <parameter key="9" value="9\.000.true.real.attribute"/>
    <parameter key="10" value="10\.000.true.real.attribute"/>
    <parameter key="11" value="11\.000.true.real.attribute"/>
    <parameter key="12" value="12\.000.true.real.attribute"/>
    <parameter key="13" value="13\.000.true.real.attribute"/>
    <parameter key="14" value="14\.000.true.real.attribute"/>
    <parameter key="15" value="15\.000.true.real.attribute"/>
    <parameter key="16" value="16\.000.true.real.attribute"/>
    <parameter key="17" value="17\.000.true.real.attribute"/>
    <parameter key="18" value="18\.000.true.real.attribute"/>
    <parameter key="19" value="19\.000.true.real.attribute"/>
    <parameter key="20" value="20\.000.true.real.attribute"/>
    <parameter key="21" value="21\.000.true.real.attribute"/>
    <parameter key="22" value="22\.000.true.real.attribute"/>
    <parameter key="23" value="23\.000.true.real.attribute"/>
    <parameter key="24" value="24\.000.true.real.attribute"/>
    <parameter key="25" value="25\.000.true.real.attribute"/>
    <parameter key="26" value="26\.000.true.real.attribute"/>
    <parameter key="27" value="27\.000.true.real.attribute"/>
    <parameter key="28" value="28\.000.true.real.attribute"/>
    <parameter key="29" value="29\.000.true.real.attribute"/>
    <parameter key="30" value="30\.000.true.real.attribute"/>
    <parameter key="31" value="31\.000.true.real.attribute"/>
    <parameter key="32" value="32\.000.true.real.attribute"/>
    <parameter key="33" value="33\.000.true.real.attribute"/>
    <parameter key="34" value="34\.000.true.real.attribute"/>
    <parameter key="35" value="35\.000.true.real.attribute"/>
    <parameter key="36" value="36\.000.true.real.attribute"/>
    <parameter key="37" value="37\.000.true.real.attribute"/>
    <parameter key="38" value="38\.000.true.real.attribute"/>
    <parameter key="39" value="39\.000.true.real.attribute"/>
    <parameter key="40" value="40\.000.true.real.attribute"/>
    <parameter key="41" value="41\.000.true.real.attribute"/>
    <parameter key="42" value="42\.000.true.real.attribute"/>
    <parameter key="43" value="43\.000.true.real.attribute"/>
    <parameter key="44" value="44\.000.true.real.attribute"/>
    <parameter key="45" value="45\.000.true.real.attribute"/>
    <parameter key="46" value="46\.000.true.real.attribute"/>
    <parameter key="47" value="47\.000.true.real.attribute"/>
    <parameter key="48" value="48\.000.true.real.attribute"/>
    <parameter key="49" value="49\.000.true.real.attribute"/>
    <parameter key="50" value="50\.000.true.real.attribute"/>
    <parameter key="51" value="51\.000.true.real.attribute"/>
    <parameter key="52" value="52\.000.true.real.attribute"/>
    <parameter key="53" value="53\.000.true.real.attribute"/>
    <parameter key="54" value="54\.000.true.real.attribute"/>
    <parameter key="55" value="55\.000.true.real.attribute"/>
    <parameter key="56" value="56\.000.true.real.attribute"/>
    <parameter key="57" value="57\.000.true.real.attribute"/>
    <parameter key="58" value="58\.000.true.real.attribute"/>
    <parameter key="59" value="59\.000.true.real.attribute"/>
    <parameter key="60" value="60\.000.true.real.attribute"/>
    <parameter key="61" value="61\.000.true.real.attribute"/>
    <parameter key="62" value="62\.000.true.real.attribute"/>
    <parameter key="63" value="63\.000.true.real.attribute"/>
    <parameter key="64" value="64\.000.true.real.attribute"/>
    <parameter key="65" value="65\.000.true.real.attribute"/>
    <parameter key="66" value="66\.000.true.real.attribute"/>
    <parameter key="67" value="67\.000.true.real.attribute"/>
    <parameter key="68" value="68\.000.true.real.attribute"/>
    <parameter key="69" value="69\.000.true.real.attribute"/>
    <parameter key="70" value="70\.000.true.real.attribute"/>
    <parameter key="71" value="71\.000.true.real.attribute"/>
    <parameter key="72" value="72\.000.true.real.attribute"/>
    <parameter key="73" value="73\.000.true.real.attribute"/>
    <parameter key="74" value="74\.000.true.real.attribute"/>
    <parameter key="75" value="75\.000.true.real.attribute"/>
    <parameter key="76" value="76\.000.true.real.attribute"/>
    <parameter key="77" value="77\.000.true.real.attribute"/>
    <parameter key="78" value="78\.000.true.real.attribute"/>
    <parameter key="79" value="79\.000.true.real.attribute"/>
    <parameter key="80" value="80\.000.true.real.attribute"/>
    <parameter key="81" value="81\.000.true.real.attribute"/>
    <parameter key="82" value="82\.000.true.real.attribute"/>
    <parameter key="83" value="83\.000.true.real.attribute"/>
    <parameter key="84" value="84\.000.true.real.attribute"/>
    <parameter key="85" value="85\.000.true.real.attribute"/>
    <parameter key="86" value="86\.000.true.real.attribute"/>
    <parameter key="87" value="87\.000.true.real.attribute"/>
    <parameter key="88" value="88\.000.true.real.attribute"/>
    <parameter key="89" value="89\.000.true.real.attribute"/>
    <parameter key="90" value="90\.000.true.real.attribute"/>
    <parameter key="91" value="91\.000.true.real.attribute"/>
    <parameter key="92" value="92\.000.true.real.attribute"/>
    <parameter key="93" value="93\.000.true.real.attribute"/>
    <parameter key="94" value="94\.000.true.real.attribute"/>
    <parameter key="95" value="95\.000.true.real.attribute"/>
    <parameter key="96" value="96\.000.true.real.attribute"/>
    <parameter key="97" value="97\.000.true.real.attribute"/>
    <parameter key="98" value="98\.000.true.real.attribute"/>
    <parameter key="99" value="99\.000.true.real.attribute"/>
    <parameter key="100" value="100\.000.true.real.attribute"/>
    <parameter key="101" value="101\.000.true.real.attribute"/>
    <parameter key="102" value="102\.000.true.real.attribute"/>
    <parameter key="103" value="103\.000.true.real.attribute"/>
    <parameter key="104" value="104\.000.true.real.attribute"/>
    <parameter key="105" value="105\.000.true.real.attribute"/>
    <parameter key="106" value="106\.000.true.real.attribute"/>
    <parameter key="107" value="107\.000.true.real.attribute"/>
    <parameter key="108" value="108\.000.true.real.attribute"/>
    <parameter key="109" value="109\.000.true.real.attribute"/>
    <parameter key="110" value="110\.000.true.real.attribute"/>
    <parameter key="111" value="111\.000.true.real.attribute"/>
    <parameter key="112" value="112\.000.true.real.attribute"/>
    <parameter key="113" value="113\.000.true.real.attribute"/>
    <parameter key="114" value="114\.000.true.real.attribute"/>
    <parameter key="115" value="115\.000.true.real.attribute"/>
    <parameter key="116" value="116\.000.true.real.attribute"/>
    <parameter key="117" value="117\.000.true.real.attribute"/>
    <parameter key="118" value="118\.000.true.real.attribute"/>
    <parameter key="119" value="119\.000.true.real.attribute"/>
    <parameter key="120" value="120\.000.true.real.attribute"/>
    <parameter key="121" value="121\.000.true.real.attribute"/>
    <parameter key="122" value="122\.000.true.real.attribute"/>
    <parameter key="123" value="123\.000.true.real.attribute"/>
    <parameter key="124" value="124\.000.true.real.attribute"/>
    <parameter key="125" value="125\.000.true.real.attribute"/>
    <parameter key="126" value="126\.000.true.real.attribute"/>
    <parameter key="127" value="127\.000.true.real.attribute"/>
    <parameter key="128" value="128\.000.true.real.attribute"/>
    <parameter key="129" value="Class.true.binominal.label"/>
    <parameter key="130" value="Age.true.integer.attribute"/>
    <parameter key="131" value="Height.true.integer.attribute"/>
    <parameter key="132" value="Weight.true.numeric.attribute"/>
    <parameter key="133" value="BMI.true.real.attribute"/>
    <parameter key="134" value="MP.true.integer.attribute"/>
    </list>
    </operator>
    <operator activated="true" class="concurrency:cross_validation" compatibility="8.0.001" expanded="true" height="145" name="Cross Validation" width="90" x="380" y="136">
    <process expanded="true">
    <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="8.0.001" expanded="true" height="103" name="Decision Tree" width="90" x="112" y="34"/>
    <connect from_port="training set" to_op="Decision Tree" to_port="training set"/>
    <connect from_op="Decision Tree" from_port="model" to_port="model"/>
    <portSpacing port="source_training set" spacing="0"/>
    <portSpacing port="sink_model" spacing="0"/>
    <portSpacing port="sink_through 1" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="apply_model" compatibility="8.0.001" expanded="true" height="82" name="Apply Model (2)" width="90" x="112" y="34">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="performance_classification" compatibility="8.0.001" expanded="true" height="82" name="Performance" width="90" x="514" y="34">
    <list key="class_weights"/>
    </operator>
    <connect from_port="model" to_op="Apply Model (2)" to_port="model"/>
    <connect from_port="test set" to_op="Apply Model (2)" to_port="unlabelled data"/>
    <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
    <connect from_op="Performance" from_port="performance" to_port="performance 1"/>
    <portSpacing port="source_model" spacing="0"/>
    <portSpacing port="source_test set" spacing="0"/>
    <portSpacing port="source_through 1" spacing="0"/>
    <portSpacing port="sink_test set results" spacing="0"/>
    <portSpacing port="sink_performance 1" spacing="0"/>
    <portSpacing port="sink_performance 2" spacing="0"/>
    </process>
    </operator>
    <connect from_op="Read Excel (2)" from_port="output" to_op="Cross Validation" to_port="example set"/>
    <connect from_op="Cross Validation" from_port="model" to_port="result 1"/>
    <connect from_op="Cross Validation" from_port="example set" to_port="result 2"/>
    <connect from_op="Cross Validation" from_port="performance 1" to_port="result 3"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    <portSpacing port="sink_result 4" spacing="0"/>
    </process>
    </operator>
    </process>


     

    Scott

     

  • sgenzersgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member Posts: 1,968  Community Manager

    Sounds good. FYI, you don't need a completely balanced data set to perform analysis; we have some nice tools to help with that. I would much rather have an unbalanced data set of a few thousand rows than a balanced data set of fewer than 100 rows.
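
One thing to watch with an unbalanced set is that accuracy alone can look good while the model misses every "hard" case. A minimal Python sketch of that effect, outside RapidMiner: the data below is made up (300 rows, 10% "hard", loosely matching the proportions mentioned in this thread), the always-"easy" predictions stand in for a useless model, and the helper name `accuracy_and_recall` is hypothetical.

```python
def accuracy_and_recall(actual, predicted, positive="hard"):
    """Return (accuracy, recall of the positive class) for two equal-length label lists."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and the same length")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    positives = sum(1 for a in actual if a == positive)
    true_positives = sum(1 for a, p in zip(actual, predicted) if a == p == positive)
    accuracy = correct / len(actual)
    # Recall is undefined when there are no positive rows at all.
    recall = true_positives / positives if positives else float("nan")
    return accuracy, recall

# Made-up data: 300 rows, 10% "hard", and a "model" that always predicts "easy".
actual = ["hard"] * 30 + ["easy"] * 270
predicted = ["easy"] * 300

accuracy, recall = accuracy_and_recall(actual, predicted)
print(f"accuracy={accuracy:.2f}, recall(hard)={recall:.2f}")
```

Here accuracy is 0.90 while recall on "hard" is 0.00, which is why the per-class figures are worth reading alongside the overall accuracy once the classes are no longer 50/50.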

     

    Scott

     

     
