K-NN

fedayncarica Member Posts: 30 Contributor I
edited November 2018 in Help

Good morning everybody. I'm an Italian student and I have a problem with my process. How can I solve it? I'm desperate. I have attached my dataset.

 

Have a good day, 

best regards, 

Damian

<?xml version="1.0" encoding="UTF-8"?><process version="7.4.000">
<operator activated="true" class="concurrency:loop_files" compatibility="7.4.000" expanded="true" height="82" name="Loop Files" width="90" x="313" y="34">
<parameter key="directory" value="C:\Users\Damiano\Desktop\csv"/>
<parameter key="filter_type" value="regex"/>
<parameter key="filter_by_regex" value=".*csv*."/>
<parameter key="recursive" value="false"/>
<parameter key="enable_macros" value="false"/>
<parameter key="macro_for_file_name" value="file_name"/>
<parameter key="macro_for_file_type" value="file_type"/>
<parameter key="macro_for_folder_name" value="folder_name"/>
<parameter key="reuse_results" value="true"/>
<parameter key="enable_parallel_execution" value="true"/>
<process expanded="true">
<operator activated="true" class="read_csv" compatibility="7.4.000" expanded="true" height="68" name="Read CSV" width="90" x="45" y="34">
<parameter key="csv_file" value="C:\Users\Damiano\Desktop\csv\5000IstanzeOM.csv"/>
<parameter key="column_separators" value=";"/>
<parameter key="trim_lines" value="false"/>
<parameter key="use_quotes" value="true"/>
<parameter key="quotes_character" value="&quot;"/>
<parameter key="escape_character" value="\"/>
<parameter key="skip_comments" value="false"/>
<parameter key="comment_characters" value="#"/>
<parameter key="parse_numbers" value="true"/>
<parameter key="decimal_character" value="."/>
<parameter key="grouped_digits" value="false"/>
<parameter key="grouping_character" value=","/>
<parameter key="date_format" value=""/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<parameter key="time_zone" value="SYSTEM"/>
<parameter key="locale" value="English (United States)"/>
<parameter key="encoding" value="windows-1252"/>
<list key="data_set_meta_data_information">
<parameter key="0" value="graphsNumber,15-8016,16-10377,14-2797,15-1317,14-435,21-2,22-1,21-1,22-0,15-4855,label.true.polynominal.attribute"/>
</list>
<parameter key="read_not_matching_values_as_missings" value="true"/>
<parameter key="datamanagement" value="double_array"/>
<parameter key="data_management" value="auto"/>
</operator>
<operator activated="true" class="remap_binominals" compatibility="7.4.000" expanded="true" height="82" name="Remap Binominals" width="90" x="179" y="136">
<parameter key="attribute_filter_type" value="all"/>
<parameter key="attribute" value=""/>
<parameter key="attributes" value=""/>
<parameter key="use_except_expression" value="false"/>
<parameter key="value_type" value="binominal"/>
<parameter key="use_value_type_exception" value="false"/>
<parameter key="except_value_type" value="binominal"/>
<parameter key="block_type" value="value_matrix_start"/>
<parameter key="use_block_type_exception" value="false"/>
<parameter key="except_block_type" value="value_matrix_start"/>
<parameter key="invert_selection" value="false"/>
<parameter key="include_special_attributes" value="true"/>
<parameter key="negative_value" value="0"/>
<parameter key="positive_value" value="1"/>
</operator>
<operator activated="true" class="numerical_to_binominal" compatibility="7.4.000" expanded="true" height="82" name="Numerical to Binominal" width="90" x="313" y="136">
<parameter key="attribute_filter_type" value="all"/>
<parameter key="attribute" value=""/>
<parameter key="attributes" value=""/>
<parameter key="use_except_expression" value="false"/>
<parameter key="value_type" value="numeric"/>
<parameter key="use_value_type_exception" value="false"/>
<parameter key="except_value_type" value="real"/>
<parameter key="block_type" value="value_series"/>
<parameter key="use_block_type_exception" value="false"/>
<parameter key="except_block_type" value="value_series_end"/>
<parameter key="invert_selection" value="false"/>
<parameter key="include_special_attributes" value="false"/>
<parameter key="min" value="0.0"/>
<parameter key="max" value="0.0"/>
</operator>
<operator activated="true" class="loop_parameters" compatibility="7.4.000" expanded="true" height="103" name="Loop Parameters" width="90" x="514" y="136">
<list key="parameters">
<parameter key="k-NN.k" value="[1.0;100.0;10;linear]"/>
</list>
<parameter key="error_handling" value="fail on error"/>
<parameter key="synchronize" value="false"/>
<process expanded="true">
<operator activated="true" class="set_role" compatibility="7.4.000" expanded="true" height="82" name="Set Role" width="90" x="112" y="34">
<parameter key="attribute_name" value="graphsNumber,15-8016,16-10377,14-2797,15-1317,14-435,21-2,22-1,21-1,22-0,15-4855,label"/>
<parameter key="target_role" value="label"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="x_validation" compatibility="7.4.000" expanded="true" height="145" name="Validation" width="90" x="380" y="34">
<parameter key="create_complete_model" value="false"/>
<parameter key="average_performances_only" value="true"/>
<parameter key="leave_one_out" value="false"/>
<parameter key="number_of_validations" value="5"/>
<parameter key="sampling_type" value="stratified sampling"/>
<parameter key="use_local_random_seed" value="false"/>
<parameter key="local_random_seed" value="1992"/>
<process expanded="true">
<operator activated="true" class="k_nn" compatibility="7.4.000" expanded="true" height="82" name="k-NN" width="90" x="45" y="34">
<parameter key="k" value="1"/>
<parameter key="weighted_vote" value="false"/>
<parameter key="measure_types" value="MixedMeasures"/>
<parameter key="mixed_measure" value="MixedEuclideanDistance"/>
<parameter key="nominal_measure" value="NominalDistance"/>
<parameter key="numerical_measure" value="EuclideanDistance"/>
<parameter key="divergence" value="GeneralizedIDivergence"/>
<parameter key="kernel_type" value="radial"/>
<parameter key="kernel_gamma" value="1.0"/>
<parameter key="kernel_sigma1" value="1.0"/>
<parameter key="kernel_sigma2" value="0.0"/>
<parameter key="kernel_sigma3" value="2.0"/>
<parameter key="kernel_degree" value="3.0"/>
<parameter key="kernel_shift" value="1.0"/>
<parameter key="kernel_a" value="1.0"/>
<parameter key="kernel_b" value="0.0"/>
</operator>
<connect from_port="training" to_op="k-NN" to_port="training set"/>
<connect from_op="k-NN" from_port="model" to_port="model"/>
<portSpacing port="source_training" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" class="apply_model" compatibility="7.1.001" expanded="true" height="82" name="Apply Model" width="90" x="45" y="30">
<list key="application_parameters"/>
<parameter key="create_view" value="false"/>
</operator>
<operator activated="true" class="performance" compatibility="7.4.000" expanded="true" height="82" name="Performance" width="90" x="179" y="34">
<parameter key="use_example_weights" value="true"/>
</operator>
<operator activated="true" class="performance_to_data" compatibility="7.4.000" expanded="true" height="82" name="Performance to Data" width="90" x="45" y="165"/>
<operator activated="true" class="write_csv" compatibility="7.4.000" expanded="true" height="82" name="Write CSV" width="90" x="179" y="238">
<parameter key="csv_file" value="C:\Users\Damiano\Desktop\performance_knn.csv"/>
<parameter key="column_separator" value=";"/>
<parameter key="write_attribute_names" value="true"/>
<parameter key="quote_nominal_values" value="true"/>
<parameter key="format_date_attributes" value="true"/>
<parameter key="append_to_file" value="true"/>
<parameter key="encoding" value="SYSTEM"/>
</operator>
<connect from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_op="Performance to Data" to_port="performance vector"/>
<connect from_op="Performance to Data" from_port="example set" to_op="Write CSV" to_port="input"/>
<connect from_op="Performance to Data" from_port="performance vector" to_port="averagable 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_averagable 1" spacing="0"/>
<portSpacing port="sink_averagable 2" spacing="0"/>
<portSpacing port="sink_averagable 3" spacing="0"/>
</process>
</operator>
<connect from_port="input 1" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Validation" to_port="training"/>
<connect from_op="Validation" from_port="averagable 1" to_port="result 1"/>
<connect from_op="Validation" from_port="averagable 2" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
<connect from_port="file object" to_op="Read CSV" to_port="file"/>
<connect from_op="Read CSV" from_port="output" to_op="Remap Binominals" to_port="example set input"/>
<connect from_op="Remap Binominals" from_port="example set output" to_op="Numerical to Binominal" to_port="example set input"/>
<connect from_op="Numerical to Binominal" from_port="example set output" to_op="Loop Parameters" to_port="input 1"/>
<connect from_op="Loop Parameters" from_port="result 1" to_port="output 1"/>
<portSpacing port="source_file object" spacing="0"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
<portSpacing port="sink_output 2" spacing="0"/>
</process>
</operator>
</process>

Screenshot (36).png

Best Answer

  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,993 RM Engineering
    Solution Accepted

    Hi,

     

    Two problems:

     

    1. Your CSV file uses "," as the separator, but in the "Read CSV" operator you define ";" as the separator. Because the operator never splits on the commas, your data ends up with only one column. Change the "column separators" parameter of the "Read CSV" operator to "," and the file will be read correctly.
    2. You still need to tell RapidMiner Studio that the attribute called "label" is actually the label column. Use the "Set Role" operator for that; you can add it directly before the "Loop Parameters" operator. Enter "label" for the "attribute name" parameter and set the "target role" parameter to "label" as well.
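
    As a rough sketch, the two changes above could look like this in the process XML. Only the affected parameters are shown; everything else is assumed to stay as in the posted process, and the operator attributes in your own export may differ:

    ```xml
    <!-- Sketch only: a fragment, not a complete importable process. -->
    <operator activated="true" class="read_csv" compatibility="7.4.000" expanded="true" name="Read CSV">
      <!-- The file is comma-separated, so split on "," instead of ";". -->
      <parameter key="column_separators" value=","/>
      <!-- Remaining Read CSV parameters are assumed unchanged and omitted here. -->
    </operator>
    <operator activated="true" class="set_role" compatibility="7.4.000" expanded="true" name="Set Role">
      <!-- Mark the attribute named "label" as the label. -->
      <parameter key="attribute_name" value="label"/>
      <parameter key="target_role" value="label"/>
      <list key="set_additional_roles"/>
    </operator>
    ```

    Once the file is split on commas, "label" should show up as its own column, which is what allows "Set Role" to find it by that name.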

     

    Regards,

    Marco

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    The XML code you pasted is not importing for me. Can you export the RMP instead? Just go to File > Export Process and attach the resulting file.

     

    Also, please don't create new threads if you already started one on this particular topic. 

  • fedayncarica Member Posts: 30 Contributor I

    Hi Thomas, OK, I'm sending you my RMP!

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    What is your label? Did you use the Read CSV import wizard to load your data in? 

  • fedayncarica Member Posts: 30 Contributor I

    Good morning. Yes, I used the Read CSV import wizard.

  • fedayncarica Member Posts: 30 Contributor I

    Hi Marco, and thank you very much for your support. I have changed the process as you suggested, but now I have another problem. I am attaching another picture for you. Thank you!

    Screenshot (37).png

  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,993 RM Engineering

    Hi,

     

    As the title of that message implies, it is only a possible problem. Because this is inside a Loop Files operator and after a Read CSV operator, we just don't know in advance what data we will get. The assumption is therefore an empty data set, which is why this warning is displayed. You can safely ignore it, unless some of your CSV files really are empty ;)

     

    Regards,

    Marco

  • fedayncarica Member Posts: 30 Contributor I

    Thank you, Marco! Are you Italian?
