RapidMiner

K-NN

SOLVED
Regular Contributor

Good morning everybody. I'm an Italian student and I have a problem with my process. How can I solve it? I'm desperate. I have attached my dataset.

Have a good day, 

best regards, 

Damian

<?xml version="1.0" encoding="UTF-8"?><process version="7.4.000">
  <operator activated="true" class="concurrency:loop_files" compatibility="7.4.000" expanded="true" height="82" name="Loop Files" width="90" x="313" y="34">
    <parameter key="directory" value="C:\Users\Damiano\Desktop\csv"/>
    <parameter key="filter_type" value="regex"/>
    <parameter key="filter_by_regex" value=".*csv*."/>
    <parameter key="recursive" value="false"/>
    <parameter key="enable_macros" value="false"/>
    <parameter key="macro_for_file_name" value="file_name"/>
    <parameter key="macro_for_file_type" value="file_type"/>
    <parameter key="macro_for_folder_name" value="folder_name"/>
    <parameter key="reuse_results" value="true"/>
    <parameter key="enable_parallel_execution" value="true"/>
    <process expanded="true">
      <operator activated="true" class="read_csv" compatibility="7.4.000" expanded="true" height="68" name="Read CSV" width="90" x="45" y="34">
        <parameter key="csv_file" value="C:\Users\Damiano\Desktop\csv\5000IstanzeOM.csv"/>
        <parameter key="column_separators" value=";"/>
        <parameter key="trim_lines" value="false"/>
        <parameter key="use_quotes" value="true"/>
        <parameter key="quotes_character" value="&quot;"/>
        <parameter key="escape_character" value="\"/>
        <parameter key="skip_comments" value="false"/>
        <parameter key="comment_characters" value="#"/>
        <parameter key="parse_numbers" value="true"/>
        <parameter key="decimal_character" value="."/>
        <parameter key="grouped_digits" value="false"/>
        <parameter key="grouping_character" value=","/>
        <parameter key="date_format" value=""/>
        <parameter key="first_row_as_names" value="false"/>
        <list key="annotations">
          <parameter key="0" value="Name"/>
        </list>
        <parameter key="time_zone" value="SYSTEM"/>
        <parameter key="locale" value="English (United States)"/>
        <parameter key="encoding" value="windows-1252"/>
        <list key="data_set_meta_data_information">
          <parameter key="0" value="graphsNumber,15-8016,16-10377,14-2797,15-1317,14-435,21-2,22-1,21-1,22-0,15-4855,label.true.polynominal.attribute"/>
        </list>
        <parameter key="read_not_matching_values_as_missings" value="true"/>
        <parameter key="datamanagement" value="double_array"/>
        <parameter key="data_management" value="auto"/>
      </operator>
      <operator activated="true" class="remap_binominals" compatibility="7.4.000" expanded="true" height="82" name="Remap Binominals" width="90" x="179" y="136">
        <parameter key="attribute_filter_type" value="all"/>
        <parameter key="attribute" value=""/>
        <parameter key="attributes" value=""/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="binominal"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="binominal"/>
        <parameter key="block_type" value="value_matrix_start"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="value_matrix_start"/>
        <parameter key="invert_selection" value="false"/>
        <parameter key="include_special_attributes" value="true"/>
        <parameter key="negative_value" value="0"/>
        <parameter key="positive_value" value="1"/>
      </operator>
      <operator activated="true" class="numerical_to_binominal" compatibility="7.4.000" expanded="true" height="82" name="Numerical to Binominal" width="90" x="313" y="136">
        <parameter key="attribute_filter_type" value="all"/>
        <parameter key="attribute" value=""/>
        <parameter key="attributes" value=""/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="numeric"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="real"/>
        <parameter key="block_type" value="value_series"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="value_series_end"/>
        <parameter key="invert_selection" value="false"/>
        <parameter key="include_special_attributes" value="false"/>
        <parameter key="min" value="0.0"/>
        <parameter key="max" value="0.0"/>
      </operator>
      <operator activated="true" class="loop_parameters" compatibility="7.4.000" expanded="true" height="103" name="Loop Parameters" width="90" x="514" y="136">
        <list key="parameters">
          <parameter key="k-NN.k" value="[1.0;100.0;10;linear]"/>
        </list>
        <parameter key="error_handling" value="fail on error"/>
        <parameter key="synchronize" value="false"/>
        <process expanded="true">
          <operator activated="true" class="set_role" compatibility="7.4.000" expanded="true" height="82" name="Set Role" width="90" x="112" y="34">
            <parameter key="attribute_name" value="graphsNumber,15-8016,16-10377,14-2797,15-1317,14-435,21-2,22-1,21-1,22-0,15-4855,label"/>
            <parameter key="target_role" value="label"/>
            <list key="set_additional_roles"/>
          </operator>
          <operator activated="true" class="x_validation" compatibility="7.4.000" expanded="true" height="145" name="Validation" width="90" x="380" y="34">
            <parameter key="create_complete_model" value="false"/>
            <parameter key="average_performances_only" value="true"/>
            <parameter key="leave_one_out" value="false"/>
            <parameter key="number_of_validations" value="5"/>
            <parameter key="sampling_type" value="stratified sampling"/>
            <parameter key="use_local_random_seed" value="false"/>
            <parameter key="local_random_seed" value="1992"/>
            <process expanded="true">
              <operator activated="true" class="k_nn" compatibility="7.4.000" expanded="true" height="82" name="k-NN" width="90" x="45" y="34">
                <parameter key="k" value="1"/>
                <parameter key="weighted_vote" value="false"/>
                <parameter key="measure_types" value="MixedMeasures"/>
                <parameter key="mixed_measure" value="MixedEuclideanDistance"/>
                <parameter key="nominal_measure" value="NominalDistance"/>
                <parameter key="numerical_measure" value="EuclideanDistance"/>
                <parameter key="divergence" value="GeneralizedIDivergence"/>
                <parameter key="kernel_type" value="radial"/>
                <parameter key="kernel_gamma" value="1.0"/>
                <parameter key="kernel_sigma1" value="1.0"/>
                <parameter key="kernel_sigma2" value="0.0"/>
                <parameter key="kernel_sigma3" value="2.0"/>
                <parameter key="kernel_degree" value="3.0"/>
                <parameter key="kernel_shift" value="1.0"/>
                <parameter key="kernel_a" value="1.0"/>
                <parameter key="kernel_b" value="0.0"/>
              </operator>
              <connect from_port="training" to_op="k-NN" to_port="training set"/>
              <connect from_op="k-NN" from_port="model" to_port="model"/>
              <portSpacing port="source_training" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true">
              <operator activated="true" class="apply_model" compatibility="7.1.001" expanded="true" height="82" name="Apply Model" width="90" x="45" y="30">
                <list key="application_parameters"/>
                <parameter key="create_view" value="false"/>
              </operator>
              <operator activated="true" class="performance" compatibility="7.4.000" expanded="true" height="82" name="Performance" width="90" x="179" y="34">
                <parameter key="use_example_weights" value="true"/>
              </operator>
              <operator activated="true" class="performance_to_data" compatibility="7.4.000" expanded="true" height="82" name="Performance to Data" width="90" x="45" y="165"/>
              <operator activated="true" class="write_csv" compatibility="7.4.000" expanded="true" height="82" name="Write CSV" width="90" x="179" y="238">
                <parameter key="csv_file" value="C:\Users\Damiano\Desktop\performance_knn.csv"/>
                <parameter key="column_separator" value=";"/>
                <parameter key="write_attribute_names" value="true"/>
                <parameter key="quote_nominal_values" value="true"/>
                <parameter key="format_date_attributes" value="true"/>
                <parameter key="append_to_file" value="true"/>
                <parameter key="encoding" value="SYSTEM"/>
              </operator>
              <connect from_port="model" to_op="Apply Model" to_port="model"/>
              <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
              <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
              <connect from_op="Performance" from_port="performance" to_op="Performance to Data" to_port="performance vector"/>
              <connect from_op="Performance to Data" from_port="example set" to_op="Write CSV" to_port="input"/>
              <connect from_op="Performance to Data" from_port="performance vector" to_port="averagable 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_averagable 1" spacing="0"/>
              <portSpacing port="sink_averagable 2" spacing="0"/>
              <portSpacing port="sink_averagable 3" spacing="0"/>
            </process>
          </operator>
          <connect from_port="input 1" to_op="Set Role" to_port="example set input"/>
          <connect from_op="Set Role" from_port="example set output" to_op="Validation" to_port="training"/>
          <connect from_op="Validation" from_port="averagable 1" to_port="result 1"/>
          <connect from_op="Validation" from_port="averagable 2" to_port="result 2"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_performance" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
        </process>
      </operator>
      <connect from_port="file object" to_op="Read CSV" to_port="file"/>
      <connect from_op="Read CSV" from_port="output" to_op="Remap Binominals" to_port="example set input"/>
      <connect from_op="Remap Binominals" from_port="example set output" to_op="Numerical to Binominal" to_port="example set input"/>
      <connect from_op="Numerical to Binominal" from_port="example set output" to_op="Loop Parameters" to_port="input 1"/>
      <connect from_op="Loop Parameters" from_port="result 1" to_port="output 1"/>
      <portSpacing port="source_file object" spacing="0"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_output 1" spacing="0"/>
      <portSpacing port="sink_output 2" spacing="0"/>
    </process>
  </operator>
</process>

Screenshot (36).png

8 REPLIES
Community Manager

Re: K-NN

The XML code you pasted is not importing for me. Can you export the RMP instead? Just go to File > Export Process, and attach that.

Also, please don't create new threads if you already started one on this particular topic. 

Regards,
T-Bone
Twitter: @neuralmarket
Regular Contributor

Re: K-NN

Hi Thomas, OK, I'm sending you my RMP!

Community Manager

Re: K-NN

What is your label? Did you use the Read CSV import wizard to load your data in? 

Regards,
T-Bone
Regular Contributor

Re: K-NN

Good morning. Yes, I used the Read CSV import wizard.

Moderator

Re: K-NN

Hi,

Two problems:

  1. Your CSV file uses "," as the separator, but the "Read CSV" operator is configured with ";" as the separator. As a result, your data has only one column, because the rows are not split on each comma. Change the "column separators" parameter of the "Read CSV" operator to "," and it will read the file correctly.
  2. You still need to tell RapidMiner Studio that the attribute called "label" is actually the label column. Use the "Set Role" operator for that; you can add it directly before the "Loop Parameters" operator. Type in "label" for the "attribute name" parameter and set the "target role" parameter to "label" as well.
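
As a rough sketch only, the two changes above might look like this in the process XML. The parameter keys and port names are taken from the process posted at the top of the thread. The wiring, the omitted layout attributes (height, width, x, y), and the idea of moving the existing "Set Role" out of "Loop Parameters" instead of adding a second one are assumptions, so treat this as an illustration and not as the exact file RapidMiner Studio would export.

```xml
<!-- 1. Inside the existing "Read CSV" operator: split columns on commas. -->
<parameter key="column_separators" value=","/>

<!-- 2. "Set Role" placed directly before "Loop Parameters".
     Assumption: once the file is split on commas, the last column
     is simply named "label". -->
<operator activated="true" class="set_role" compatibility="7.4.000" expanded="true" name="Set Role">
  <parameter key="attribute_name" value="label"/>
  <parameter key="target_role" value="label"/>
  <list key="set_additional_roles"/>
</operator>

<!-- Assumed wiring in the "Loop Files" subprocess, replacing the direct
     "Numerical to Binominal" -> "Loop Parameters" connection. -->
<connect from_op="Numerical to Binominal" from_port="example set output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Loop Parameters" to_port="input 1"/>
```

The column meta data stored in the "Read CSV" operator was generated while the file was still being read as a single column, so re-running the import wizard after changing the separator is probably the easiest way to refresh it.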

 

Regards,

Marco

_________________________________________________________
Team Lead Software Engineering | RapidMiner GmbH
Regular Contributor

Re: K-NN

Hi Marco, and thank you very much for your support. I have changed the process as you suggested. Now I have another problem; I'm attaching another screenshot for you. Thank you!

Screenshot (37).png

Moderator

Re: K-NN

Hi,

As the title of that message implies, it's only a possible problem. Because this is inside a "Loop Files" operator and after a "Read CSV" operator, we just don't know in advance what data we will get. The assumption is an empty data set, which is why this warning is displayed. You can safely ignore it, unless some of your CSV files really are empty ;)

Regards,

Marco

Regular Contributor

Re: K-NN

Thank you, Marco! Are you Italian?