select attribute and save

platanas20 Member Posts: 22 Contributor II
edited November 2018 in Help
Hello everyone,
I want to save comments from an Excel file into separate txt files, one file per comment. I use the operators "Read Excel", "Select Attributes" and "Write as Text".
My process is:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.008">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.008" expanded="true" name="Process">
    <parameter key="encoding" value="UTF-8"/>
    <parameter key="parallelize_main_process" value="true"/>
    <process expanded="true" height="280" width="547">
      <operator activated="true" class="read_excel" compatibility="5.1.008" expanded="true" height="60" name="Read Excel" width="90" x="65" y="66">
        <parameter key="excel_file" value="C:\Users\elenious\Desktop\diplomatiki\newresults\ypes_comments_98.xls"/>
        <parameter key="imported_cell_range" value="G1:G1362"/>
        <parameter key="first_row_as_names" value="false"/>
        <list key="annotations">
          <parameter key="0" value="Name"/>
        </list>
        <list key="data_set_meta_data_information">
          <parameter key="0" value="Σχόλιο.true.polynominal.attribute"/>
        </list>
      </operator>
      <operator activated="true" class="select_attributes" compatibility="5.1.008" expanded="true" height="76" name="Select Attributes" width="90" x="246" y="75">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="Σχόλιο"/>
        <parameter key="attributes" value="|Σχόλιο"/>
        <parameter key="value_type" value="text"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="write_as_text" compatibility="5.1.008" expanded="true" height="76" name="Write as Text" width="90" x="447" y="210">
        <parameter key="result_file" value="C:\Users\elenious\Desktop\asbeo comte\comments_%{a}.txt"/>
        <parameter key="encoding" value="UTF-8"/>
      </operator>
      <connect from_op="Read Excel" from_port="output" to_op="Select Attributes" to_port="example set input"/>
      <connect from_op="Select Attributes" from_port="example set output" to_op="Write as Text" to_port="input 1"/>
      <connect from_op="Write as Text" from_port="input 1" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
As a result I get a single txt file containing all the comments. What do I have to do to save each comment in a separate file?

Thanks!!!

Answers

  • colocolo Member Posts: 236 Maven
    Hi platanas,

    if each comment is a single row in your Excel file, you can use the "Loop Examples" operator and put the file writing inside it. This lets you write one file per example.

    I fear that some of your settings are messed up, but this should show the relevant changes:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.1.009">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="5.1.009" expanded="true" name="Process">
       <parameter key="encoding" value="UTF-8"/>
       <process expanded="true" height="296" width="547">
         <operator activated="true" class="read_excel" compatibility="5.1.009" expanded="true" height="60" name="Read Excel" width="90" x="65" y="66">
           <parameter key="excel_file" value="C:\Users\elenious\Desktop\diplomatiki\newresults\ypes_comments_98.xls"/>
           <parameter key="imported_cell_range" value="G1:G1362"/>
           <parameter key="first_row_as_names" value="false"/>
           <list key="annotations">
             <parameter key="0" value="Name"/>
           </list>
           <list key="data_set_meta_data_information">
             <parameter key="0" value="Σχόλιο.true.polynominal.attribute"/>
           </list>
         </operator>
         <operator activated="true" class="select_attributes" compatibility="5.1.009" expanded="true" height="76" name="Select Attributes" width="90" x="246" y="75">
           <parameter key="attribute_filter_type" value="single"/>
           <parameter key="attribute" value="Σχόλιο"/>
           <parameter key="attributes" value="|Σχόλιο"/>
           <parameter key="value_type" value="text"/>
           <parameter key="include_special_attributes" value="true"/>
         </operator>
         <operator activated="true" class="loop_examples" compatibility="5.1.009" expanded="true" height="76" name="Loop Examples" width="90" x="447" y="75">
           <process expanded="true" height="607" width="761">
             <operator activated="true" class="filter_example_range" compatibility="5.1.009" expanded="true" height="76" name="Filter Example Range" width="90" x="45" y="120">
               <parameter key="first_example" value="%{example}"/>
               <parameter key="last_example" value="%{example}"/>
             </operator>
             <operator activated="true" class="write_as_text" compatibility="5.1.009" expanded="true" height="76" name="Write as Text" width="90" x="179" y="30">
               <parameter key="result_file" value="C:\Users\elenious\Desktop\asbeo comte\comments_%{example}.txt"/>
               <parameter key="encoding" value="UTF-8"/>
             </operator>
             <connect from_port="example set" to_op="Filter Example Range" to_port="example set input"/>
             <connect from_op="Filter Example Range" from_port="example set output" to_op="Write as Text" to_port="input 1"/>
             <connect from_op="Filter Example Range" from_port="original" to_port="example set"/>
             <portSpacing port="source_example set" spacing="0"/>
             <portSpacing port="sink_example set" spacing="108"/>
             <portSpacing port="sink_output 1" spacing="0"/>
           </process>
         </operator>
         <connect from_op="Read Excel" from_port="output" to_op="Select Attributes" to_port="example set input"/>
         <connect from_op="Select Attributes" from_port="example set output" to_op="Loop Examples" to_port="example set"/>
         <connect from_op="Loop Examples" from_port="example set" to_port="result 1"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
       </process>
     </operator>
    </process>
    Regards
    Matthias
  • platanas20 Member Posts: 22 Contributor II
    Hi Matthias,

    Yes, each comment is a single row in my Excel file. This process works (it creates the txt files), but each file contains only this:

    "31.08.2011 20:13:24 Results of ResultWriter 'Write as Text' [1]:
    31.08.2011 20:13:24 SplittedExampleSet:
    1 examples,
    1 regular attributes,
    no special attributes"

    What do I have to do to save the comments themselves?

    Thanks
  • colocolo Member Posts: 236 Maven
    Hi Platanas,

    I have never used "Write as Text" and don't know exactly what it is supposed to do. It seems to write only the description of the example set.

    I guess you should be able to use "Write Special Format" instead, but I am not familiar with that operator either. In the past I have used macros and document operators to write the contents of single attributes to files.

    Something like the following should work:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.1.009">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="5.1.009" expanded="true" name="Process">
       <parameter key="encoding" value="UTF-8"/>
       <process expanded="true" height="296" width="547">
         <operator activated="true" class="read_excel" compatibility="5.1.009" expanded="true" height="60" name="Read Excel" width="90" x="45" y="30">
           <parameter key="excel_file" value="C:\Users\elenious\Desktop\diplomatiki\newresults\ypes_comments_98.xls"/>
           <parameter key="imported_cell_range" value="G1:G1362"/>
           <parameter key="first_row_as_names" value="false"/>
           <list key="annotations">
             <parameter key="0" value="Name"/>
           </list>
           <list key="data_set_meta_data_information">
             <parameter key="0" value="Σχόλιο.true.polynominal.attribute"/>
           </list>
         </operator>
         <operator activated="true" class="select_attributes" compatibility="5.1.009" expanded="true" height="76" name="Select Attributes" width="90" x="179" y="30">
           <parameter key="attribute_filter_type" value="single"/>
           <parameter key="attribute" value="Σχόλιο"/>
           <parameter key="attributes" value="|Σχόλιο"/>
           <parameter key="value_type" value="text"/>
           <parameter key="include_special_attributes" value="true"/>
         </operator>
         <operator activated="true" class="loop_examples" compatibility="5.1.009" expanded="true" height="76" name="Loop Examples" width="90" x="313" y="30">
           <process expanded="true" height="607" width="761">
             <operator activated="true" class="filter_example_range" compatibility="5.1.009" expanded="true" height="76" name="Filter Example Range" width="90" x="45" y="120">
               <parameter key="first_example" value="%{example}"/>
               <parameter key="last_example" value="%{example}"/>
             </operator>
             <operator activated="true" class="extract_macro" compatibility="5.1.009" expanded="true" height="60" name="Extract Macro" width="90" x="179" y="30">
               <parameter key="macro" value="fileContent"/>
               <parameter key="macro_type" value="data_value"/>
               <parameter key="attribute_name" value="Σχόλιο"/>
               <parameter key="example_index" value="1"/>
             </operator>
             <operator activated="true" class="text:create_document" compatibility="5.1.001" expanded="true" height="60" name="Create Document" width="90" x="313" y="30">
               <parameter key="text" value="%{fileContent}"/>
             </operator>
             <operator activated="true" class="text:write_document" compatibility="5.1.001" expanded="true" height="60" name="Write Document" width="90" x="447" y="30">
               <parameter key="file" value="C:\Users\elenious\Desktop\asbeo comte\comments_%{example}.txt"/>
             </operator>
             <connect from_port="example set" to_op="Filter Example Range" to_port="example set input"/>
             <connect from_op="Filter Example Range" from_port="example set output" to_op="Extract Macro" to_port="example set"/>
             <connect from_op="Filter Example Range" from_port="original" to_port="example set"/>
             <connect from_op="Create Document" from_port="output" to_op="Write Document" to_port="document"/>
             <portSpacing port="source_example set" spacing="0"/>
             <portSpacing port="sink_example set" spacing="108"/>
             <portSpacing port="sink_output 1" spacing="0"/>
           </process>
         </operator>
         <connect from_op="Read Excel" from_port="output" to_op="Select Attributes" to_port="example set input"/>
         <connect from_op="Select Attributes" from_port="example set output" to_op="Loop Examples" to_port="example set"/>
         <connect from_op="Loop Examples" from_port="example set" to_port="result 1"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
       </process>
     </operator>
    </process>
    Best regards
    Matthias
  • platanas20 Member Posts: 22 Contributor II

    Hi Matthias,

    Yes, "Extract Macro" works correctly!!!!!

    Thank you so much!!!!!  ;D  ;D  ;D
  • platanas20 Member Posts: 22 Contributor II

    Hi again,
    I have one more question: how can I make the txt files use UTF-8 encoding?

    Thanks
  • colocolo Member Posts: 236 Maven
    Hi,

    this is not possible at the moment with "Write Document", since this operator uses the Java class FileWriter. This always uses the system's default encoding. So the encoding will depend on the operating system used.

    If you are familiar with Java programming, you should be able to replace this writer with an alternative that lets you specify the encoding to use. If not, you might post this as a feature request and hope it will be implemented in the future. Since this is a really simple task once you are familiar with RapidMiner coding, I could also change this for you. But since I'm really close to the deadline for submitting my thesis, you would have to wait about three weeks until I have time for it...
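
    A minimal sketch of such a replacement, not the actual "Write Document" code: it assumes the comments are already available as a list of strings and writes each one to its own file through an OutputStreamWriter with an explicit UTF-8 charset instead of FileWriter. The class name Utf8CommentWriter and the method writeComments are hypothetical names chosen for illustration; the comments_<n>.txt naming mirrors the process above.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class Utf8CommentWriter {

    // Hypothetical helper: writes each comment to its own file
    // (comments_1.txt, comments_2.txt, ...) in the given directory.
    // The charset is passed explicitly, so the result does not depend
    // on the default encoding of the operating system.
    public static void writeComments(List<String> comments, Path directory) throws IOException {
        Files.createDirectories(directory);
        for (int i = 0; i < comments.size(); i++) {
            Path target = directory.resolve("comments_" + (i + 1) + ".txt");
            try (Writer writer = new OutputStreamWriter(
                    new FileOutputStream(target.toFile()), StandardCharsets.UTF_8)) {
                writer.write(comments.get(i));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path directory = Files.createTempDirectory("comments");
        // Greek sample text written with Unicode escapes, so the encoding
        // of this source file does not matter.
        List<String> comments = Arrays.asList("\u03A3\u03C7\u03CC\u03BB\u03B9\u03BF", "second comment");
        writeComments(comments, directory);
        System.out.println("Wrote " + comments.size() + " files to " + directory);
    }
}
```

    The only part that matters for the encoding is the OutputStreamWriter constructor call; the rest just reproduces the one-file-per-comment naming.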

    Regards
    Matthias
  • platanas20 Member Posts: 22 Contributor II
    Hi again,

    I created a project in NetBeans in which I set up the RapidMiner operators exactly as in this XML:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.1.009">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.1.009" expanded="true" name="Process">
        <parameter key="encoding" value="UTF-8"/>
        <process expanded="true" height="296" width="547">
          <operator activated="true" class="read_excel" compatibility="5.1.009" expanded="true" height="60" name="Read Excel" width="90" x="45" y="30">
            <parameter key="excel_file" value="C:\Users\elenious\Desktop\diplomatiki\newresults\ypes_comments_98.xls"/>
            <parameter key="imported_cell_range" value="G1:G1362"/>
            <parameter key="first_row_as_names" value="false"/>
            <list key="annotations">
              <parameter key="0" value="Name"/>
            </list>
            <list key="data_set_meta_data_information">
              <parameter key="0" value="Σχόλιο.true.polynominal.attribute"/>
            </list>
          </operator>
          <operator activated="true" class="select_attributes" compatibility="5.1.009" expanded="true" height="76" name="Select Attributes" width="90" x="179" y="30">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="Σχόλιο"/>
            <parameter key="attributes" value="|Σχόλιο"/>
            <parameter key="value_type" value="text"/>
            <parameter key="include_special_attributes" value="true"/>
          </operator>
          <operator activated="true" class="loop_examples" compatibility="5.1.009" expanded="true" height="76" name="Loop Examples" width="90" x="313" y="30">
            <process expanded="true" height="607" width="761">
              <operator activated="true" class="filter_example_range" compatibility="5.1.009" expanded="true" height="76" name="Filter Example Range" width="90" x="45" y="120">
                <parameter key="first_example" value="%{example}"/>
                <parameter key="last_example" value="%{example}"/>
              </operator>
              <operator activated="true" class="extract_macro" compatibility="5.1.009" expanded="true" height="60" name="Extract Macro" width="90" x="179" y="30">
                <parameter key="macro" value="fileContent"/>
                <parameter key="macro_type" value="data_value"/>
                <parameter key="attribute_name" value="Σχόλιο"/>
                <parameter key="example_index" value="1"/>
              </operator>
              <operator activated="true" class="text:create_document" compatibility="5.1.001" expanded="true" height="60" name="Create Document" width="90" x="313" y="30">
                <parameter key="text" value="%{fileContent}"/>
              </operator>
              <operator activated="true" class="text:write_document" compatibility="5.1.001" expanded="true" height="60" name="Write Document" width="90" x="447" y="30">
                <parameter key="file" value="C:\Users\elenious\Desktop\asbeo comte\comments_%{example}.txt"/>
              </operator>
              <connect from_port="example set" to_op="Filter Example Range" to_port="example set input"/>
              <connect from_op="Filter Example Range" from_port="example set output" to_op="Extract Macro" to_port="example set"/>
              <connect from_op="Filter Example Range" from_port="original" to_port="example set"/>
              <connect from_op="Create Document" from_port="output" to_op="Write Document" to_port="document"/>
              <portSpacing port="source_example set" spacing="0"/>
              <portSpacing port="sink_example set" spacing="108"/>
              <portSpacing port="sink_output 1" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Read Excel" from_port="output" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="Loop Examples" to_port="example set"/>
          <connect from_op="Loop Examples" from_port="example set" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    But when I run it (via NetBeans), it creates txt files that all contain only the first comment (the first row of the Excel file). In RapidMiner it works fine. Does anyone know how I can solve this?