Problem with StopwordFilterFile

nguyenxuanhau Member Posts: 22 Contributor II
edited November 2018 in Help
My process XML file is as follows:

<process version="4.6">

  <operator name="Root" class="Process" expanded="yes">
      <description text="Text Hau"/>
      <parameter key="logverbosity" value="init"/>
      <parameter key="random_seed" value="2001"/>
      <parameter key="send_mail" value="never"/>
      <parameter key="process_duration_for_mail" value="30"/>
      <parameter key="encoding" value="UTF-8"/>
      <operator name="TextInput" class="TextInput" expanded="yes">
          <list key="texts">
            <parameter key="graphics" value="dulieu"/>
          </list>
          <parameter key="default_content_type" value=""/>
          <parameter key="default_content_encoding" value="utf-8"/>
          <parameter key="default_content_language" value=""/>
          <parameter key="prune_below" value="-1"/>
          <parameter key="prune_above" value="-1"/>
          <parameter key="vector_creation" value="TermOccurrences"/>
          <parameter key="use_content_attributes" value="false"/>
          <parameter key="use_given_word_list" value="false"/>
          <parameter key="return_word_list" value="false"/>
          <parameter key="id_attribute_type" value="short"/>
          <list key="namespaces">
          </list>
          <parameter key="create_text_visualizer" value="false"/>
          <parameter key="on_the_fly_pruning" value="-1"/>
          <parameter key="extend_exampleset" value="false"/>
          <operator name="StringTokenizer" class="StringTokenizer">
          </operator>
          <operator name="StopwordFilterFile" class="StopwordFilterFile">
              <parameter key="file" value="dulieu/stopword.txt"/>
              <parameter key="case_sensitive" value="true"/>
          </operator>
      </operator>
  </operator>

</process>

When I run this process, it doesn't filter out stopwords that are encoded in UTF-8.

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    If you switch to expert mode in the parameters view of RapidMiner, you will see that there is an encoding parameter. Set this parameter to UTF-8 and the process will work.

    Greetings,
    Sebastian
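
    For reference, here is a rough sketch of how the operator from the question might look in the process XML once the encoding is set. It assumes the expert-mode parameter belongs to the StopwordFilterFile operator and is stored under the key "encoding"; the thread does not confirm the exact key name, so check it in the parameters view before relying on it.

    ```xml
    <operator name="StopwordFilterFile" class="StopwordFilterFile">
        <parameter key="file" value="dulieu/stopword.txt"/>
        <parameter key="case_sensitive" value="true"/>
        <!-- Assumption: the expert-mode encoding parameter uses the key "encoding" -->
        <parameter key="encoding" value="UTF-8"/>
    </operator>
    ```

    The stopword file itself would also need to be saved as UTF-8 for the words to match.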