W-Apriori doesn't work. Need help

edfred Member Posts: 8 Contributor II
edited November 2018 in Help
Hi all,

I want to use the W-Apriori operator to generate some association rules, but it's not working.
I am using RapidMiner version 4.3.
This is my operator chain:

root
|  |
|  |-Textinput
|    |
|    |-StringTokenizer
|    |-GermanStopwordFilter
|    |-ToLowerCaseConverter
|    |-TokenLengthFilter
|
|-ExampleSetWriter
|
|-W-Apriori

If I press the start button, I get an exception like this:

Error: 905 External Error
Error in: W-Apriori (W-Apriori) W-Apriori caused an error: weka.core.UnsupportedAttributeTypeException: weka.associations.Apriori: Cannot handle numeric attributes! An external program or library has reported an error. Please see the documentation of this program or library for further information.

How can I get binary attributes? I think I have to convert them somehow.

Can you give me an example operator chain where it works?

Best regards
edfred

Answers

  • earmijo Member Posts: 270 Unicorn
    If you are using "Binary Occurrences" as your Vector Creation choice, you will have a matrix of 0/1 values. These are still numeric, so you have to transform them into a matrix of true/false values, which is the input form accepted by associators like W-Apriori. You can do this with the Numerical2Binominal converter (Preprocessing/Attributes/Filter/Converter/...).
    <operator name="Root" class="Process" expanded="yes">
        <operator name="TextInput" class="TextInput" expanded="yes">
            <parameter key="attributes" value=""/>
            <parameter key="create_text_visualizer" value="true"/>
            <parameter key="default_content_encoding" value="ISO-8859-1"/>
            <list key="namespaces">
            </list>
            <parameter key="on_the_fly_pruning" value="3"/>
            <parameter key="prune_below" value="2"/>
            <list key="texts">
              <parameter key="graphics" value="../data/newsgroup/graphics"/>
              <parameter key="hardware" value="../data/newsgroup/hardware"/>
            </list>
            <parameter key="vector_creation" value="BinaryOccurrences"/>
            <operator name="StringTokenizer" class="StringTokenizer">
            </operator>
            <operator name="EnglishStopwordFilter" class="EnglishStopwordFilter">
            </operator>
            <operator name="TokenLengthFilter" class="TokenLengthFilter">
                <parameter key="min_chars" value="3"/>
            </operator>
        </operator>
        <operator name="Numerical2Binominal" class="Numerical2Binominal">
        </operator>
        <operator name="W-Apriori" class="W-Apriori">
        </operator>
    </operator>
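
To make the conversion step concrete outside RapidMiner, here is a minimal Python sketch. It assumes a tiny made-up occurrence matrix (the tokens and rows are hypothetical, not taken from the newsgroup data), maps the numeric 0/1 cells to true/false, and then derives rules for token pairs only. It is a simplified illustration of support and confidence, not Weka's Apriori implementation.

```python
from itertools import combinations

# Hypothetical binary-occurrence matrix: one row per document, one column per token.
tokens = ["graphics", "card", "driver"]
rows = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
    [1, 0, 0],
]

def to_binominal(matrix):
    """Map numeric 0/1 cells to False/True (the step the numeric-only data was missing)."""
    return [[value != 0 for value in row] for row in matrix]

def support(bool_rows, columns):
    """Fraction of rows in which every listed column is True."""
    hits = sum(all(row[c] for c in columns) for row in bool_rows)
    return hits / len(bool_rows)

def pair_rules(bool_rows, names, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for frequent token pairs."""
    rules = []
    for a, b in combinations(range(len(names)), 2):
        pair_support = support(bool_rows, [a, b])
        if pair_support < min_support:
            continue  # infrequent pair, no rule is generated from it
        for x, y in ((a, b), (b, a)):
            confidence = pair_support / support(bool_rows, [x])
            if confidence >= min_confidence:
                rules.append((names[x], names[y], pair_support, confidence))
    return rules

bool_rows = to_binominal(rows)
rules = pair_rules(bool_rows, tokens)
for antecedent, consequent, s, c in rules:
    print(f"{antecedent} -> {consequent} (support={s:.2f}, confidence={c:.2f})")
```

The thresholds `min_support` and `min_confidence` are arbitrary values chosen for this toy data; a real associator exposes its own parameters for them.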
  • edfred Member Posts: 8 Contributor II
    Thank you, that was very helpful. It works now, but German characters such as "ä", "ö", "ü" and "ß" aren't displayed correctly. Where can I set the encoding to UTF-8?
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    This can be set in the TextInput operator. The parameter is called "default_content_encoding", as it appears in the process XML posted above.

    Greetings,
      Sebastian
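
For readers wondering why the umlauts come out wrong, here is a small Python sketch of what an encoding mismatch does. It is only an illustration: the sample string is made up, and which encoding the actual text files use is not known from this thread.

```python
def decode_with(raw: bytes, encoding: str) -> str:
    """Decode raw bytes, replacing undecodable sequences instead of raising."""
    return raw.decode(encoding, errors="replace")

sample = "ä ö ü ß"  # hypothetical sample text

# Case 1: the file is UTF-8, but it is read as ISO-8859-1.
utf8_bytes = sample.encode("utf-8")
read_as_latin1 = decode_with(utf8_bytes, "iso-8859-1")
read_as_utf8 = decode_with(utf8_bytes, "utf-8")

# Case 2: the file is ISO-8859-1, but it is read as UTF-8.
latin1_bytes = sample.encode("iso-8859-1")
latin1_read_as_utf8 = decode_with(latin1_bytes, "utf-8")

print(repr(read_as_latin1))       # each umlaut turns into two unrelated characters
print(repr(read_as_utf8))         # matches the original sample
print(repr(latin1_read_as_utf8))  # contains U+FFFD replacement characters
```

The encoding configured in the reader has to match the encoding the files were saved in; setting UTF-8 only helps if the files really are UTF-8.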
  • edfred Member Posts: 8 Contributor II
    Hi,

    I tried this:

    <operator name="Root" class="Process" expanded="yes">
        <operator name="TextInput" class="TextInput" expanded="yes">
            <parameter key="attributes" value=""/>
            <parameter key="create_text_visualizer" value="true"/>
            <parameter key="default_content_encoding" value="UTF-8"/>
            <list key="namespaces">
            </list>
            <parameter key="on_the_fly_pruning" value="3"/>
            <parameter key="prune_below" value="2"/>
            <list key="texts">
              <parameter key="test" value="../rm_workspace/apriori/test"/>
            </list>
            <parameter key="vector_creation" value="BinaryOccurrences"/>
            <operator name="ToLowerCaseConverter" class="ToLowerCaseConverter">
            </operator>
            <operator name="StringTokenizer" class="StringTokenizer">
            </operator>
            <operator name="GermanStopwordFilter" class="GermanStopwordFilter">
            </operator>
            <operator name="TokenLengthFilter" class="TokenLengthFilter">
                <parameter key="min_chars" value="3"/>
            </operator>
        </operator>
        <operator name="Numerical2Binominal" class="Numerical2Binominal">
        </operator>
        <operator name="W-Apriori" class="W-Apriori">
        </operator>
    </operator>
    But this is not working. RapidMiner freezes after 5 minutes. I also tried starting it with a larger Java heap:
    java -Xms128M -Xmx1024M -jar rapidminer.jar
    But RapidMiner still freezes, and I have to close the whole program.
    If I use the default encoding (I leave the field empty), it works, but it doesn't display the German letters.
    Do you know why?

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    Unfortunately, I don't have any clue why this should happen, and I can't test it without the data.
    Did you wait a few minutes before closing RapidMiner? Some parts of the TextMiningPlugin somehow manage to block the GUI thread, but the GUI thread recovers once the calculation has finished.

    Greetings,
      Sebastian
  • edfred Member Posts: 8 Contributor II
    I waited a long time, but the program stayed blocked and I had to abort it.