"Multilabeling in Text Mining"

MUNISHVIRANG Member Posts: 9 Contributor II
edited May 2019 in Help
Dear All

I am trying to classify documents into predefined classes using an SVM learner.
I want to know whether RapidMiner allows me to classify one document into multiple classes. If it is possible, please let me know how we can do it. Thanks in advance.

<operator name="Root" class="Process" expanded="yes">
    <operator name="OperatorChain" class="OperatorChain" expanded="yes">
        <operator name="TextInput" class="TextInput" expanded="yes">
            <list key="texts">
              <parameter key="Price" value="C:\Documents and Settings\munish.virang\Desktop\XMX\BARCLAYSBANK\PRICE"/>
              <parameter key="Process" value="C:\Documents and Settings\munish.virang\Desktop\XMX\BARCLAYSBANK\PROCESS"/>
              <parameter key="Product" value="C:\Documents and Settings\munish.virang\Desktop\SAMPLE_DATA_SET\BARCLAYSBANK\PRODUCT"/>
              <parameter key="Promotion" value="C:\Documents and Settings\munish.virang\Desktop\SAMPLE_DATA_SET\BARCLAYSBANK\PROMOTION"/>
            </list>
            <parameter key="output_word_list" value="C:\Documents and Settings\munish.virang\Desktop\XMX\BARCLAYSBANK\words.list"/>
            <list key="namespaces">
            </list>
            <parameter key="create_text_visualizer" value="true"/>
            <operator name="StringTokenizer" class="StringTokenizer">
            </operator>
            <operator name="EnglishStopwordFilter" class="EnglishStopwordFilter">
            </operator>
            <operator name="TokenLengthFilter" class="TokenLengthFilter">
                <parameter key="min_chars" value="3"/>
            </operator>
            <operator name="LovinsStemmer" class="LovinsStemmer">
            </operator>
        </operator>
        <operator name="LibSVMLearner" class="LibSVMLearner">
            <parameter key="kernel_type" value="poly"/>
            <list key="class_weights">
            </list>
            <parameter key="calculate_confidences" value="true"/>
        </operator>
        <operator name="ModelWriter" class="ModelWriter">
            <parameter key="model_file" value="C:\Documents and Settings\munish.virang\Desktop\XMX\BARCLAYSBANK\SVM.mod"/>
        </operator>
    </operator>
    <operator name="OperatorChain (2)" class="OperatorChain" expanded="yes">
        <operator name="TextInput (2)" class="TextInput" expanded="no">
            <list key="texts">
              <parameter key="Price" value="C:\Documents and Settings\munish.virang\Desktop\XMX\BARCLAYSBANK\PRICE"/>
              <parameter key="Process" value="C:\Documents and Settings\munish.virang\Desktop\XMX\BARCLAYSBANK\PROCESS"/>
              <parameter key="Product" value="C:\Documents and Settings\munish.virang\Desktop\XMX\BARCLAYSBANK\PRODUCT"/>
              <parameter key="Promotion" value="C:\Documents and Settings\munish.virang\Desktop\XMX\BARCLAYSBANK\PROMOTION"/>
            </list>
            <parameter key="input_word_list" value="C:\Documents and Settings\munish.virang\Desktop\XMX\BARCLAYSBANK\words.list"/>
            <list key="namespaces">
            </list>
            <parameter key="create_text_visualizer" value="true"/>
            <operator name="StringTokenizer (2)" class="StringTokenizer">
            </operator>
            <operator name="EnglishStopwordFilter (2)" class="EnglishStopwordFilter">
            </operator>
            <operator name="TokenLengthFilter (2)" class="TokenLengthFilter">
                <parameter key="min_chars" value="3"/>
            </operator>
            <operator name="LovinsStemmer (2)" class="LovinsStemmer">
            </operator>
        </operator>
        <operator name="ModelLoader" class="ModelLoader">
            <parameter key="model_file" value="C:\Documents and Settings\munish.virang\Desktop\XMX\BARCLAYSBANK\SVM.mod"/>
        </operator>
        <operator name="ModelApplier" class="ModelApplier">
            <list key="application_parameters">
            </list>
        </operator>
    </operator>
    <operator name="ClassificationPerformance" class="ClassificationPerformance">
        <parameter key="main_criterion" value="classification_error"/>
        <parameter key="accuracy" value="true"/>
        <parameter key="classification_error" value="true"/>
        <parameter key="weighted_mean_recall" value="true"/>
        <parameter key="weighted_mean_precision" value="true"/>
        <list key="class_weights">
        </list>
    </operator>
</operator>

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    That's possible, but it needs a rather complex process setup. You could define several attributes, each containing true or false to indicate whether an example is assigned to the associated class. You could give these attributes roles like "label01", "label02" and so on. Together with the multiple label iterator, you could then learn several SVM models, one per class.
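To make the label layout concrete, here is a minimal sketch in plain Python (not RapidMiner code) of how one document's set of classes could be turned into one true/false attribute per class. The class names are taken from the folder keys in the process above; the function name `to_binary_labels` and the exact "labelNN" numbering are assumptions for illustration only.

```python
# Class names taken from the folder keys in the posted process.
CLASSES = ["Price", "Process", "Product", "Promotion"]

def to_binary_labels(doc_classes, classes=CLASSES):
    """Return one true/false label attribute per class for a single document.

    doc_classes is the set of classes the document belongs to.
    The "label01", "label02", ... names mirror the roles suggested above.
    """
    return {
        f"label{i:02d}": (name in doc_classes)
        for i, name in enumerate(classes, start=1)
    }

# A document that belongs to both Price and Promotion.
row = to_binary_labels({"Price", "Promotion"})
print(row)
```

Each of these attributes would then serve as the label for its own binary learner, so four classes would give four separate models.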
    That should solve your problem, although it makes the apply process a little more complicated: you would have to iterate manually over all models, load and apply each one, and rename the old predicted attribute every time so that it is not overwritten.
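The apply step described above can be sketched in plain Python as follows. This is only an illustration of the loop-and-rename idea, not RapidMiner code: the `apply_all` helper, the `prediction(<class>)` naming, and the keyword-rule "models" are all assumptions standing in for the loaded per-class SVM models.

```python
def apply_all(models, example):
    """Apply every per-class model and keep each prediction under its own name."""
    result = dict(example)
    for class_name, model in models.items():
        # Every model writes to the same generic "prediction" attribute ...
        result["prediction"] = model(example)
        # ... so rename it before the next model would overwrite it.
        result[f"prediction({class_name})"] = result.pop("prediction")
    return result

# Toy stand-ins for loaded models: simple keyword rules, NOT real SVMs.
models = {
    "Price": lambda ex: "fee" in ex["text"],
    "Promotion": lambda ex: "offer" in ex["text"],
}

scored = apply_all(models, {"text": "special offer with no fee"})

# A document is assigned to every class whose model predicted true.
assigned = [c for c in models if scored[f"prediction({c})"]]
print(assigned)
```

Because each prediction ends up in its own attribute, a document can be assigned to several classes at once, which is the multi-label result asked for.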

    I think we really might need a Multilabel Meta Learner that solves both steps in one operator and makes things much easier, but unfortunately it has a low priority at the moment.

    Greetings,
      Sebastian