RapidMiner

Problem in Performance Operator

Contributor II

Hello,

I am loading a model saved in a model file and applying it to the output of a TextInput operator. After the ModelApplier, I have a Performance operator.


The Performance operator (and related operators such as BinominalClassificationPerformance and ClassificationPerformance) throws a NullPointerException. The problem does not seem to be with my corpus. I am also adding the appropriate output word list.

9 REPLIES
Regular Contributor

Re: Problem in Performance Operator

Hello adityajo

Please post the complete process here; otherwise it is too painful to reconstruct the error. Ideally, post a reproducible process that uses a data generator operator instead of your own input. This would be very helpful.

greetings,

steffen
Contributor II

Re: Problem in Performance Operator

Hi Steffen,

Thanks a lot for the reply.

Details of the process:

TextInput > ModelLoader > ModelApplier > Performance.


TextInput takes the input files on which the model is to be applied. The files are arranged in two directories, each corresponding to one of the two class labels.

ModelLoader loads the model saved as a 'mod' file in an earlier experiment.

ModelApplier applies the loaded model.

Performance is expected to give the classification accuracy and other performance measures.
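
For readers following along, here is a minimal sketch of what a two-class performance evaluator typically computes from the label and prediction columns. It is not RapidMiner's implementation; the function name `binominal_performance` and the sample labels are hypothetical and only illustrate that such a step needs a known label and a prediction for every example.

```python
from collections import Counter


def binominal_performance(labels, predictions, positive):
    """Return accuracy, precision and recall for a two-class problem.

    Hypothetical helper for illustration only; it fails loudly when a
    label or prediction is missing instead of crashing later.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("no examples to evaluate")

    counts = Counter()
    for label, predicted in zip(labels, predictions):
        if label is None or predicted is None:
            raise ValueError("missing label or prediction")
        if predicted == positive:
            counts["tp" if label == positive else "fp"] += 1
        else:
            counts["fn" if label == positive else "tn"] += 1

    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    accuracy = (tp + tn) / len(labels)
    # Precision and recall are undefined when their denominators are zero.
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical true labels and model predictions for four documents.
result = binominal_performance(
    ["POS", "POS", "NEG", "NEG"],
    ["POS", "NEG", "NEG", "NEG"],
    positive="POS",
)
print(result)
```

If either column contains missing values, or the evaluated data does not carry a label at all, a computation like this has nothing valid to count, which is one reason to inspect the ModelApplier output before the performance step.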




The above process gives a
"  [Fatal] NullPointerException occured in 1st application of Performance (Performance) "


I tried replacing the Performance operator with ClassificationPerformance and other operators in that category. But that does not seem to help.


Is there anything else that I need to describe to you?

Regards,
Aditya
Elite

Re: Problem in Performance Operator

Hi,
I would suggest setting a breakpoint after the model application and checking what the results look like. This usually gives you a hint as to where the problem lies.

Greetings,
  Sebastian
Old World Computing - Establishing the Future

Check out the Jackhammer Extension for RapidMiner! Crunch more data easier and with up to 700% speed up! Available only here

Contributor II

Re: Problem in Performance Operator

Hi Sebastian,



Yes, I had done that. The classifier labels were generated. The NullPointerException occurred in the first run of the Performance operator.

Regards,
Aditya
Elite

Re: Problem in Performance Operator

Hi,
without any further information I cannot say anything about this. If you could post the process and replace the data source with a data generation operator, I would be able to track the problem down.

Greetings,
  Sebastian
Contributor II

Re: Problem in Performance Operator

Hi,

Here are the details of the problem. Please help me resolve it.


Errors :

G Apr 5, 2010 1:50:49 PM: [Fatal] ArrayIndexOutOfBoundsException occured in 1st application of BinominalClassificationPerformance (BinominalClassificationPerformance)
G Apr 5, 2010 1:50:49 PM: [Fatal] Process failed: operator cannot be executed (-1). Check the log messages...


The input directories:

D:\Sem 4\R & D\poseng1 contains only one file whose contents are:

By supporting people's unbounded given befitting reply.



D:\Sem 4\R & D\negeng1 contains only one file whose contents are:

Perhaps the audience does not love them as much.



Note: I tried working with the actual dataset (which is much larger in size) too. But I get the same error.



As I see it, the class labels are generated but the numbers cannot be produced.

Please help me with this.

Thanks & Regards,
Aditya



The XML file:

------------------------------------------------
<operator name="Root" class="Process" expanded="yes">
    <operator name="TextInput" class="TextInput" expanded="yes">
        <list key="texts">
          <parameter key="POS" value="D:\Sem 4\R &amp; D\poseng1"/>
          <parameter key="NEG" value="D:\Sem 4\R &amp; D\negeng1"/>
        </list>
        <parameter key="input_word_list" value="C:\Documents and Settings\Aaditya\My Documents\rm_workspace\debrajWL.txt"/>
        <list key="namespaces">
        </list>
        <operator name="StringTokenizer" class="StringTokenizer">
        </operator>
    </operator>
    <operator name="ModelLoader" class="ModelLoader">
        <parameter key="model_file" value="C:\Documents and Settings\Aaditya\My Documents\rm_workspace\debraj.mod"/>
    </operator>
    <operator name="ModelApplier" class="ModelApplier">
        <list key="application_parameters">
        </list>
    </operator>
    <operator name="BinominalClassificationPerformance" class="BinominalClassificationPerformance">
        <parameter key="precision" value="true"/>
        <parameter key="recall" value="true"/>
    </operator>
</operator>


---------------------------
Regular Contributor

Re: Problem in Performance Operator

Hello  adityajo

It seems there is either a bug in the code or (more probably) an invalid input that has not been caught (and there are a lot of possible invalid inputs). Without your files (data and model) we will not be able to track down the problem. If you are not able to share the data (e.g. because it is confidential), then replace the real data with data generated by an operator from Utility->Data Generation. Otherwise we are restricted to guessing, and seriously, we all have simply too much to do to provide this amount of support.

Besides:
1. Is "TextInput" an officially released operator? If so, where can I find it?
2. Do you use RapidMiner 4.6 or RapidMiner 5?

regards,

Steffen
Contributor II

Re: Problem in Performance Operator

(1) Yes, TextInput is an officially released operator. Documentation: http://nemoz.org/joomla/mining/wvtool/javadoc/com/rapidminer/operator/TextInput.html

(2) I use RapidMiner 4.6.
Regular Contributor

Re: Problem in Performance Operator

ok

Here is another idea:
Please write the output of the ModelApplier to a file (e.g. using examplesetwriter or csvwriter), but only the special attributes (i.e. all non-regular fields). This way you won't expose any confidential data.

Upload the data to a place of your choice and post the link here. If this is not possible for any reason, send the file to my email address.

regards,

steffen

edit: please use examplesetwriter, otherwise the metadata is lost.
edit2: what do you mean above by "numbers cannot be produced"? Does that mean that the confidence attribute is all unknown? If so, you do not have to post the data; it would be more interesting to see your model building process (the process that generated the *.mod file).
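
A quick way to answer the "all unknown" question before uploading anything is to count missing values in the special attributes of the exported file. The sketch below is only an illustration under stated assumptions: the export is a comma-separated file with a header row, the column names (`label`, `prediction(label)`, `confidence(POS)`, `confidence(NEG)`) are guesses based on the POS/NEG classes in this thread, and `?` is assumed to mark a missing value. Adjust all of these to match the actual file.

```python
import csv
import io


def count_missing(rows, columns, missing_marker="?"):
    """Count missing values per column in a list of row dictionaries.

    A value counts as missing when the column is absent, empty, or equal
    to the (assumed) missing-value marker.
    """
    missing = {column: 0 for column in columns}
    for row in rows:
        for column in columns:
            value = row.get(column)
            if value is None or value.strip() in ("", missing_marker):
                missing[column] += 1
    return missing


# Hypothetical export of the special attributes for two documents.
# In practice, open the exported file instead of using this inline sample.
sample = io.StringIO(
    "label,prediction(label),confidence(POS),confidence(NEG)\n"
    "POS,POS,?,?\n"
    "NEG,POS,?,?\n"
)
rows = list(csv.DictReader(sample))

special = ["label", "prediction(label)", "confidence(POS)", "confidence(NEG)"]
report = count_missing(rows, special)
print(report)
```

If every confidence value is reported as missing while the predictions are present, that points at the model or the model building process rather than at the performance operator.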