RapidMiner

Newbie Noor

[SOLVED] Expected data of type IOObject

I have created a new filter operator and placed it inside the Process Documents operator, after the Tokenize operator.

When I try to run the process, it fails with the following error:
Process Failed
Wrong data of type of Document was delivered at port 'document'
Expected data of type IOObject

The XML of the process is:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.000">
 <context>
   <input/>
   <output/>
   <macros/>
 </context>
 <operator activated="true" class="process" compatibility="5.3.000" expanded="true" name="Process">
   <process expanded="true" height="-20" width="-50">
     <operator activated="true" class="text:read_document" compatibility="5.3.002" expanded="true" height="60" name="Read Document" width="90" x="112" y="75">
       <parameter key="file" value="D:\\RapidMiner\DataSets-DataMiningforthe Masses\Text.txt"/>
       <parameter key="encoding" value="UTF-8"/>
     </operator>
     <operator activated="true" class="text:process_documents" compatibility="5.3.002" expanded="true" height="94" name="Process Documents" width="90" x="380" y="75">
       <process expanded="true" height="228" width="1049">
         <operator activated="true" class="text:tokenize" compatibility="5.3.002" expanded="true" height="60" name="Tokenize" width="90" x="45" y="30"/>
         <operator activated="true" class="operator:filter_stopword_nlp" compatibility="1.0.000" expanded="true" height="60" name="Filter Stopwords (NLP)" width="90" x="179" y="30"/>
         <connect from_port="document" to_op="Tokenize" to_port="document"/>
         <connect from_op="Tokenize" from_port="document" to_op="Filter Stopwords (NLP)" to_port="document"/>
         <connect from_op="Filter Stopwords (NLP)" from_port="document" to_port="document 1"/>
         <portSpacing port="source_document" spacing="0"/>
         <portSpacing port="sink_document 1" spacing="0"/>
         <portSpacing port="sink_document 2" spacing="0"/>
       </process>
     </operator>
     <connect from_op="Read Document" from_port="output" to_op="Process Documents" to_port="documents 1"/>
     <connect from_op="Process Documents" from_port="example set" to_port="result 1"/>
     <portSpacing port="source_input 1" spacing="0"/>
     <portSpacing port="sink_result 1" spacing="0"/>
     <portSpacing port="sink_result 2" spacing="0"/>
   </process>
 </operator>
</process>


Any idea what could be wrong?
4 REPLIES
RM Staff

Re: Expected data of type IOObject

Hi,

Your operator needs to return the type "com.rapidminer.operator.text.Document". Otherwise the output of your operator is not compatible with the input expected on the right side of the "Process Documents" operator. Make sure not to run into the duplicate class loading trap as described here.
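The duplicate class loading trap can be reproduced without RapidMiner. The following is a minimal sketch in plain Java, not RapidMiner code: `DuplicateClassLoadingDemo` and its helper methods are hypothetical names, and the `Document` class it compiles is an empty stand-in, not `com.rapidminer.operator.text.Document`. It assumes a JDK (not only a JRE) is installed, because it compiles the stand-in class at runtime. The same class file is loaded through two independent class loaders, which is roughly what can happen when the text extension is present both as a jar and as a project on the classpath.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

public class DuplicateClassLoadingDemo {

    /** Compiles an empty stand-in "Document" class into a fresh temp directory. */
    public static Path compileStandIn() throws IOException {
        Path dir = Files.createTempDirectory("dupclass");
        Path src = dir.resolve("Document.java");
        Files.writeString(src, "public class Document {}");
        // getSystemJavaCompiler() returns null when only a JRE is available.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("A JDK is required to run this sketch");
        }
        int status = compiler.run(null, null, null, "-d", dir.toString(), src.toString());
        if (status != 0) {
            throw new IllegalStateException("Compilation of the stand-in class failed");
        }
        return dir;
    }

    /** Loads "Document" through a new loader without an application parent, like a separate plugin loader. */
    public static Class<?> loadIsolated(Path dir) throws Exception {
        URLClassLoader loader = new URLClassLoader(new URL[] {dir.toUri().toURL()}, null);
        return loader.loadClass("Document");
    }

    /** Returns true if an instance of 'delivered' passes a type check against 'expected'. */
    public static boolean accepted(Class<?> expected, Class<?> delivered) throws Exception {
        Object instance = delivered.getDeclaredConstructor().newInstance();
        return expected.isInstance(instance);
    }

    public static void main(String[] args) throws Exception {
        Path dir = compileStandIn();
        Class<?> first = loadIsolated(dir);
        Class<?> second = loadIsolated(dir);
        System.out.println("same name: " + first.getName().equals(second.getName()));
        System.out.println("same class object: " + (first == second));
        System.out.println("cross-loader instance accepted: " + accepted(first, second));
        System.out.println("same-loader instance accepted: " + accepted(first, first));
    }
}
```

Both copies report the same class name, yet the JVM treats them as different types, so an instance created from one copy fails a type check against the other. If the custom operator and the Process Documents operator each see their own copy of the text extension classes, a port check could fail in a similar way even though the class names match, which would be consistent with the error reported above.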

Regards,
Marco
_________________________________________________________
Team Lead Software Engineering | RapidMiner GmbH
Newbie Noor

Re: Expected data of type IOObject

Hello,

I downloaded the source code of TextProcessing_Unuk, added it as a required project, and removed the rmx_text.jar file from the library folder.

Now my operator is successfully integrated into RapidMiner.

Thanks a lot, Sir.

Aniee
Learner I ahtsham58

Re: Expected data of type IOObject

Hi Aniee,

I am encountering the same problem while tokenizing a document.




 I get the same error.
Could you please elaborate on the steps you took to fix it?
RM Staff

Re: Expected data of type IOObject

Hi ahtsham,

I think this might help with using the text extension as a dependency: RapidProM-extension-issue

The process is the same, just replace the RapidProM extension with the text extension.

Cheers
