Information Extraction - TextVisualizer Problem

jcurry1 Member Posts: 24 Contributor II
edited November 2018 in Help
Hi Folks,

I have an issue with the TextVisualizer component that is part of the Information Extraction plugin. When I try to run any of the plugin's sample processes that use the TextVisualizer (downloadable from the link below), I get an error like the one below.

Is anyone familiar with the Information Extraction plugin around to help? I am using the latest RapidMiner, 5.2.008, and the latest Information Extraction plugin from SourceForge. The other Information Extraction components seem to work fine so far.

Dec 31, 2012 4:15:01 PM WARNING: Error creating renderer: java.lang.ClassCastException: com.rapidminer.operator.visualization.TextVisualizer cannot be cast to com.rapidminer.operator.visualization.TextVisualizer

Sample processes were downloaded from:

http://ieplugin4rm.svn.sourceforge.net/viewvc/ieplugin4rm/informationextractionplugin_Vega/trunk/informationExtraction_Vega/samples/


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.008">
 <context>
   <input/>
   <output/>
   <macros/>
 </context>
 <operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
   <process expanded="true" height="540" width="815">
     <operator activated="true" class="text:read_document" compatibility="5.2.004" expanded="true" height="60" name="Read Document" width="90" x="45" y="30">
       <parameter key="file" value="C:\Documents and Settings\jcurry\My Documents\A_AAAAAA\samples\toyText.txt"/>
     </operator>
     <operator activated="true" class="text:documents_to_data" compatibility="5.2.004" expanded="true" height="76" name="Documents to Data" width="90" x="179" y="30">
       <parameter key="text_attribute" value="text"/>
     </operator>
     <operator activated="true" class="informationExtraction:sentence_tokenizer" compatibility="1.0.000" expanded="true" height="76" name="SentenceTokenizer" width="90" x="45" y="165">
       <parameter key="attribute" value="text"/>
       <parameter key="new token-name" value="sentence"/>
     </operator>
     <operator activated="true" class="informationExtraction:word_tokenizer" compatibility="1.0.000" expanded="true" height="76" name="WordTokenizer" width="90" x="179" y="165">
       <parameter key="attribute" value="sentence"/>
       <parameter key="new token-name" value="word"/>
     </operator>
     <operator activated="true" class="informationExtraction:text_visualizer" compatibility="1.0.000" expanded="true" height="76" name="TextVisualizer" width="90" x="313" y="165">
       <parameter key="text-attribute" value="word"/>
       <parameter key="label-attribute" value="sentence"/>
     </operator>
     <operator activated="true" class="select_attributes" compatibility="5.2.008" expanded="true" height="76" name="Select Attributes" width="90" x="45" y="300">
       <parameter key="attribute_filter_type" value="subset"/>
       <parameter key="attributes" value="batch|word"/>
       <parameter key="include_special_attributes" value="true"/>
     </operator>
     <operator activated="true" class="store" compatibility="5.2.008" expanded="true" height="60" name="Store" width="90" x="179" y="300">
       <parameter key="repository_entry" value="./ToyTextTokenized"/>
     </operator>
     <connect from_op="Read Document" from_port="output" to_op="Documents to Data" to_port="documents 1"/>
     <connect from_op="Documents to Data" from_port="example set" to_op="SentenceTokenizer" to_port="example set input"/>
     <connect from_op="SentenceTokenizer" from_port="example set output" to_op="WordTokenizer" to_port="example set input"/>
     <connect from_op="WordTokenizer" from_port="example set output" to_op="TextVisualizer" to_port="example set input"/>
     <connect from_op="TextVisualizer" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
     <connect from_op="TextVisualizer" from_port="text visualizer port" to_port="result 1"/>
     <connect from_op="Select Attributes" from_port="example set output" to_op="Store" to_port="input"/>
     <connect from_op="Store" from_port="through" to_port="result 2"/>
     <portSpacing port="source_input 1" spacing="0"/>
     <portSpacing port="sink_result 1" spacing="54"/>
     <portSpacing port="sink_result 2" spacing="18"/>
     <portSpacing port="sink_result 3" spacing="0"/>
   </process>
 </operator>
</process>
Regards,
John.



Answers

  • Asunto Member Posts: 1 Contributor I
    Hello All,

    I have the same problem, unfortunately.
    I am using the same examples.

    Jan 6, 2013 8:57:21 PM INFO: Process //VoorbeeldenInformationExtractionPlugin/1_Read-Tokenize-Visualize starts
    Jan 6, 2013 8:57:21 PM INFO: Loading initial data.
    Jan 6, 2013 8:57:22 PM INFO: Saving results.
    Jan 6, 2013 8:57:22 PM INFO: Process //VoorbeeldenInformationExtractionPlugin/1_Read-Tokenize-Visualize finished successfully after 0 s
    Jan 6, 2013 8:57:22 PM WARNING: Error creating renderer: java.lang.ClassCastException: com.rapidminer.operator.visualization.TextVisualizer cannot be cast to com.rapidminer.operator.visualization.TextVisualizer

    Can anybody help? I am very interested in getting the plugin to work correctly.

    Regards,
    Kees.
  • jcurry1 Member Posts: 24 Contributor II
    I am waiting to hear back from Mr. Jungerman, the creator of the component, once he finds some time to look at the issue. In the meantime, he mentioned that it may work with an older version of RapidMiner, one built from the Vega build. So perhaps it is worth going back. I am not sure which RapidMiner release matches up with that; maybe one of the 4.x releases available below. I will try that.

    If anyone has success with a specific older version, please update this thread with the version number.
     
    http://sourceforge.net/projects/rapidminer/files/1.%20RapidMiner/
  • aborg Member Posts: 66 Contributor II
    Vega was 5.0.
  • fnl Member Posts: 4 Contributor I
    Here is how to set up the Information Extraction plugin for RapidMiner correctly and avoid the problems described above, including the reported ClassCastException (CCE).
    Note that the IE plugin on the marketplace is outdated and should not be used; instead, you need to install the plugin manually.
    I can confirm this procedure works for RM 6.2, and I suppose it should also work for the 5.x series, given that the plugin has not been updated since.

    1. Install the IE plugin jar; download it from:

    http://sourceforge.net/projects/ieplugin4rm/

    Then, copy the jar to $RAPIDMINER_HOME/lib/plugins
    where $RAPIDMINER_HOME is the base directory where RapidMiner itself is installed.

    2. Download the examples with

    svn checkout https://svn.code.sf.net/p/ieplugin4rm/code/informationextractionplugin_Vega/trunk/informationExtraction_Vega/samples/

    Then, install them by placing them into a repository of yours ("MyRepo"):

    cp -r samples ~/.RapidMiner/repositories/MyRepo/IE_samples

    Note that you first have to create your own repository within RapidMiner; "MyRepo" does not exist by default!
    Open your repository tree (here "MyRepo") in RapidMiner, and you are done!

    You can load and run the examples. Note that you might need to fix/set the paths of the text "databases", but this should be trivial.
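
A ClassCastException of the form "X cannot be cast to X" usually means the JVM has loaded the same class through two different class loaders, for example when two copies of a plugin jar are visible at once. Whether that is what happens here is an assumption; the thread does not confirm it. As a rough way to check, the Python sketch below lists every jar in a set of directories that bundles the class named in the error. The helper name `jars_containing` is hypothetical, and the directories to scan are whatever you pass in, such as the lib/plugins folder from step 1.

```python
import sys
import zipfile
from pathlib import Path

# Class named in the ClassCastException reported in this thread.
CLASS_ENTRY = "com/rapidminer/operator/visualization/TextVisualizer.class"


def jars_containing(class_entry, directories):
    """Return every .jar under the given directories that bundles class_entry.

    Hypothetical helper for illustration; it is not part of RapidMiner.
    """
    hits = []
    for directory in directories:
        root = Path(directory).expanduser()
        if not root.is_dir():
            continue  # skip locations that do not exist on this machine
        for jar in sorted(root.rglob("*.jar")):
            try:
                with zipfile.ZipFile(jar) as archive:
                    if class_entry in archive.namelist():
                        hits.append(jar)
            except zipfile.BadZipFile:
                continue  # not a readable jar; ignore it
    return hits


if __name__ == "__main__":
    # Directories to scan are given on the command line (an assumption:
    # you know where your RapidMiner installation keeps its plugin jars).
    for jar in jars_containing(CLASS_ENTRY, sys.argv[1:]):
        print(jar)
```

If the script prints more than one jar, removing all but the manually installed copy is a reasonable first thing to try.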
  • ZeiniNahed Member Posts: 1 Contributor I

    I am trying to use the Information Extraction plugin with RapidMiner 8, but I get no result. I have also searched for how to extract sentences that contain certain words, but found nothing. Is there any way to do that?

    This is the only reference I have found: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.645.7232&rep=rep1&type=pdf

    I need help.

    Thanks in advance.

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Hello @ZeiniNahed, that extension has not been supported in a long time (see here). I would strongly recommend using our very popular Text Processing extension instead.

     

    Scott

     

     
