
[SOLVED] Problem with "Extract Information" operator turning numerical into polyn

Johan (Member, Posts: 2, Contributor I)
I have a problem with the "Extract Information" operator. I'm parsing an HTML page to fetch a numerical value using a regular expression. Looking at the output, I can see that it has fetched the correct numbers. However, the type of the attribute has changed from numerical to polynominal. Does anybody have a suggestion for solving this problem? I have pasted the XML of my operator below so you can see whether I'm messing something up in the process.
<operator activated="true" class="text:extract_information" compatibility="5.1.003" expanded="true" height="60" name="Extract Comments" width="90" x="112" y="120">
  <parameter key="query_type" value="Regular Expression"/>
  <list key="string_machting_queries">
    <parameter key="Comments" value="&lt;a href=&quot;#article-comments&quot; class=&quot;comments&quot; rel=&quot;nofollow&quot;&gt;. kommentarer&lt;/a&gt;"/>
  </list>
  <parameter key="attribute_type" value="Numerical"/>
  <list key="regular_expression_queries">
    <parameter key="Comments" value=".*class=&quot;comments&quot;.*&gt;([0-9]+).*kommentarer\&lt;\/a&gt;"/>
  </list>
  <list key="regular_region_queries"/>
  <list key="xpath_queries"/>
  <list key="namespaces"/>
  <list key="index_queries"/>
</operator>

Answers

  • MariusHelf (RapidMiner Certified Expert, Member, Posts: 1,869, Unicorn)
    Hi Johan,

    it seems you have found a bug. You can work around it with the Parse Numbers operator.

    Kind regards,
    Marius
  • Johan (Member, Posts: 2, Contributor I)
    Thank you. I suspected as much. I will try the workaround.
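
For anyone landing here with the same problem, the workaround could look roughly like the operator below, placed directly after "Extract Comments" in the process. This is only a sketch and was not posted in the thread: the class name parse_numbers and the parameter keys are written from memory of RapidMiner 5.x and should be treated as assumptions, so check them against the XML your own installation generates when you drag in Parse Numbers. The attribute name "Comments" is taken from the query key in the operator above.

```xml
<!-- Sketch only: class name and parameter keys are assumptions, verify in your RapidMiner version -->
<operator activated="true" class="parse_numbers" compatibility="5.1.003" expanded="true" name="Parse Numbers">
  <!-- Convert only the attribute created by "Extract Comments" -->
  <parameter key="attribute_filter_type" value="single"/>
  <parameter key="attribute" value="Comments"/>
  <!-- The regex captures plain digits ([0-9]+), so no digit grouping is expected -->
  <parameter key="decimal_character" value="."/>
  <parameter key="grouped_digits" value="false"/>
</operator>
```

Restricting the conversion to the single attribute leaves any other nominal attributes in the example set untouched.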