
Error on regexp

sennierer Member Posts: 1 Contributor I
edited November 2018 in Help
I have a problem with the Extract Information operator. I have a crawler that loads web pages, and I then use the Process Documents from Data operator to process them. Inside it I use Keep Document Parts followed by Extract Information to get a number, but the regexp of the Extract Information operator always fails with "Process Failed. No group 1".
These are the two operators:
<operator activated="true" class="text:keep_document_parts" compatibility="5.1.003" expanded="true" height="60" name="Keep Document Parts" width="90" x="84" y="30">
  <parameter key="extraction_regex" value="von\s+\d+\s+&lt;span\sclass=&quot;text1&quot;&gt;"/>
</operator>
<operator activated="true" class="text:extract_information" compatibility="5.1.003" expanded="true" height="60" name="Extract Information" width="90" x="447" y="75">
  <parameter key="query_type" value="Regular Expression"/>
  <list key="string_machting_queries"/>
  <parameter key="attribute_type" value="Numerical"/>
  <list key="regular_expression_queries">
    <parameter key="treffer" value="\s\d+\s"/>
  </list>
  <list key="regular_region_queries"/>
  <list key="xpath_queries"/>
  <list key="namespaces"/>
  <list key="index_queries"/>
</operator>


The text that reaches the second operator looks something like: "von        17 <span class="text1"> "

I tested the regexp on its own and it matches, so it should normally work.
I would be thankful for any help!
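
A note on the error text: "No group 1" is the wording Java's regex matcher uses when group 1 is requested from a pattern that defines no capturing group. The sketch below is only an illustration in Python, not RapidMiner code. It assumes (this is not confirmed in this thread) that the operator reads group 1 of the match. The parenthesised variant and the helper `first_group` are hypothetical names introduced for this example.

```python
import re

# Sample text as quoted in the post (what reaches the second operator).
text = 'von        17 <span class="text1"> '

# Pattern from the post: it matches, but defines no capturing group.
no_group = re.compile(r"\s\d+\s")

# Hypothetical variant: parentheses turn the digits into group 1.
with_group = re.compile(r"\s(\d+)\s")


def first_group(pattern, s):
    """Return group 1 of the first match, or None if there is
    no match or the pattern defines no capturing group."""
    m = pattern.search(s)
    if m is None or pattern.groups < 1:
        return None
    return m.group(1)


# Number of capturing groups, then the value of group 1 (if any).
print(no_group.groups, first_group(no_group, text))
print(with_group.groups, first_group(with_group, text))
```

If that assumption holds, changing the value of the "treffer" query from `\s\d+\s` to `\s(\d+)\s` might be worth trying.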