"READ XML operator: how to handle repeating tags?"

Fran_ois-Paul_S Member Posts: 2 Contributor I
edited June 2019 in Help

Consider the following XML file, where each "example" ("RECORD" tag) may contain zero, one, or more "KEYWORD" sub-tags:

<?xml version="1.0" encoding="UTF-8"?>

    <TEXT>blah blah etc</TEXT>
    <KEYWORD>This is kw2</KEYWORD>
    <TEXT>other blah</TEXT>

How can I handle the repeating "KEYWORD" tag? With the following parameters:

<parameter key="xpath_for_examples" value="//RECORD"/>
<parameter key="xpath_for_attribute" value="KEYWORD/text()"/>

For the keyword attribute of the first record, I get "kw1This is kw2", i.e. the keywords are concatenated without any separator.
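
To make the symptom concrete, here is a minimal Python sketch (outside RapidMiner) that reproduces the same concatenation. The `DATA` root tag, the `ID` value and the exact record layout are assumptions made up for illustration; only the tag names RECORD, ID, TEXT and KEYWORD come from the file above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the DATA root and the ID value are assumptions.
SAMPLE = """<DATA>
  <RECORD>
    <ID>1</ID>
    <TEXT>blah blah etc</TEXT>
    <KEYWORD>kw1</KEYWORD>
    <KEYWORD>This is kw2</KEYWORD>
  </RECORD>
</DATA>"""


def concatenated_keywords(record):
    """Glue the text of every KEYWORD child together with no separator."""
    return "".join(kw.text or "" for kw in record.findall("KEYWORD"))


first_record = ET.fromstring(SAMPLE).find("RECORD")
print(concatenated_keywords(first_record))  # prints: kw1This is kw2
```

The boundary between "kw1" and "This is kw2" is lost, so the value cannot be split back into individual keywords afterwards.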

I tried to add a separator using the XPath 2.0 expression:
string-join(KEYWORD, ";")
but this doesn't seem to be supported.

Is it possible to get a properly delimited "keywords" attribute (one that I could later handle with text processing tools, or with the "split" operator)?
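
For reference, the result I am after can be sketched in Python as follows. This is only an illustration of the desired output, not something the Read XML operator does; the `DATA` root tag, the `ID` values and the `read_records` helper are hypothetical names introduced for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the DATA root and the ID values are assumptions.
SAMPLE = """<DATA>
  <RECORD>
    <ID>1</ID>
    <TEXT>blah blah etc</TEXT>
    <KEYWORD>kw1</KEYWORD>
    <KEYWORD>This is kw2</KEYWORD>
  </RECORD>
  <RECORD>
    <ID>2</ID>
    <TEXT>other blah</TEXT>
  </RECORD>
</DATA>"""


def read_records(xml_text, separator=";"):
    """Return one dict per RECORD, with all KEYWORD values joined by `separator`."""
    rows = []
    for record in ET.fromstring(xml_text).iter("RECORD"):
        # A record may hold zero, one, or many KEYWORD children.
        keywords = [kw.text or "" for kw in record.findall("KEYWORD")]
        rows.append({
            "ID": record.findtext("ID", default=""),
            "TEXT": record.findtext("TEXT", default=""),
            "keywords": separator.join(keywords),
        })
    return rows


rows = read_records(SAMPLE)
for row in rows:
    print(row)
```

With a separator in place, the first record's "keywords" value can be split back into its individual keywords, and a record without any KEYWORD tag simply gets an empty string.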



PS Here's the complete process:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.013">
  <operator activated="true" class="process" compatibility="5.3.013" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="read_xml" compatibility="5.3.013" expanded="true" height="60" name="Read XML" width="90" x="45" y="30">
        <parameter key="file" value="/Users/fps/_fps/Data/XMLTest.xml"/>
        <parameter key="xpath_for_examples" value="//RECORD"/>
        <enumeration key="xpaths_for_attributes">
          <parameter key="xpath_for_attribute" value="ID/text()"/>
          <parameter key="xpath_for_attribute" value="TEXT/text()"/>
          <parameter key="xpath_for_attribute" value="KEYWORD/text()"/>
        </enumeration>
        <list key="namespaces"/>
        <list key="annotations"/>
        <list key="data_set_meta_data_information"/>
      </operator>
      <connect from_op="Read XML" from_port="output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
