Create XML Files From Master XML

neodjandre Member Posts: 1 Contributor I
edited November 2018 in Help
Hey all, newbie alert! Just came across RapidMiner today and I must say it's an excellent tool.

I have a big XML file which contains 30,000 records. Each record has a <city> </city> element.

I would like to extract all records with the city name <city>London</city> and create a separate XML file for this.

Any ideas would be much appreciated!

    Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,996 RM Engineering

    You can use the process below. Just make sure to change the file paths in the Read XML and Write Document operators to match your file system.

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.3.007">
      <operator activated="true" class="process" compatibility="5.3.007" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="read_xml" compatibility="5.3.007" expanded="true" height="60" name="Read XML" width="90" x="45" y="30">
            <parameter key="file" value="C:\Users\username\Desktop\test.xml"/>
            <parameter key="xpath_for_examples" value="//city"/>
            <enumeration key="xpaths_for_attributes">
              <parameter key="xpath_for_attribute" value="node()"/>
            </enumeration>
            <list key="namespaces"/>
            <parameter key="use_default_namespace" value="false"/>
            <list key="annotations"/>
            <list key="data_set_meta_data_information">
              <parameter key="0" value="att1.true.binominal.attribute"/>
            </list>
          </operator>
          <operator activated="true" class="set_macro" compatibility="5.3.007" expanded="true" height="76" name="Set Macro" width="90" x="179" y="30">
            <parameter key="macro" value="i"/>
            <parameter key="value" value="0"/>
          </operator>
          <operator activated="true" class="loop_values" compatibility="5.3.007" expanded="true" height="76" name="Loop Values" width="90" x="313" y="30">
            <parameter key="attribute" value="att1"/>
            <process expanded="true">
              <operator activated="true" class="text:create_document" compatibility="5.3.001" expanded="true" height="60" name="Create Document" width="90" x="45" y="30">
                <parameter key="text" value="%{loop_value}"/>
              </operator>
              <operator activated="true" class="text:write_document" compatibility="5.3.001" expanded="true" height="76" name="Write Document" width="90" x="179" y="30">
                <parameter key="file" value="C:\Users\username\Desktop\output%{i}.xml"/>
              </operator>
              <operator activated="true" class="set_macro" compatibility="5.3.007" expanded="true" height="76" name="Set Macro (2)" width="90" x="313" y="30">
                <parameter key="macro" value="i"/>
                <parameter key="value" value="%{a}"/>
              </operator>
              <connect from_op="Create Document" from_port="output" to_op="Write Document" to_port="document"/>
              <connect from_op="Write Document" from_port="document" to_op="Set Macro (2)" to_port="through 1"/>
              <connect from_op="Set Macro (2)" from_port="through 1" to_port="out 1"/>
              <portSpacing port="source_example set" spacing="0"/>
              <portSpacing port="sink_out 1" spacing="0"/>
              <portSpacing port="sink_out 2" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Read XML" from_port="output" to_op="Set Macro" to_port="through 1"/>
          <connect from_op="Set Macro" from_port="through 1" to_op="Loop Values" to_port="example set"/>
          <connect from_op="Loop Values" from_port="out 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
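
For comparison, the filtering step from the original question (keep only the records whose city is London and write them to a separate XML file) can also be sketched outside RapidMiner with Python's standard library. This is only a rough sketch, not part of the process above. It assumes every record is a direct child of the root element and has a <city> element somewhere inside it; the function names and the file paths in the usage note are made up for illustration.

```python
import xml.etree.ElementTree as ET


def filter_records(root: ET.Element, city: str) -> ET.Element:
    """Build a new root that holds only the records whose <city> text equals `city`.

    Assumption: each record is a direct child of `root` and contains a
    <city> element somewhere beneath it. Surrounding whitespace in the
    city text is ignored.
    """
    # Reuse the original root's tag and attributes for the output document.
    filtered = ET.Element(root.tag, root.attrib)
    for record in root:
        city_el = record.find(".//city")
        if city_el is not None and (city_el.text or "").strip() == city:
            filtered.append(record)
    return filtered


def extract_city_file(src_path: str, dst_path: str, city: str = "London") -> int:
    """Read `src_path`, write the matching records to `dst_path`, return their count."""
    tree = ET.parse(src_path)
    filtered = filter_records(tree.getroot(), city)
    ET.ElementTree(filtered).write(dst_path, encoding="UTF-8", xml_declaration=True)
    return len(filtered)
```

Calling something like `extract_city_file("test.xml", "london.xml")` would then write one output file containing only the London records. If the records sit deeper in the tree or use namespaces, the loop and the `.//city` lookup would need adjusting.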