Extract data from XML files
The XML structure looks roughly like this:
<p>the main technologies 1...</p>
<p>the main technologies 2...</p>
<p>the main technologies 3...</p>
<p>the main technologies 4...</p>
<p>the main technologies 5...</p>
The XML files differ in the part between <art-body> and </art-body>. Some files have four <section> elements, some have five, and so on, and the number of <p> elements inside a <section> can also vary. In addition, some files have no <subsect> elements at all, only multiple <section> elements.
I want to extract the contents of <art-front> and <art-body>, but not the contents of <art-back>.
I know that the Read XML operator can extract content from an XML file, and that the Read Document operator can do it as well. Because my XML files do not all share the same structure, I don't know how to handle them. Is there any way to do that?
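
To make the goal concrete, here is a minimal sketch of the extraction in Python rather than as a RapidMiner process. It collects every <p> under <art-front> and <art-body> at any depth, so the number of <section> elements and the presence or absence of <subsect> does not matter, and <art-back> is never visited. The root tag name `article`, the sample text, and the function name `extract_paragraphs` are assumptions for illustration only; they do not come from the real files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. The <article> root and the exact nesting are
# assumptions; only the tag names art-front, art-body, art-back, section,
# subsect and p come from the question.
SAMPLE = """<article>
  <art-front><p>title and abstract</p></art-front>
  <art-body>
    <section>
      <p>the main technologies 1...</p>
      <subsect><p>the main technologies 2...</p></subsect>
    </section>
    <section><p>the main technologies 3...</p></section>
  </art-body>
  <art-back><p>references</p></art-back>
</article>"""


def extract_paragraphs(xml_string, wanted=("art-front", "art-body")):
    """Return {tag: [paragraph text, ...]} for each wanted top-level part."""
    root = ET.fromstring(xml_string)
    result = {}
    for tag in wanted:
        # First element with this tag anywhere in the tree, or None if absent.
        node = next(root.iter(tag), None)
        if node is None:
            result[tag] = []
            continue
        # iter("p") walks all descendants in document order, so paragraphs
        # inside <section> and <subsect> are found without naming either.
        result[tag] = [
            " ".join("".join(p.itertext()).split()) for p in node.iter("p")
        ]
    return result


if __name__ == "__main__":
    for part, paragraphs in extract_paragraphs(SAMPLE).items():
        print(part, paragraphs)
```

The same idea in XPath terms is a descendant selector such as `//art-body//p`, which matches <p> elements at any depth below <art-body> instead of relying on a fixed path through <section> and <subsect>.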