Article categorization from web
My name is Andrea, and I'm trying to get all the articles on the repubblica.it home page, which I then have to categorize.
For the first part (accessing the articles), I thought it would be useful to read the content of the <p> tags of the page (www.repubblica.it).
I chose the Crawl Web and Enrich Data by Webservice operators (the latter to reach the meaningful content via XPath). I set up the Enrich operator with an XPath query (attribute name = Article, query expression = //h:p), but the output is a file containing the entire page rather than the portion I need, as if the XPath query had no effect. Did I choose the wrong operators, or is something else wrong?
Can someone help me, please?
If possible, I'd like to post the XML code of the project here. May I?
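
To make clear what I expect the query //h:p to select, here is a minimal sketch outside RapidMiner, written in Python with only the standard library. It assumes the h prefix is bound to the XHTML namespace; SAMPLE_XHTML and extract_paragraphs are hypothetical names made up for this example, and the real repubblica.it markup will certainly differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a fetched page (assumption: well-formed XHTML).
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <h1>Headline</h1>
    <p>First article teaser.</p>
    <div><p>Second article teaser.</p></div>
  </body>
</html>"""

# The prefix "h" is only a label; what matters is the namespace URI it maps to.
NAMESPACES = {"h": "http://www.w3.org/1999/xhtml"}


def extract_paragraphs(xhtml_text):
    """Return the text of every <p> element in the XHTML namespace."""
    root = ET.fromstring(xhtml_text)
    # ".//h:p" is ElementTree's spelling of the XPath query "//h:p".
    matches = root.findall(".//h:p", NAMESPACES)
    return ["".join(p.itertext()).strip() for p in matches]


paragraphs = extract_paragraphs(SAMPLE_XHTML)
print(paragraphs)

# The same markup without the namespace declaration: the prefixed query
# no longer matches, because the <p> elements are in no namespace.
PLAIN_HTML = SAMPLE_XHTML.replace(' xmlns="http://www.w3.org/1999/xhtml"', "")
print(extract_paragraphs(PLAIN_HTML))
```

With the namespace declared, the first call returns only the two paragraph texts; without it, the same query returns an empty list. That is the kind of result I would like to get for the Article attribute, instead of the whole page.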