Very easy problem...but I'm a newbie...

leon86it Member Posts: 2 Contributor I
edited November 2018 in Help
Hi everyone, and thanks a lot for the help you have already given me.

I'm stuck in a specific situation:

I have a list of URLs from a news website (www.parolibero.it). These are the URLs of all the news articles I'm interested in. I would now like to extract the text (title plus article body) from each URL and store it locally as text files.

I have read on the forum that I can use the Loop Examples operator with the inner operators "Extract Macro" and "Get Page". The problem is that the Get Page operator doesn't take anything as input except a URL...
How can I set that URL dynamically by reading it from the list? (If there's another operator that does this, that's fine too. Crawl Web has the same problem, and Get Pages only returns page information, not the text.)
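
To make the goal concrete, here is roughly what I am after, written as a small Python sketch outside RapidMiner. It is only an illustration under assumptions: the names `TitleAndParagraphs`, `extract_text` and `save_articles` are made up for this post, and I am assuming the article text sits inside `<p>` tags, which I have not verified for this site.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.request import urlopen


class TitleAndParagraphs(HTMLParser):
    """Collects the <title> text and the text found inside <p> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self._in_title = False
        self._in_p = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "p":
            self._in_p = True
            self.paragraphs.append("")  # start a new paragraph

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "p":
            self._in_p = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_p:
            self.paragraphs[-1] += data


def extract_text(html):
    """Return (title, body) for one HTML page (assumes body text is in <p>)."""
    parser = TitleAndParagraphs()
    parser.feed(html)
    parser.close()
    body = "\n\n".join(p.strip() for p in parser.paragraphs if p.strip())
    return parser.title.strip(), body


def save_articles(urls, out_dir="articles"):
    """Loop over the URL list, fetch each page, and write one text file per URL."""
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    for i, url in enumerate(urls):
        with urlopen(url, timeout=30) as resp:
            charset = resp.headers.get_content_charset() or "utf-8"
            html = resp.read().decode(charset, errors="replace")
        title, body = extract_text(html)
        (out / f"article_{i:04d}.txt").write_text(
            f"{title}\n\n{body}\n", encoding="utf-8"
        )
```

The loop in `save_articles` is the part I cannot reproduce in RapidMiner: the URL changes on every iteration, taken from the list.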

Can someone help me in detail?

Roughly, I would like to create something that updates itself each time a new article is crawled, extracting the text from that same (updated) list.
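
By "updates itself" I mean something like the following Python sketch, again outside RapidMiner and only as an illustration. The helper names `filename_for` and `pending_urls` are hypothetical, and naming each file after a hash of its URL is just one possible way to recognise articles that were already saved.

```python
import hashlib
from pathlib import Path


def filename_for(url):
    """Derive a stable file name from the URL (assumed naming scheme)."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()[:16] + ".txt"


def pending_urls(urls, out_dir):
    """Return the URLs from the list that have no saved text file yet."""
    out = Path(out_dir)
    seen = set()
    pending = []
    for url in urls:
        name = filename_for(url)
        # Skip duplicates in the list and articles saved on an earlier run.
        if name in seen or (out / name).exists():
            continue
        seen.add(name)
        pending.append(url)
    return pending
```

On each run, only the URLs returned by `pending_urls` would need to be fetched, so the process can be repeated whenever the list grows.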

Thanks a lot, and sorry for the bother. I just hope this will be helpful to somebody in the future... :'(