Get Pages

rfeigel Member Posts: 18 Contributor II
edited November 2018 in Help
Is there a way to restrict Get Pages (or some other process) to retrieving just the first page from a website? I'm trying to compare the content of different pages from the same website. When Get Pages loads each of them, it drills through all of the links, which skews the word count.

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi,

    Are you sure that you are using Get Pages and not Crawl Web? Get Pages loads only the exact link that you provide in the link attribute of the input data.

    Best regards,
    Marius
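
For comparison outside RapidMiner, here is a minimal Python sketch of what "load only the exact link" amounts to: each URL is downloaded once, links inside the page are never followed, and the words of the visible text are counted. The names `TextExtractor`, `count_words`, and `fetch_single_page` are made up for this illustration and are not part of RapidMiner; the tag stripping is a rough approximation, not the tokenization RapidMiner uses.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def count_words(html):
    """Return the number of whitespace-separated words in the page text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.parts).split())


def fetch_single_page(url, timeout=10):
    """Download exactly one URL; links found in the page are not followed."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Hypothetical usage (the URLs would come from your own list of pages):
# for url in urls:
#     print(url, count_words(fetch_single_page(url)))
```

Because the link text of a page is still part of that page, it is counted, but the pages behind those links are not loaded, so they cannot inflate the count.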