"Crawler Proxy"
We are trying to data-mine some websites. The internal crawler doesn't seem to have any proxy settings and, of course, the auto-update doesn't work either. Is there a proxy setting hidden somewhere that I can set?
I tried HTTrack for crawling, but the latest version seems to be broken: exclusions simply don't work. (This is confirmed on the HTTrack forums.)
For now I have fetched some data manually to test the RapidMiner software, but maybe you know a solution, or another crawler I could try?
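To illustrate what I am after, here is a minimal sketch of fetching pages through a proxy while skipping excluded URLs, assuming Python's standard library. It is not taken from RapidMiner or HTTrack; the proxy address, the exclusion patterns, and the function names are hypothetical placeholders.

```python
import fnmatch
import urllib.request

# Hypothetical placeholders: replace with your own proxy and patterns.
PROXY_URL = "http://proxy.example.com:8080"
EXCLUDE_PATTERNS = ["*/login*", "*.pdf"]


def is_excluded(url, patterns=EXCLUDE_PATTERNS):
    """Return True if the URL matches any shell-style exclusion pattern."""
    return any(fnmatch.fnmatchcase(url, p) for p in patterns)


def make_opener(proxy_url=PROXY_URL):
    """Build an opener that sends HTTP and HTTPS requests through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch(url, opener):
    """Fetch one page through the proxy, or return None if it is excluded."""
    if is_excluded(url):
        return None
    with opener.open(url, timeout=30) as response:
        return response.read()
```

Something like `fetch("http://example.com/index.html", make_opener())` would then go through the proxy, while any URL matching an exclusion pattern is skipped without a request being made.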
Thanks in advance