Web Crawl Memory Management
I recently tried to crawl a fairly large site with the Crawl Web operator. Even though I gave my RapidMiner process 1024 MB of heap memory, it eventually crashed with an out-of-memory exception. I think what I need to do is store the URLs still to be followed in a database or in a file rather than holding them all in memory. Does the Process Documents from Web operator support that? It seems to have pretty much the same link-following capability as the Crawl Web operator. Can someone confirm?
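
To make it clearer what I mean by keeping the URLs on disk, here is a rough sketch in Python. It is not RapidMiner code and does not use any RapidMiner API; the class name DiskFrontier, its methods, and the table layout are all made up for illustration. It only assumes the standard sqlite3 module, and it shows the general idea: pending URLs live in a database file, duplicates are rejected by a UNIQUE constraint, and only one URL at a time is pulled into memory.

```python
import sqlite3


class DiskFrontier:
    """Hypothetical disk-backed queue of URLs to crawl (not a RapidMiner API)."""

    def __init__(self, path):
        # 'path' is a SQLite database file; the queue lives there, not on the heap.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS frontier ("
            " id INTEGER PRIMARY KEY AUTOINCREMENT,"
            " url TEXT UNIQUE NOT NULL,"
            " visited INTEGER NOT NULL DEFAULT 0)"
        )
        self.db.commit()

    def add(self, url):
        """Queue a URL; return True if it was new, False if already known."""
        # INSERT OR IGNORE lets the UNIQUE constraint drop duplicates silently.
        cur = self.db.execute(
            "INSERT OR IGNORE INTO frontier (url) VALUES (?)", (url,)
        )
        self.db.commit()
        return cur.rowcount == 1

    def next_url(self):
        """Return the oldest unvisited URL and mark it visited, or None if empty."""
        row = self.db.execute(
            "SELECT id, url FROM frontier WHERE visited = 0 ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        self.db.execute("UPDATE frontier SET visited = 1 WHERE id = ?", (row[0],))
        self.db.commit()
        return row[1]

    def pending(self):
        """Count the URLs that are still waiting to be crawled."""
        return self.db.execute(
            "SELECT COUNT(*) FROM frontier WHERE visited = 0"
        ).fetchone()[0]
```

A crawl loop would then call next_url() until it returns None, fetch each page, and call add() for every link found on it. Memory use stays roughly flat however many URLs are queued, at the cost of a disk write per URL.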