"Extract information from web logs (how to handle 31 text files totaling 3 GB)"
I want to extract the IP and user-agent information from 31 zipped log files (around 320 MB zipped).
My steps are as follows:
1) Unzip to about 3 GB of text (it seems RapidMiner cannot read the zipped files directly?)
2) Use the read server log process (it works fine for only a few files).
It seems that the process reads all files into RAM, and the 3 GB of text cannot be handled that way.
3) Process: store to repository
4) Process: aggregate
5) Process: export to CSV
Can anyone give me some tips, please? ;D
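
For comparison, here is a minimal sketch of the same steps (read, aggregate, export to CSV) done outside RapidMiner as a streaming pass in Python, so that no file is ever loaded into RAM as a whole. It rests on several assumptions that are not confirmed above: the files are gzip-compressed (.gz) rather than .zip archives, the lines follow the Apache combined log format, and the names logs/, ip_agent_counts.csv, count_ip_agent and write_csv are made up for illustration.

```python
import csv
import gzip
import re
from collections import Counter
from pathlib import Path

# Assumption: Apache/NCSA combined log format, e.g.
# 1.2.3.4 - - [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 2326 "-" "Mozilla/5.0"
# Group 1 is the client IP, group 2 is the user agent.
LINE_RE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \S+ \S+ "[^"]*" "([^"]*)"'
)


def count_ip_agent(paths):
    """Stream each gzip file line by line and count (ip, agent) pairs."""
    counts = Counter()
    for path in paths:
        # gzip.open in text mode decompresses on the fly, so the files
        # do not need to be unzipped to disk first.
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
            for line in fh:
                match = LINE_RE.match(line)
                if match:  # lines that do not fit the pattern are skipped
                    counts[(match.group(1), match.group(2))] += 1
    return counts


def write_csv(counts, out_path):
    """Write the aggregated counts to CSV, most frequent pairs first."""
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["ip", "agent", "hits"])
        for (ip, agent), hits in counts.most_common():
            writer.writerow([ip, agent, hits])


if __name__ == "__main__":
    # Hypothetical input folder and output file name.
    files = sorted(Path("logs").glob("*.gz"))
    write_csv(count_ip_agent(files), "ip_agent_counts.csv")
```

Memory use here grows with the number of distinct (IP, agent) pairs rather than with the size of the logs. The resulting CSV is much smaller than the raw logs, so it could then be imported for further processing if needed.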