Get Pages operator - possible enhancements
I am not sure if this is the right place to post this, but we have encountered two minor issues with the Get Page and Get Pages operators that are part of the Web extension.
1) When the remote web server declares an invalid encoding (such as "uft-16") or an empty one, the operators throw an exception.
Some sample URLs are: www.ochoa.es, www.mrw.es, www.giraud.es, www.alartec.com
It would be great if the user could select a default encoding, and in case the web server returns an invalid one, the default gets used.
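To illustrate the kind of fallback I mean, here is a minimal Java sketch. It is not the Web extension's actual code: the class name `CharsetFallback` and the method `resolve` are hypothetical, and I am assuming the declared encoding is available as a plain string taken from the response headers.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical helper, not part of RapidMiner or the Web extension.
final class CharsetFallback {

    // Returns the charset declared by the server, or the user-selected
    // fallback when the declared name is missing, empty, or unknown.
    static Charset resolve(String declared, Charset fallback) {
        if (declared == null || declared.trim().isEmpty()) {
            return fallback;
        }
        try {
            return Charset.forName(declared.trim());
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            // e.g. "uft-16": a legal name syntactically, but no such charset exists
            return fallback;
        }
    }
}
```

With something like this, `CharsetFallback.resolve("uft-16", StandardCharsets.UTF_8)` would give back UTF-8 instead of failing, and the default could come from a new operator parameter.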
2) Some web servers don't properly signal the end of the stream (EOF) when serving a page. I believe the operators are able to read the page's content, but an exception is thrown when they try to read the EOF.
A sample URL: http://www.jamonescarretero.com
It would be great if the operators could identify this situation, and not throw an exception.
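As a rough idea of how this could be tolerated, here is a small Java sketch. Again, this is only an assumption about one possible approach, not how the operators are implemented: `TolerantReader` and `readAvailable` are made-up names, and I am assuming the page body is read from a plain `InputStream`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of RapidMiner or the Web extension.
final class TolerantReader {

    // Reads until the normal end of the stream. If the stream fails after
    // some content has already arrived, that content is returned instead
    // of propagating the exception.
    static byte[] readAvailable(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            // Nothing was read at all: this is a real failure, so rethrow.
            if (out.size() == 0) {
                throw e;
            }
            // Otherwise keep the partial content (a warning could be logged here).
        }
        return out.toByteArray();
    }
}
```

The trade-off is that a page cut off halfway would also be accepted, so this behaviour should probably be optional and logged as a warning.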