RapidMiner

Random user agent for web mining needs an update

Status: Open For Voting

Hi guys, this is not a real bug in itself, but some of the random user agents in the scripts appear to be fairly old, and this causes problems from time to time when using the crawler. The behaviour is noticeable on sites that validate the browser version and then forward you to a page saying 'hey buddy, looks like your browser could use an update' or similar.

 

I have been noticing this more and more lately. Unchecking the 'random user agent' option solves the problem, but it was a nice feature to have.
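For anyone who needs rotation in the meantime, the idea can be sketched outside the extension. This is only a minimal sketch of picking a random agent from a refreshed list, not how the Web Mining extension does it. The `CURRENT_AGENTS` list and the `build_request` helper are hypothetical names, and the agent strings are illustrative examples that will themselves go stale.

```python
import random
import urllib.request

# Illustrative user agent strings only (assumption: not taken from the
# extension). They age as well and would need refreshing periodically.
CURRENT_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.1 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]


def build_request(url, agents=CURRENT_AGENTS, rng=random):
    """Return a Request whose User-Agent is picked at random from `agents`."""
    agent = rng.choice(agents)
    return urllib.request.Request(url, headers={"User-Agent": agent})


if __name__ == "__main__":
    # Builds the request without sending it, then shows the chosen agent.
    req = build_request("https://example.com/")
    print(req.get_header("User-agent"))
```

Keeping the list in one place means only the strings need replacing when sites start rejecting them; the selection logic stays the same.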

 

Are there any plans to release a new version that includes updated browser/agent logic and cleans out the old ones?

Comments
Community Manager
Status: Declined

Hi @kayman - unfortunately the whole Web Mining extension needs a complete overhaul and I do not see it on the immediate future roadmap. I'm going to push this to Product Ideas so people can upvote if they feel this is a priority.

 

Scott

 

Community Manager
Status: Open For Voting
 
RM Certified Expert

There are definitely several other more critical bugs in the Web Mining extension, but this should get added to the list. I do hope the developers are able to look at improving it all sometime soon!