I am looking for examples of processes that retrieve data from OAuth-protected services such as the Twitter and Facebook APIs. I currently use Google Spreadsheets scripts to make the HTTP requests, but those rely on built-in OAuth tools and urlFetch functions that I am not sure how to replicate in RapidMiner.
Grateful for any suggestions of where to start with this.
There are various methods: you could use the RSS feeds in Twitter as a source (which I have seen several people post about doing directly in RapidMiner), or you could use an external package in R or another language. There are several Python implementations for querying the Twitter APIs on GitHub.
Yep, there are R plugins to do it, or you can build it using a Groovy script, or you can use RapidMiner directly. I built a Twitter API integration in RapidMiner for a previous company using OAuth, and it's actually really straightforward. (Sadly I don't have all the code these days, but I use OAuth pretty regularly for other services.)
The basic idea is to create your login credentials, pass the credentials into a macro, put that macro in the header of your request to the service, receive your login token, read the login token (XML) and pass it into a second macro, then use that 'token macro' in the header of your subsequent requests.
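For anyone who wants to see the same flow outside RapidMiner, here is a minimal Python sketch of the three pieces: building the credential header, reading the token out of an XML response, and building the header for later requests. It makes no network calls. The Basic/Bearer header scheme and the `access_token` element name are assumptions for illustration, not something confirmed for any particular service; many APIs return the token as JSON or use a different element name, so check the service's documentation.

```python
import base64
import xml.etree.ElementTree as ET


def basic_auth_header(client_id: str, client_secret: str) -> dict:
    """Encode the login credentials and place them in a request header.

    Assumes the service accepts HTTP Basic credentials on its token endpoint.
    """
    raw = f"{client_id}:{client_secret}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def token_from_xml(xml_text: str, tag: str = "access_token") -> str:
    """Read the login token out of an XML response body.

    The default tag name is hypothetical; pass the one your service uses.
    """
    root = ET.fromstring(xml_text)
    node = root if root.tag == tag else root.find(f".//{tag}")
    if node is None or not (node.text or "").strip():
        raise ValueError(f"no <{tag}> element with a value in the response")
    return node.text.strip()


def bearer_header(token: str) -> dict:
    """Build the header that carries the token on later requests.

    Assumes a Bearer scheme; some services expect a different header.
    """
    return {"Authorization": f"Bearer {token}"}


if __name__ == "__main__":
    # Made-up response body standing in for what a token endpoint returns.
    sample = "<response><access_token>abc123</access_token></response>"
    print(basic_auth_header("my-id", "my-secret"))
    print(bearer_header(token_from_xml(sample)))
```

In RapidMiner terms, each function's return value plays the role of a macro: the first feeds the header of the token request, and the last feeds the header of every data request.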
One thing to look out for is that the API only returns a set number of records per request, so you'll need to use another macro inside the body of your request, alongside Loop Until, to ensure that you get all the data you want.
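The Loop Until idea can be sketched in Python as a loop that keeps an offset (the "other macro") and stops once a page comes back short. This is only an illustration under assumptions: `fetch_page` is a hypothetical callable standing in for the real HTTP request, and offset/limit paging is assumed, whereas some APIs page with cursors or "next page" tokens instead.

```python
from typing import Callable, List


def fetch_all(
    fetch_page: Callable[[int, int], List[dict]],
    page_size: int = 100,
    max_pages: int = 1000,
) -> List[dict]:
    """Request pages until the service returns fewer records than asked for.

    fetch_page(offset, limit) is a hypothetical stand-in for the real request.
    max_pages is a safety cap so a misbehaving service cannot loop forever.
    """
    records: List[dict] = []
    offset = 0  # plays the role of the macro placed in the request body
    for _ in range(max_pages):
        page = fetch_page(offset, page_size)
        records.extend(page)
        if len(page) < page_size:  # short or empty page: nothing left to fetch
            break
        offset += len(page)
    return records


# Fake in-memory "service" with 250 records, used only to exercise the loop.
_FAKE_DATA = [{"id": i} for i in range(250)]


def fake_fetch_page(offset: int, limit: int) -> List[dict]:
    return _FAKE_DATA[offset:offset + limit]


if __name__ == "__main__":
    rows = fetch_all(fake_fetch_page, page_size=100)
    print(len(rows))
```

Stopping on a short page costs one extra request when the total is an exact multiple of the page size, but it avoids relying on a total-count field that not every API provides.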