I am looking for examples of processes that retrieve data from OAuth services such as the Twitter and Facebook APIs. I currently use Google Spreadsheets scripts to make the HTTP requests, but those come with OAuth tools and urlFetch functions that I am not sure how to replicate in RapidMiner.
Grateful for any suggestions of where to start with this.
There are various methods: you could use the RSS feeds in Twitter as a source (which I have seen several people post about doing directly in RapidMiner), or you could use an external package in R or another language. There are several Python implementations for querying the Twitter APIs on GitHub.
Yep, there are R plugins to do it, or you can build it using a Groovy script, or you can do it in RapidMiner directly. I built the Twitter API integration in RapidMiner for a previous company using OAuth, and it's actually really straightforward. (Sadly I don't have all the code these days, but I use OAuth pretty regularly for other services.)
The basic idea is: create your login credentials, pass them into a macro, put that macro in the header of your request to the service, receive your login token, read the token from the (XML) response and pass it into a second macro, then use that 'token macro' in the header of your subsequent requests.
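For anyone who wants to see the same flow outside RapidMiner, here is a rough Python sketch of those steps. It is only an illustration under a few assumptions of mine: the function names and the token_url parameter are made up, the token request uses HTTP Basic credentials with a client_credentials grant, and the service answers with JSON containing an "access_token" field. A service that returns the token as XML would need an XML parser at that step instead.

```python
import base64
import json
import urllib.parse
import urllib.request


def basic_auth_header(client_id: str, client_secret: str) -> dict:
    """Step 1-3: encode the login credentials for the token request header."""
    raw = f"{client_id}:{client_secret}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def bearer_header(token: str) -> dict:
    """Final step: the 'token macro' equivalent, attached to every later request."""
    return {"Authorization": "Bearer " + token}


def fetch_token(token_url: str, client_id: str, client_secret: str) -> str:
    """POST the credentials and read the token out of the response.

    Assumption: the response body is JSON with an "access_token" field.
    """
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode()
    request = urllib.request.Request(
        token_url,
        data=body,
        headers=basic_auth_header(client_id, client_secret),
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["access_token"]
```

In RapidMiner terms, the two header helpers play the role of the two macros: one holds the credentials, the other holds the token that every later request reuses.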
One thing to look out for is that the API only returns a set number of records per request, so you'll need another macro inside the body of your request, together with Loop Until, to make sure you get all the data you want.
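The looping part can be sketched in Python like this. It is a minimal illustration, not how any particular API works: fetch_page is a hypothetical callable standing in for one HTTP request, and I am assuming the service hands back a cursor for the next page (None when there is nothing left). Some APIs use page numbers or offsets instead, but the loop has the same shape.

```python
from typing import Callable, List, Optional, Tuple

# One page of results plus the cursor for the next page (None when done).
Page = Tuple[List[dict], Optional[str]]


def fetch_all(
    fetch_page: Callable[[Optional[str]], Page],
    max_pages: int = 100,
) -> List[dict]:
    """Keep requesting pages until the service reports no further cursor.

    max_pages is a safety cap so a misbehaving service cannot loop forever.
    """
    records: List[dict] = []
    cursor: Optional[str] = None  # plays the role of the paging macro
    for _ in range(max_pages):
        batch, cursor = fetch_page(cursor)
        records.extend(batch)
        if cursor is None:  # the "until" condition of Loop Until
            break
    return records


def fake_page(cursor: Optional[str]) -> Page:
    """In-memory stand-in for a real request, used only to show the loop."""
    pages = {
        None: ([{"id": 1}, {"id": 2}], "p2"),
        "p2": ([{"id": 3}], None),
    }
    return pages[cursor]


if __name__ == "__main__":
    print(len(fetch_all(fake_page)))  # 3
```

The cursor variable is what the extra macro would hold in RapidMiner: it is updated after each response and fed back into the next request.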
Hope that helps.