I am setting up a Radoop Proxy on a Cloudera edge node. What should I enter for the master address and port number in the Radoop connection settings? Do I need to set anything up on the server before I configure the Radoop connection in RapidMiner Studio?
If you have access to Cloudera Manager, the simplest way to create the connection is to use New Connection -> Import from Cluster Manager. After setting the required (highlighted) fields, you can enable Radoop Proxy and create it via the Edit button. The second-best option is to use the (compressed) XML files exported from Cloudera Manager via New Connection -> Import Hadoop Configuration Files.
Basically, the connection should look the same as it would without Radoop Proxy; the Proxy then takes care of the networking part. Hostname resolution may still need to work on the client that runs Studio. This can be solved by adding the master nodes to the OS hosts file and setting dfs.client.use.datanode.hostname to false in the Advanced Hadoop Parameters list.
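If you want to verify the hosts file entries before testing the connection, a quick check along these lines may help. This is only a sketch: the function name `check_resolution` and the hostnames in the usage comment are made up for illustration, so substitute the master node names from your own cluster. It uses nothing but Python's standard `socket` module and is not part of Radoop.

```python
import socket


def check_resolution(hostnames):
    """Return a dict mapping each hostname to its resolved IPv4 address,
    or None if the name cannot be resolved from this machine."""
    results = {}
    for name in hostnames:
        try:
            results[name] = socket.gethostbyname(name)
        except OSError:
            # socket.gaierror (a subclass of OSError) is raised when lookup fails
            results[name] = None
    return results


# Example usage (hypothetical hostnames, replace with your master nodes):
# for host, ip in check_resolution(["master1.example.internal",
#                                   "master2.example.internal"]).items():
#     print(host, "->", ip if ip else "NOT RESOLVABLE")
```

Run it on the machine where Studio is installed. Any name reported as not resolvable is a candidate for an entry in the OS hosts file.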
RapidMiner Server needs to be running (Radoop Proxy authenticates to it), but no specific setting is required. The Radoop extension and the connection are only needed on Server if processes will be submitted to it.