RapidMiner

‎03-31-2017 08:13 AM

This article deals with JSON files.

 

For a CSV upload, please see:

http://community.rapidminer.com/t5/RapidMiner-Server-Knowledge-Base/Scoring-a-whole-file-with-a-POST...

 For XML:

http://community.rapidminer.com/t5/RapidMiner-Server-Knowledge-Base/How-to-send-an-XML-data-file-to-...

 

When we want to send a whole list of records to the server, either as a file or as a JSON data stream, for scoring or for training, we need a POST call.

 

Finding out how to make a POST call can be a bit daunting: it is not as simple as passing a single value through a web browser, and many articles out there approach POST calls through programming languages. We will use cURL, a command-line tool for sending and receiving files using URL syntax. It has many parameters, but we only need two for now. Further details can be found by typing "curl -h" on the command line or at https://en.wikipedia.org/wiki/CURL

 

Let us get started with a very simple process which trains a decision tree to classify customers. It is meant to be able to predict the "Response" attribute.

 

daily demo.png

A sample of the data is as follows:

daily demo data.png

 

For deployment, we have another process which imports the model built above (trained on the data with known responses) and applies it to data with unknown responses.

 

json score.png 

 

Please note how the "Read Document" operator is connected to the input port of the process canvas. This is so that it can receive data through the POST upload from the service we are going to create. This operator is not currently pointed to any file.

 

There is also a "Cut Document" operator, which is necessary for partitioning the JSON file into one JSON document per record so that the "JSON to Data" operator can parse them. The "JSON to Data" operator is meant to ingest collections of JSON documents, not a single file. That is why we cut the document into a collection using the following parameters:

 

json cut doc parameters.png

Please note that in steps 1 to 4 above, we use the JsonPath notation to parse the file and the "$." argument. With other files, this argument can get more complicated.

It may be worth the time to research JsonPath notation online as well as regular expressions.
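
To make the purpose of the cut step more concrete, here is a minimal Python sketch of the same idea outside RapidMiner: one JSON file is turned into a collection of per-record JSON documents. It assumes the file is a top-level JSON array of record objects. The function name cut_into_documents and the attribute names in the sample are hypothetical and are not taken from the process above.

```python
import json


def cut_into_documents(json_text):
    """Split a JSON array of records into one JSON string per record."""
    records = json.loads(json_text)
    # Assumption: the file holds a top-level array, one object per record.
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array of records")
    return [json.dumps(record) for record in records]


# Hypothetical sample; the attribute names are illustrative only.
sample = '[{"Age": 34, "Gender": "F"}, {"Age": 51, "Gender": "M"}]'
documents = cut_into_documents(sample)

for document in documents:
    print(document)
```

Each string in documents is a self-contained JSON document for a single record, which is the shape a per-record parser expects.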

 

Before we create the service, bear in mind that the data we are going to pass for scoring does not have values in the response attribute; those values are going to be predicted.

 

daily demo data no response.png
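
If the only data at hand still carries known responses, a scoring file can be derived from it. The following is a small sketch under stated assumptions: the label attribute is called "Response" as in the training process, the records are plain JSON objects, and dropping the attribute entirely is acceptable (leaving it present but empty may suit your process better). The helper name strip_label and the sample attributes are hypothetical.

```python
import json


def strip_label(records, label="Response"):
    """Return copies of the records without the label attribute."""
    return [
        {key: value for key, value in record.items() if key != label}
        for record in records
    ]


# Hypothetical labelled records; only "Response" comes from the article.
labelled = [
    {"Age": 34, "Gender": "F", "Response": "yes"},
    {"Age": 51, "Gender": "M", "Response": "no"},
]

unlabelled = strip_label(labelled)
payload = json.dumps(unlabelled, indent=2)
print(payload)
```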

 

Let us now create the service in RapidMiner Server:

 

create service.png

 

No special parameters or macros are needed, as long as the service points to the correct process for deployment. We then test the service to get the URL which, in this case, is:

http://RMUK-KBONIKOS:8080/api/rest/process/POSTtest?

 

We can use curl from the command line. In Windows, the command line can be started from the Start menu by typing "cmd". We can then enter the following curl command:

 

curl --user admin --upload-file C:\test.json http://<your server>:8080/api/rest/process/POSTtest?

 

Or, curl --user admin --data {json data} http://<your server>:8080/api/rest/process/POSTtest?

 

The user in this case is "admin" and the file is test.json, saved on the C: drive; the path to the file can change. When the command is executed, you will be asked for that user's password, and you should then get back a list of scored records. In this case the output is XML; this depends on the output format chosen when defining the service.

result.png
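
For readers who would rather call the service from a script than from curl, the request can be sketched in Python with only the standard library. This is an illustration, not a tested client for RapidMiner Server. The host name your-server, the password, and the sample record are placeholders. Sending "Content-Type: application/json" is an assumption about what the service accepts. HTTP Basic authentication is used because that is what curl's --user option sends by default.

```python
import base64
import json
import urllib.request


def build_score_request(url, user, password, records):
    """Build a POST request that carries the records as a JSON body."""
    body = json.dumps(records).encode("utf-8")
    # HTTP Basic authentication, the scheme curl --user uses by default.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            # Assumption: the service accepts a JSON content type.
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def score(request):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Placeholder values; replace them with your own server and credentials.
request = build_score_request(
    "http://your-server:8080/api/rest/process/POSTtest",
    "admin",
    "change-me",
    [{"Age": 34, "Gender": "F"}],
)
print(request.get_method(), request.full_url)
```

Calling score(request) would send the data and return the scored records in whatever output format the service was defined with.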

 

With the above help, you should be able to create a process, expose it as a service in RapidMiner Server, and then test that it is able to receive JSON data or files through a POST upload, score them, and return the output.

Comments
RM Certified Expert

OoOo, this is nice. Thank you!