Designing a process to be used as a web service with parameters?

brandon_harris Member Posts: 4 Contributor I
edited November 2018 in Help

I was able to build a basic process, save it to the server, and run it as a web service (static data read in, model trained, scored results from the model displayed). I'm thoroughly lost as to how I accept new data via URL parameters for this service, though. I've read through the KB on passing parameters to a service, but I think I'm missing the part of designing the process where I allow for new data (via these parameters).

 

Including a screenshot of the process I've built and deployed as a service on the RM server. The model takes a single variable at this time since I was testing this out. Let's call my variable / parameter "AGE". I understand that I define my URL query parameter name (age) and bind it to a target macro or operator parameter. What operator do I need to include in my process so that I can bind my "age" URL query parameter to it, pass new "age" values to my model, and have the web service return a result from the model? I attempted to use the macro operator and pass that as my unlabeled ExampleSet to the 'Apply Model' operator, but that didn't work.

 

To step back for a minute and look at the bigger picture, all I'm trying to do is train a model, and then publish that model as an API so that I can pass new data (in this case, the single age variable) to it, and have it return a scored / predicted result.

 

On a separate note, the KB article "Passing parameters to Rapidminer Webservices" references a sample / tutorial project which seems to answer my question, but I cannot find it on either the server or the local Studio install. The tutorial process it walks through can be seen in this image, though again I cannot find this tutorial or its file location: /t5/image/serverpage/image-id/114iADE1A705E2AC1534/image-size/large?v=v2&px=999

 


Best Answer

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751  RM Founder
    Solution Accepted

    Hi,

     

    There are in general three options, and which one is better depends on how many attributes you trained your model on. If there are fewer than 10 or so (and it sounds like that is the case here), the easiest way is indeed to use URL parameters, which can be transformed into macro values of your process. Here are the basic steps (I guess you know most of them, but just for other users who might stumble upon this later):

     

    1. Show the "Context" panel for your process in RapidMiner Studio
    2. Add a macro at the bottom of this panel, one for each attribute you need. Use some "realistic" data values as default for those macros.
    3. Start your process with an operator "Generate Data by User Specification"
    4. Create an attribute for each of the macros and use the macro value as attribute value (more about using macros: http://community.rapidminer.com/t5/RapidMiner-Studio-Knowledge-Base/How-to-Use-Macros/ta-p/32966)
    5. Make sure that the attribute names used in the data generation are exactly the same as those used in training the model.
    6. In the web service section of Server make sure that you bind URL parameters for each of the macros (which will become the attribute values in your process)
    7. Feed the generated data set into Apply Model using the model you built before.

    So it looks like you are pretty close and the only missing component is really the Generate Data by User Specification operator.
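On the client side, calling the finished service then comes down to putting one query parameter per bound macro on the URL. A minimal Python sketch of that is below. The endpoint `http://localhost:8080/api/rest/process/score_age`, the helper name `build_scoring_url`, and the parameter name `age` are placeholders for illustration, not values taken from this thread; use the URL and parameter names your own Server shows for the web service.

```python
from urllib.parse import urlencode


def build_scoring_url(service_url: str, attributes: dict) -> str:
    """Append one URL query parameter per model attribute.

    Each parameter name is assumed to match a URL parameter that was
    bound to a macro in the web service configuration on the server.
    """
    query = urlencode({name: str(value) for name, value in attributes.items()})
    # Reuse an existing query string if the service URL already has one.
    separator = "&" if "?" in service_url else "?"
    return f"{service_url}{separator}{query}"


# Hypothetical endpoint and parameter name, for illustration only.
SERVICE_URL = "http://localhost:8080/api/rest/process/score_age"
url = build_scoring_url(SERVICE_URL, {"age": 42})
print(url)
```

The sketch only builds the URL; sending it (for example with `urllib.request.urlopen`) needs a reachable server and whatever authentication your installation requires.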

     

    If you have a LOT of attributes though, you can't really use this approach. In this case, the other two options are: 1) to pass the input data as JSON or XML and start your application process with an operator that parses this data in order to create an example set, and 2) to use the POST method of your web service to upload a file containing the data (shown in this older video: https://www.youtube.com/watch?v=AiTQtIRNIVo).
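For the JSON option, the client side might look roughly like the following Python sketch, which packs several rows into one JSON body for a POST request. The endpoint, the helper name `build_json_request`, and the attribute names `age` and `income` are assumptions made up for illustration; how the process on the server parses the body depends entirely on the operators you put at its start.

```python
import json
from urllib.request import Request


def build_json_request(service_url: str, rows: list) -> Request:
    """Build (but do not send) a POST request carrying the rows as JSON."""
    body = json.dumps(rows).encode("utf-8")
    return Request(
        service_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and attribute names, for illustration only.
request = build_json_request(
    "http://localhost:8080/api/rest/process/score_many",
    [{"age": 42, "income": 51000}, {"age": 29, "income": 38000}],
)
print(request.get_method(), len(request.data), "bytes")
```

Passing `request` to `urllib.request.urlopen` would actually send it; that step is left out here because it needs a live server.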

     

    Hope this helps,

    Ingo


Answers

  • brandon_harris Member Posts: 4 Contributor I

    Thank you! That worked great after I figured out the eval() function to return a numeric instead of a string!

     

    Any idea if this is a standard use case for RapidMiner? I plan on asking Sales, but was just curious. I did notice a bit more latency than I'd expect, and I'm curious whether this kind of service/model deployment is suitable for aggressive SLAs, like <50 ms response time.

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,077  RM Data Scientist

    Hi brandon,

     

    This kind of deployment is very typical. Vladimir reported at our last user conference that he set up a fraud scoring process for Yandex this way. You can do some more things, like caching the model, but let's move this to the sales side.

     

     

    ~Martin

    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • kypexin Moderator, RapidMiner Certified Analyst, Member Posts: 290   Unicorn

    Hi everyone, 

     

    Somehow I have eventually come across this old topic, and yes, at Yandex we had this kind of RM Server setup in a production environment, with models deployed as web services. The average web service response time initially varied between 150 and 300 ms, and that was basically 'out of the box', without much optimization and without even using multithreading in MySQL. I should also note that the biggest bottleneck was not the model response time but the network data transfer times (between servers / services).

     

    That said, my very general advice is to pay the most attention to the network architecture, so that the data used by the scoring models travels as short a path as possible between source, model, and destination. Optimizing server configuration and database speed should come second.
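To see where the time goes in a particular setup, one simple approach is to time the same scoring call from different places (for instance, from the server host itself and from a remote client) and compare the numbers. The Python sketch below is one possible way to collect such figures; `measure_latency_ms` and `fake_scoring_call` are hypothetical names, and the fake call is only a stand-in for a real HTTP request to the web service.

```python
import statistics
import time


def measure_latency_ms(call, repeats: int = 50) -> dict:
    """Time `call()` repeatedly and summarize the latencies in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Nearest-rank 95th percentile, computed with integer arithmetic.
    p95_index = max(0, (95 * len(samples) + 99) // 100 - 1)
    return {
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
        "max_ms": samples[-1],
    }


def fake_scoring_call():
    # Stand-in for the real HTTP request; replace with an actual call.
    time.sleep(0.002)


stats = measure_latency_ms(fake_scoring_call, repeats=20)
print(stats)
```

If the figures measured on the server host are much lower than those measured from the client, the difference is a rough indication of the network share of the response time.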
