Model Management / Model Comparison / Model Tracking

bhupendra_patil Administrator, Employee, Member Posts: 168 RM Data Scientist
edited November 2018 in Help

 

Today we released a new extension, as part of the RapidMiner Labs initiative, to provide an easy way to compare models, track model performance over time, and automatically replace a model in production when a newly built model performs better.

 

This is done with the new "Compare Models" operator. You can download the extension in Studio from the Marketplace by searching for "Model Management",

or directly from here:

https://marketplace.rapidminer.com/UpdateServer/faces/product_details.xhtml?productId=rmx_model_management

 

You can get started with the sample process and take a look at the various parameters :)

I hope all of you RapidMiners can easily discover how it works.

We will follow up shortly with a video and a how-to article here.

Let us know your feedback.

 

 

Answers

  • suleymansahal Member Posts: 27 Contributor II

    Thank you for this new operator! How can we use it in a cross-validation setting? I could not manage to build a process where we test the models on unseen data and average the performances. I tried something like the attached, but it is not cross-validation. Also, I noticed that your log file is intended to record average performances and their deviations. What was your intention in doing so? Thanks in advance.

  • bhupendra_patil Administrator, Employee, Member Posts: 168 RM Data Scientist

     Hello @suleymansahal

    Thank you for your feedback.

     

    The Model Management operator is designed to test models that are built on the same dataset against a common test dataset, and to give you performance indicators showing which one is best. At this point it does not easily support cross-validation.
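
    Just to make that concrete, here is a minimal sketch in plain Python with scikit-learn (the dataset, the two candidate models, and accuracy as the metric are placeholders of mine, not what the extension uses internally): every candidate is built on the same training data, scored on one common test set, and the best score wins.

    ```python
    # Hypothetical stand-in for "compare on a common test set"; not the operator itself.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42)

    # All candidates are built on the same training data ...
    candidates = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=42),
    }

    # ... and every candidate is scored on the SAME common test set.
    scores = {}
    for name, model in candidates.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))

    best = max(scores, key=scores.get)
    print(scores, "-> best model:", best)
    ```

    Running it prints each candidate's accuracy and the name of the winner, which is roughly what the operator's performance output and mod port give you.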

    We can potentially look into adding additional features for this, or come up with new operators.

     

    Here is the scenario we had in mind while building this operator:

    1) Let's say you build a model (cross-validated) on January 1 and start using it in production.

    2) You need to determine whether this model is still performing well in the real world, and you want to update it automatically.

    3) So you can schedule a retrain of a group of models on the latest data every day, every month, etc., and then feed these freshly generated models, together with the one currently in production, into the new "Compare Models" operator to test them against a common data set. If one of your freshly generated models is better, the output will give you that model. You can attach a Store to the mod output port and overwrite the production model.

     

    If the production model is itself still the best, the Store will simply write the existing model back.
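
    Purely as an illustration of that retrain-and-replace loop (the pickle file and the .score() call are assumptions of mine, standing in for the stored production model and the performance vector), a rough Python sketch could look like this:

    ```python
    # Hypothetical sketch of the scheduled "retrain, compare, maybe overwrite" step.
    import pickle

    def compare_and_store(production_path, challengers, X_test, y_test):
        """Keep whichever model scores best on the common test set."""
        with open(production_path, "rb") as f:
            champion = pickle.load(f)                 # current production model

        best_model = champion
        best_score = champion.score(X_test, y_test)   # assumes sklearn-style .score()
        for challenger in challengers:                # freshly retrained candidates
            score = challenger.score(X_test, y_test)
            if score > best_score:
                best_model, best_score = challenger, score

        # The "store" step: if a challenger won, the production model is replaced;
        # if the champion is still best, the existing model is written back unchanged.
        with open(production_path, "wb") as f:
            pickle.dump(best_model, f)
        return best_model, best_score
    ```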

     

    The log of the various performance vectors etc. is there to keep track of all model performances over a period of time, so you can see whether your models are improving or degrading.
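
    If it helps to picture it, a simple analogue of that log in Python would just append one timestamped row per run to a CSV (the column names and file name below are only illustrative, not the extension's actual log format):

    ```python
    # Hedged sketch: append each run's metrics with a timestamp so you can
    # chart whether models improve or degrade over time.
    import csv
    from datetime import datetime, timezone
    from pathlib import Path

    def log_performance(log_path, model_name, accuracy, accuracy_stddev=None):
        path = Path(log_path)
        write_header = not path.exists()
        with path.open("a", newline="") as f:
            writer = csv.writer(f)
            if write_header:
                writer.writerow(["timestamp", "model", "accuracy", "accuracy_stddev"])
            writer.writerow([datetime.now(timezone.utc).isoformat(),
                             model_name, accuracy, accuracy_stddev])

    # e.g. log_performance("model_performance_log.csv", "decision_tree", 0.94, 0.02)
    ```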

     

    Hope this helps.

    Your input is valuable, so we can look into how to incorporate it in the next iteration of the extension.

  • suleymansahal Member Posts: 27 Contributor II

    Hi. OK, I understand the logic. When I first saw it, I thought I could use it like the Compare ROCs operator, which has cross-validation built in. Thanks again.
