Caching in RapidMiner using Old World Computing's Jackhammer Extension: Cache Dependencies

Leonie_OWC, Member, KB Contributor
edited December 2019 in Knowledge Base

Using Macros to Set Cache Dependencies

This is the last of our tutorials for the Cache operator of our Jackhammer Extension for RapidMiner. We hope this helps you in your day-to-day RapidMiner tasks! If you have any further questions or would like to request tutorials for other operators of our extensions, feel free to send us a message here or on Twitter.

Recap

The previous two tutorials on the caching functions of the Jackhammer Extension demonstrated the basic features of the Cache operator and how to integrate it into your processes. For this, we used the scenario that you, as the company’s data scientist, are tasked with making the data sent by the new wind turbine available to your coworkers. After constructing a basic process, we covered more advanced functions, such as setting data validity periods, in our second tutorial.

At the end of that tutorial we ran into a problem: if the first employee checks the wind turbine data at 10 am, and on the next day another employee checks it at 8.30 am, i.e. before the 24 hours have passed, the data will not be renewed, even though, in theory, the turbine has already delivered new data. Solving this issue is the topic of this tutorial. We will use cache dependencies, set via macros, to make the cache reload as soon as the next day begins.
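To make the problem concrete, here is a minimal Python sketch of the two expiry rules. It is only an illustration of the reasoning, not how the Cache operator is implemented; the function names and the example dates are made up for this sketch.

```python
from datetime import datetime, timedelta

# Validity period as configured in the second tutorial (24 hours).
VALIDITY = timedelta(hours=24)


def expired_by_age(cached_at: datetime, now: datetime) -> bool:
    """True once the full validity period has elapsed since caching."""
    return now - cached_at >= VALIDITY


def expired_by_date(cached_at: datetime, now: datetime) -> bool:
    """True as soon as the calendar date differs from the day of caching."""
    return now.date() != cached_at.date()


# Illustrative dates only: day 1 at 10 am, day 2 at 8.30 am.
first_check = datetime(2019, 12, 2, 10, 0)
second_check = datetime(2019, 12, 3, 8, 30)

print(expired_by_age(first_check, second_check))   # False: only 22.5 hours have passed
print(expired_by_date(first_check, second_check))  # True: it is already the next day
```

The age-based rule keeps the old data at 8.30 am, while the date-based rule reloads it, which is the behaviour we want to get from a cache dependency.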

Step 1

Open your caching process in RapidMiner and add the Generate Macros operator to it. It is important that the Generate Macros operator is executed before the Cache operator. To ensure this, place it in front of the Cache operator and connect the right output port of the Generate Macros operator to the left input port of the Cache operator. This fixes the execution order, so you can be sure the macro is set before the cache is evaluated. Note that macros also work without connections; we only add this connection to determine the execution order.


 

Step 2

In the parameter settings of the Generate Macros operator, click on Edit List and add an entry for a macro named “date” that holds the current date, as shown in the screenshot below:


This will cause the operator to generate a macro containing the current date. Click Apply.
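One detail worth keeping in mind is the granularity of the macro value: since the cache reacts to changes of the macro, a value that contains only the date changes once per day, whereas a value that also contained the time would change on every run. The Python sketch below illustrates this under our own assumptions; the function names and format strings are hypothetical and are not the expression used in the Generate Macros operator.

```python
from datetime import datetime


def date_macro(now: datetime) -> str:
    """Date-only value: identical for every run on the same day."""
    return now.strftime("%Y-%m-%d")


def timestamp_macro(now: datetime) -> str:
    """Date-and-time value: differs between runs on the same day."""
    return now.strftime("%Y-%m-%d %H:%M")


# Illustrative run times only.
morning = datetime(2019, 12, 2, 10, 0)
afternoon = datetime(2019, 12, 2, 15, 0)
next_day = datetime(2019, 12, 3, 8, 30)

print(date_macro(morning) == date_macro(afternoon))            # True: same day, same value
print(date_macro(morning) == date_macro(next_day))             # False: new day, new value
print(timestamp_macro(morning) == timestamp_macro(afternoon))  # False: would change on every run
```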

 

Step 3

Select the Cache operator and find the parameter for cache dependencies. Click on the “Edit Enumeration” button.

Step 4

In the window that opens, enter the name of your macro, in this case “date”, and click “OK”.


Now you are all done. It is as simple as that! As soon as the value of the macro changes, i.e. on the first run of a new day, the cache is cleared and the new data is loaded.
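For readers who like to see the idea in code, the following Python sketch mimics what a cache dependency does conceptually: the cache remembers the values of its dependencies at the time of loading and reloads when any of them changes. This is a simplified model under our own assumptions, not the Jackhammer implementation; the class DependencyCache and all names in it are hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple


class DependencyCache:
    """Toy cache that reloads whenever one of its dependency values changes."""

    def __init__(self, loader: Callable[[], Any], dependencies: List[str]) -> None:
        self._loader = loader              # function that fetches fresh data
        self._dependencies = dependencies  # names of the macros to watch
        self._snapshot: Optional[Tuple[Optional[str], ...]] = None
        self._value: Any = None
        self.loads = 0                     # number of times the loader was called

    def get(self, macros: Dict[str, str]) -> Any:
        # Collect the current values of all watched macros.
        snapshot = tuple(macros.get(name) for name in self._dependencies)
        if snapshot != self._snapshot:
            # First call or a dependency changed: discard the old data and reload.
            self._value = self._loader()
            self._snapshot = snapshot
            self.loads += 1
        return self._value


cache = DependencyCache(lambda: "turbine data", ["date"])
cache.get({"date": "2019-12-02"})  # first run: data is loaded
cache.get({"date": "2019-12-02"})  # same day: cached data is reused
cache.get({"date": "2019-12-03"})  # next day: macro changed, data is reloaded
print(cache.loads)  # 2
```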


