Writing a RapidMiner process to disk

rachel_lomasky Member Posts: 52 Guru
edited November 2018 in Help

I'd like to write my RapidMiner processes to disk (to add to source control). I was able to do this with the web apps by chaining Open File (choosing the web app from the repository), Read Document (as XML), and Write Document.

 

However, when I try the same thing with the processes, I get:

Dec 9, 2016 11:49:22 AM SEVERE: Process failed: The object located at //Server2/processes/analysis/aggregations/conversion_ages does not match the expected object type blob (but is: process).
Dec 9, 2016 11:49:22 AM SEVERE: Here:
Dec 9, 2016 11:49:22 AM SEVERE: Process[1] (Process)
Dec 9, 2016 11:49:22 AM SEVERE: subprocess 'Main Process'
Dec 9, 2016 11:49:22 AM SEVERE: ==> +- Open File[1] (Open File)
Dec 9, 2016 11:49:22 AM SEVERE: +- Read Document[0] (Read Document)
Dec 9, 2016 11:49:22 AM SEVERE: +- Write Document[0] (Write Document)

 

Any way to accomplish this?

 

Thanks,

Rachel

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Can you use File > Export Process? You can select XML as the file type.

  • rachel_lomasky Member Posts: 52 Guru

    Thanks, I've been doing that as a stopgap. But I'd like something that I can automate, if possible.

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder

    Interesting idea! But I am afraid that this is currently not easily possible :smileysad:

     

    I am sure one of our devs could whip together a small Groovy script to retrieve the process XML from a process so that you can write this afterwards.  Or you could try it yourself.  Most necessary parts should be described here:

     

    Cheers,

    Ingo

  • JEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 578 Unicorn

    Well that's annoying.  :smileyfrustrated:

     

    However, as mentioned previously, if you know your way around the RapidMiner Server repository structure and have a connection in RapidMiner that can read from it, you can get the XML reasonably easily.

     

    Here's a very simple example process. The first Read Database operator lists all repository entries so that you can pick the one you want the XML for and set its id with Set Macro. I left it showing all types of entry so you can get a feel for how the folder structure works.

     

    The next Read Database operator is the one that actually does the legwork: it joins the selected entry to the version table and then to the buffer table, which contains the XML.

     

    SELECT b.buffer
    FROM ra_ent_entry as e
    /* I'm not bothering checking most recent versions here. */
    INNER JOIN ra_ent_version as v ON e.id = v.entry_id
    /* You really should. */
    INNER JOIN ra_ent_bytebuffer as b ON v.xmlBuffer_id = b.id
    WHERE e.id = ? /* change the macro in Set Macro to Automate */

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.3.000">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.3.000" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="jdbc_connectors:read_database" compatibility="7.3.000" expanded="true" height="68" name="Get a list of all processes" width="90" x="112" y="238">
    <parameter key="connection" value="RapidMiner Repo"/>
    <parameter key="query" value="SELECT * &#10;FROM ra_ent_entry as e"/>
    <enumeration key="parameters"/>
    <description align="center" color="transparent" colored="false" width="126">Note the nesting for folder structure.</description>
    </operator>
    <operator activated="true" class="set_macro" compatibility="7.3.000" expanded="true" height="82" name="Set Macro" width="90" x="313" y="187">
    <parameter key="macro" value="MyDesiredProcess"/>
    <parameter key="value" value="15"/>
    </operator>
    <operator activated="true" class="jdbc_connectors:read_database" compatibility="7.3.000" expanded="true" height="68" name="Read Database" width="90" x="581" y="289">
    <parameter key="connection" value="RapidMiner Repo"/>
    <parameter key="query" value="SELECT b.buffer&#10;FROM ra_ent_entry as e&#10;/* I'm not bothering checking most recent versions here. */ &#10;INNER JOIN ra_ent_version as v ON e.id = v.entry_id &#10;/* You really should. */&#10;INNER JOIN ra_ent_bytebuffer as b ON v.xmlBuffer_id = b.id&#10;WHERE e.id = ? /* change the macro in Set Macro to Automate */ "/>
    <parameter key="prepare_statement" value="true"/>
    <enumeration key="parameters">
    <parameter key="parameter" value="INTEGER.%{MyDesiredProcess}"/>
    </enumeration>
    <description align="center" color="transparent" colored="false" width="126">Output from Buffer is the XML of your process.</description>
    </operator>
    <connect from_op="Get a list of all processes" from_port="output" to_op="Set Macro" to_port="through 1"/>
    <connect from_op="Set Macro" from_port="through 1" to_port="result 1"/>
    <connect from_op="Read Database" from_port="output" to_port="result 2"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    <description align="center" color="yellow" colored="false" height="103" resized="true" width="304" x="10" y="10">Note. To use this you need to ensure you have a connection setup in RapidMiner that can read from the RapidMiner Server repository</description>
    </process>
    </operator>
    </process>

    You should be able to use this as a starting point to automate completely.  (Also, shame on you @IngoRM for saying it's not possible.  You should know with RapidMiner it's all possible.  :smileytongue: )
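For anyone who wants to run the export outside RapidMiner, here is a minimal Python sketch of the same idea: run the query above for one entry id and write the returned buffer to a file that can go into source control. Treat it as a starting point under stated assumptions, not a tested integration. The function names `export_process_xml` and `make_demo_db`, the output path, and the assumption that the buffer holds UTF-8 XML are all illustrative and do not come from the thread. The demo database is an in-memory SQLite stand-in that contains only the columns the query touches; a real RapidMiner Server database would need its own driver, and that driver may expect a different parameter placeholder than `?`.

```python
import sqlite3
from pathlib import Path

# Query from the post above, unchanged apart from whitespace. Like the
# original, it does not pick the most recent version of an entry.
QUERY = """
SELECT b.buffer
FROM ra_ent_entry AS e
INNER JOIN ra_ent_version AS v ON e.id = v.entry_id
INNER JOIN ra_ent_bytebuffer AS b ON v.xmlBuffer_id = b.id
WHERE e.id = ?
"""


def export_process_xml(conn, entry_id, out_path):
    """Write the XML buffer of one repository entry to out_path (hypothetical helper)."""
    cur = conn.cursor()
    cur.execute(QUERY, (entry_id,))
    row = cur.fetchone()
    if row is None:
        raise LookupError(f"no XML buffer found for entry id {entry_id}")
    buffer = row[0]
    # Assumption: the buffer column holds UTF-8 XML, either as bytes or as text.
    data = buffer.encode("utf-8") if isinstance(buffer, str) else bytes(buffer)
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_bytes(data)
    return out_path


def make_demo_db():
    """Build an in-memory stand-in with only the columns the query uses."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ra_ent_entry (id INTEGER PRIMARY KEY);
        CREATE TABLE ra_ent_bytebuffer (id INTEGER PRIMARY KEY, buffer BLOB);
        CREATE TABLE ra_ent_version (
            id INTEGER PRIMARY KEY, entry_id INTEGER, xmlBuffer_id INTEGER);
        INSERT INTO ra_ent_entry VALUES (15);
        INSERT INTO ra_ent_bytebuffer
            VALUES (1, CAST('<process version="7.3.000"/>' AS BLOB));
        INSERT INTO ra_ent_version VALUES (1, 15, 1);
        """
    )
    return conn
```

Against the demo database, `export_process_xml(make_demo_db(), 15, "processes/conversion_ages.xml")` writes the stored XML to that file. Looping over the ids returned by the first query in the process would export every entry in one run.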

  • JEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 578 Unicorn

    Incidentally, if you want the XML of the process you are currently running, you can use this three-line Groovy script to store it in a macro.

     

    import com.rapidminer.*;
    operator.getProcess().getMacroHandler().addMacro("processXML", operator.getProcess().toString());
    return input;

     

    It only works for the process you are currently in, though, so it wouldn't be very suitable for the version control you want.

     

    Credit to Andrew C. http://rapidminernotes.blogspot.jp/2013/05/saving-example-set-with-details-of.html

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder

    Hey, I did not say "impossible", I only said it is "not easily possible" :smileywink:

     

    I must admit, though, that I forgot about going right into the database. Good solution!
