Create Model from Rule

michaelgloven RapidMiner Certified Analyst, Member Posts: 46 Guru
edited November 2018 in Help

I have a known set of rules from an industry engineering document, expressed as a series of if-then-else statements and some engineering calculations. The output of these rules is similar to a classification result in RM.

We do not have the data underlying these rules, so we cannot create models through the normal RM process of learning and training. Is there a way to enter these explicit rules into a model and then, as we gather data over time, apply the model to that data to validate it? Normally we start with data to create the model, but in this case we want to use RM to create a model from known rules and then validate and improve it over time. In summary, it would be great if there were an easy way to enter rules into a model independent of any underlying data.

Any ideas are appreciated!

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @michaelgloven maybe I'm not understanding you completely, but couldn't you just enter these if-then statements into a Generate Attributes operator? Then, as you gather the data, it executes your rules and you get your results. Or do you want to use the if-then statements as a starting point and then, as your data comes in, adjust them to fit the data?

  • michaelgloven RapidMiner Certified Analyst, Member Posts: 46 Guru

    More your second point. For example, the Generate Attributes set of rules will not, as far as I know, let me create performance vectors to see how well my model (the rules) is performing. I'm trying to bridge from potentially unreliable "fixed rules" to "data-driven rules", but I don't have any data yet. If I were able to, for example, "import" rule induction or decision tree rules into a blank model, I could then feed in data over time and use cross validation to measure performance. That may tell me the historical rules are wrong and that I need to modify my features or methods to make predictions. I suspect there is bias and over-conservatism in some of these original rule sets, and I believe there is a value proposition in using RM to support that assertion.

  • earmijo Member Posts: 270 Unicorn

    @michaelgloven wrote:

    More your second point. For example, the Generate Attributes set of rules will not, as far as I know, let me create performance vectors to see how well my model (the rules) is performing.

    You could. Following @Thomas_Ott's suggestion, you could generate a new attribute, change its role from "regular" to "prediction", and compare it to the label (once you have data, of course).

    Take a look at the following simple example.

    <?xml version="1.0" encoding="UTF-8"?>
    <process version="8.0.001">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="8.0.001" expanded="true" height="68" name="Retrieve Golf" width="90" x="112" y="136">
            <parameter key="repository_entry" value="//Samples/data/Golf"/>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="8.0.001" expanded="true" height="82" name="Generate Attributes" width="90" x="313" y="136">
            <list key="function_descriptions">
              <parameter key="prediction" value="if(Outlook == &quot;overcast&quot;,&quot;yes&quot;,&quot;no&quot;)"/>
            </list>
          </operator>
          <operator activated="true" class="set_role" compatibility="8.0.001" expanded="true" height="82" name="Set Role" width="90" x="447" y="136">
            <parameter key="attribute_name" value="prediction"/>
            <parameter key="target_role" value="prediction"/>
            <list key="set_additional_roles"/>
          </operator>
          <operator activated="true" class="performance_classification" compatibility="8.0.001" expanded="true" height="82" name="Performance" width="90" x="648" y="136">
            <list key="class_weights"/>
          </operator>
          <connect from_op="Retrieve Golf" from_port="output" to_op="Generate Attributes" to_port="example set input"/>
          <connect from_op="Generate Attributes" from_port="example set output" to_op="Set Role" to_port="example set input"/>
          <connect from_op="Set Role" from_port="example set output" to_op="Performance" to_port="labelled data"/>
          <connect from_op="Performance" from_port="performance" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
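
For readers who want to see the logic of the process above outside RapidMiner, here is a rough Python sketch of the same idea: apply a fixed rule to each row, treat the result as the prediction, and compare it with the label. The rule mirrors the Generate Attributes expression in the XML. The function names (`rule_predict`, `evaluate`), the label name `Play`, and the four sample rows are assumptions made up for illustration; they are not the actual Golf sample data and not a RapidMiner API.

```python
from collections import Counter


def rule_predict(row):
    """Fixed rule, mirroring if(Outlook == "overcast", "yes", "no")."""
    return "yes" if row["Outlook"] == "overcast" else "no"


def evaluate(rows, label="Play"):
    """Compare the rule output with the observed label.

    Returns the accuracy and a tally of (actual, predicted) pairs,
    i.e. a small confusion matrix.
    """
    confusion = Counter((row[label], rule_predict(row)) for row in rows)
    correct = sum(n for (actual, pred), n in confusion.items() if actual == pred)
    return correct / len(rows), confusion


# Hypothetical rows for illustration only (NOT the real Golf data set).
rows = [
    {"Outlook": "overcast", "Play": "yes"},
    {"Outlook": "sunny", "Play": "no"},
    {"Outlook": "rain", "Play": "yes"},
    {"Outlook": "sunny", "Play": "yes"},
]

accuracy, confusion = evaluate(rows)
print(f"accuracy = {accuracy:.2f}")
for (actual, pred), n in sorted(confusion.items()):
    print(f"actual={actual} predicted={pred}: {n}")
```

With labeled data collected over time, the off-diagonal counts are probably the more useful part: a rule that is too conservative would tend to pile up in one of the two error cells rather than spreading errors evenly.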

     
