"Applying generated weights to attributes"

schills Member Posts: 16 Contributor II
edited May 2019 in Help
Hello

I have just run a process involving Optimize Weights, and the output displays the optimal weights for my attributes.
My question is: how do I apply these weights to a new model/data set?

Also, when I select any of the attribute weighting operators, the output displays the weights I should use, but it doesn't actually apply those weights to my model/data. How do I apply these weights to a new set of data I am running through a neural net? I want to use past data and weight attributes accordingly to help predict outcomes for a new data set.

Any help on this topic will be greatly appreciated

Cheers guys

Schills

Answers

  • steffen Member Posts: 347 Maven
    Hello schills

    I think you are looking for this operator:  DataTransformation->Selection->Select by Weights.

    To include attribute weights directly in a model, the learning operator has to support this. For example, Neural Net does not; Naive Bayes could in theory, but as far as I can see no implementation in RapidMiner reflects this. Just check the input ports and descriptions of the various learners.

    Since you can store/load weights easily, I think you are able to figure out the application of weights to new data on your own (see it as a puzzle).

    hope this was helpful,

    steffen
  • schills Member Posts: 16 Contributor II
    Hi Steffen

    Thanks for your reply!

    The operator "Select by Weights" seems to only select attributes that have a weight and then use these attributes when I run the data through an SVM.
    However, the weights are not actually being applied to the data in my model.
    Generating the weights is not the issue; applying them to my model/data to give new results is.

    I tried to figure out how to apply the weights to my new data, but I still haven't solved the problem.
    Could you please let me know how to do this, and how to store/load weights?

    Thanking you in advance
    Schills
  • haddock Member Posts: 849 Maven
    Hi there,

    The "scale_by_weights" operator alters your data, like this...
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.1.001">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.0.000" expanded="true" name="Root">
        <description>&lt;p&gt; A simple and usually fast possibility to perform feature selection is to first calculate attribute weights from the given data set (in this process: Relief) and to apply an AttributeWeightSelection operator afterwards. This operator deselects all features not fulfilling a given weight relation. &lt;/p&gt; &lt;p&gt;This is usually referred to as &amp;quot;filter approach&amp;quot; since no other information than the data set is used. If the performance of a specific learner should be taken into account we refer to this as &amp;quot;wrapper approach&amp;quot;. The next sample processes give examples for different wrapper approaches for feature weighting, selection, and construction.&lt;/p&gt;</description>
        <process expanded="true" height="604" width="480">
          <operator activated="true" class="retrieve" compatibility="5.0.000" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
            <parameter key="repository_entry" value="//Samples/data/Polynomial"/>
          </operator>
          <operator activated="true" class="multiply" compatibility="5.1.001" expanded="true" height="94" name="Multiply" width="90" x="112" y="210"/>
          <operator activated="true" class="weight_by_relief" compatibility="5.0.000" expanded="true" height="76" name="Relief" width="90" x="179" y="30"/>
          <operator activated="true" class="scale_by_weights" compatibility="5.1.001" expanded="true" height="76" name="Scale by Weights" width="90" x="380" y="30"/>
          <connect from_op="Retrieve" from_port="output" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Relief" to_port="example set"/>
          <connect from_op="Multiply" from_port="output 2" to_port="result 2"/>
          <connect from_op="Relief" from_port="weights" to_op="Scale by Weights" to_port="weights"/>
          <connect from_op="Relief" from_port="example set" to_op="Scale by Weights" to_port="example set"/>
          <connect from_op="Scale by Weights" from_port="example set" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
        </process>
      </operator>
    </process>
    As Steff has pointed out, weights are properties of attributes, and can be imported and exported in the normal way.
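
To make the effect of scaling concrete, here is a minimal Python sketch of the idea behind the operator: each numeric attribute value is multiplied by the weight of its attribute. It illustrates the concept only and is not RapidMiner's implementation; the function name `scale_by_weights`, the attribute names `a1`/`a2`, and the weight values are all made up for the example.

```python
def scale_by_weights(rows, weights):
    """Multiply every attribute value by the weight of its attribute.

    Attributes without a weight are left unchanged (an assumption of this sketch).
    """
    return [
        {name: value * weights.get(name, 1.0) for name, value in row.items()}
        for row in rows
    ]


# Hypothetical weights, e.g. as produced by some weighting scheme.
weights = {"a1": 0.5, "a2": 2.0}

# Two example rows with the attributes a1 and a2.
data = [{"a1": 4.0, "a2": 1.0}, {"a1": 2.0, "a2": 3.0}]

scaled = scale_by_weights(data, weights)
print(scaled)
```

The weights stay a separate object (one value per attribute), and the scaling step is what turns them into changed data values for every row.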


  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,

    As with every object in RapidMiner, you can use the "Store" and "Retrieve" operators to save objects to and load them from the repository.
    If you want to export the weights to another application or something similar, you can use "Weights to Data" and export the resulting ExampleSet with one of the Write operators.

    Greetings,
      Sebastian
  • schills Member Posts: 16 Contributor II
    Ok, I must be missing the point, because I have spent hours on this and still can't figure it out.

    I understand how to store and retrieve the weights, and can set the role to "weight", but this doesn't help me apply the weights to the data. Am I right, or is there a way to apply weights to data through this method? There is only one input port on the learner operators, so I can't use both the retrieved weights and the retrieved data?

    If weights are properties of attributes, how do the weights get applied to all the data for each attribute, and not just the attribute itself?
    How does storing/loading weights allow you to apply these weights to new data and then run it through a training model? What is the step I am missing between loading (retrieving) the weights and applying them to new data, so that the data is changed according to the weights?

    Basically, if my overall model is ax+by+cz = G, where a, b, c are the weights, x, y, z are the attributes and G is the model's predicted output, I wish to be able to set a, b and c.

    The "Scale by Weights" operator seems to alter the data by applying the weights; however, when I run this through an SVM and apply the model to new data, the whole process does not work for some reason. Could this be because the SVM already applies weights to the data, so any additional weighting will not work?

    Any info would be appreciated
    Cheers guys
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,

    You can weight single attributes by applying the weights, as you have already found out. If you have new data and want to apply the model, you of course have to scale the new data with the same weights as well. Otherwise the new data will look completely different from the data the model was trained on.
    See the following process as an example:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.1.002">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.1.002" expanded="true" name="Process">
        <parameter key="parallelize_main_process" value="true"/>
        <process expanded="true" height="633" width="547">
          <operator activated="true" class="generate_data" compatibility="5.1.002" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
            <parameter key="target_function" value="polynomial classification"/>
          </operator>
          <operator activated="true" class="add_noise" compatibility="5.1.002" expanded="true" height="94" name="Add Noise" width="90" x="180" y="30">
            <list key="noise"/>
          </operator>
          <operator activated="true" class="x_validation" compatibility="5.0.000" expanded="true" height="112" name="Validation" width="90" x="447" y="30">
            <description>A cross-validation evaluating a decision tree model.</description>
            <process expanded="true" height="654" width="466">
              <operator activated="true" breakpoints="after" class="weight_by_relief" compatibility="5.1.002" expanded="true" height="76" name="Weight by Relief" width="90" x="45" y="30"/>
              <operator activated="true" class="multiply" compatibility="5.1.002" expanded="true" height="94" name="Multiply" width="90" x="45" y="210"/>
              <operator activated="true" class="scale_by_weights" compatibility="5.1.002" expanded="true" height="76" name="Scale by Weights" width="90" x="179" y="75"/>
              <operator activated="true" class="decision_tree" compatibility="5.0.000" expanded="true" height="76" name="Decision Tree" width="90" x="313" y="30"/>
              <connect from_port="training" to_op="Weight by Relief" to_port="example set"/>
              <connect from_op="Weight by Relief" from_port="weights" to_op="Multiply" to_port="input"/>
              <connect from_op="Weight by Relief" from_port="example set" to_op="Scale by Weights" to_port="example set"/>
              <connect from_op="Multiply" from_port="output 1" to_op="Scale by Weights" to_port="weights"/>
              <connect from_op="Multiply" from_port="output 2" to_port="through 1"/>
              <connect from_op="Scale by Weights" from_port="example set" to_op="Decision Tree" to_port="training set"/>
              <connect from_op="Decision Tree" from_port="model" to_port="model"/>
              <portSpacing port="source_training" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
              <portSpacing port="sink_through 2" spacing="0"/>
            </process>
            <process expanded="true" height="654" width="476">
              <operator activated="true" class="scale_by_weights" compatibility="5.1.002" expanded="true" height="76" name="Scale by Weights (2)" width="90" x="45" y="75"/>
              <operator activated="true" class="apply_model" compatibility="5.0.000" expanded="true" height="76" name="Apply Model" width="90" x="179" y="30">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="performance" compatibility="5.0.000" expanded="true" height="76" name="Performance" width="90" x="376" y="30"/>
              <connect from_port="model" to_op="Apply Model" to_port="model"/>
              <connect from_port="test set" to_op="Scale by Weights (2)" to_port="example set"/>
              <connect from_port="through 1" to_op="Scale by Weights (2)" to_port="weights"/>
              <connect from_op="Scale by Weights (2)" from_port="example set" to_op="Apply Model" to_port="unlabelled data"/>
              <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
              <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="source_through 2" spacing="0"/>
              <portSpacing port="sink_averagable 1" spacing="0"/>
              <portSpacing port="sink_averagable 2" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Add Noise" to_port="example set input"/>
          <connect from_op="Add Noise" from_port="example set output" to_op="Validation" to_port="training"/>
          <connect from_op="Validation" from_port="averagable 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="90"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    Greetings,
      Sebastian
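
The point about scaling new data with the same weights can also be shown outside RapidMiner. The following is a small Python sketch under stated assumptions: the weights, the two training rows, and the 1-nearest-neighbour "model" are all invented for illustration and stand in for whatever weighting scheme and learner the process actually uses.

```python
import math


def scale(row, weights):
    """Multiply each attribute value by the weight of its attribute."""
    return [value * weight for value, weight in zip(row, weights)]


def nearest_label(train_rows, train_labels, query):
    """Toy 1-nearest-neighbour model: return the label of the closest training row."""
    distances = [math.dist(row, query) for row in train_rows]
    return train_labels[distances.index(min(distances))]


# Hypothetical weights: the second attribute is considered far less important.
weights = [1.0, 0.1]

train = [[0.0, 0.0], [1.0, 10.0]]
labels = ["neg", "pos"]
new_row = [0.0, 4.0]

# The model is trained on scaled data.
scaled_train = [scale(row, weights) for row in train]

# Correct: the new row is scaled with the same weights before the model is applied.
consistent = nearest_label(scaled_train, labels, scale(new_row, weights))

# Mistake: the new row is passed in unscaled, so it lives on a different scale.
inconsistent = nearest_label(scaled_train, labels, new_row)

print(consistent, inconsistent)
```

In this toy setup the two calls disagree, which is the kind of silent mismatch that appears when the model is applied to data that was not scaled like the training data.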