"Brute Force Feature Selection"

pengie Member Posts: 21 Maven
edited June 2019 in Help
I am using the Brute Force Feature Selection operator to go through all combinations of my 10 attributes. Is there a way to create a file that lists, one row per combination, the attributes used and the corresponding performance, so that I can see exactly which combination of attributes gives what performance? E.g.

Features      Performance
A,B,C         0.8
B,D,G         0.6
C,D,E,F,G     0.7
B,E,H         0.9
...

I have tried using the ProcessLog operator, but the feature_names value of the Brute Force Feature Selection operator doesn't seem to work. It just outputs the value '?'.

Answers

  • cherokee Member Posts: 82 Guru
    Hi pengie!

    I think it should work with the log operator. Can you please post your process setup? That way we can see exactly what you are doing and help more appropriately.

    Best regards,
    chero
  • pengie Member Posts: 21 Maven
    Process setup for RapidMiner 4.5

    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="target_function" value="two gaussians classification"/>
            <parameter key="number_of_attributes" value="10"/>
        </operator>
        <operator name="BruteForce" class="BruteForce" expanded="yes">
            <operator name="NaiveBayes" class="NaiveBayes">
                <parameter key="keep_example_set" value="true"/>
            </operator>
            <operator name="ModelApplier" class="ModelApplier">
                <list key="application_parameters">
                </list>
            </operator>
            <operator name="BinominalClassificationPerformance" class="BinominalClassificationPerformance">
                <parameter key="main_criterion" value="youden"/>
                <parameter key="sensitivity" value="true"/>
                <parameter key="specificity" value="true"/>
                <parameter key="youden" value="true"/>
            </operator>
            <operator name="ProcessLog" class="ProcessLog">
                <parameter key="filename" value="log.txt"/>
                <list key="log">
                  <parameter key="features" value="operator.BruteForce.value.feature_names"/>
                  <parameter key="sensitivity" value="operator.BinominalClassificationPerformance.value.sensitivity"/>
                  <parameter key="specificity" value="operator.BinominalClassificationPerformance.value.specificity"/>
                </list>
            </operator>
        </operator>
    </operator>
    Process setup for RapidMiner 5 RC

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input>
          <location/>
        </input>
        <output>
          <location/>
          <location/>
        </output>
        <macros/>
      </context>
      <operator activated="true" class="process" expanded="true" name="Process">
        <process expanded="true" height="471" width="279">
          <operator activated="true" class="generate_data" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
            <parameter key="target_function" value="two gaussians classification"/>
            <parameter key="number_of_attributes" value="10"/>
          </operator>
          <operator activated="true" class="optimize_selection_brute_force" expanded="true" height="94" name="Optimize Selection (Brute Force)" width="90" x="179" y="30">
            <process expanded="true" height="489" width="679">
              <operator activated="true" class="naive_bayes" expanded="true" height="76" name="Naive Bayes" width="90" x="45" y="30"/>
              <operator activated="true" class="apply_model" expanded="true" height="76" name="Apply Model" width="90" x="179" y="30">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="performance_binominal_classification" expanded="true" height="76" name="Performance" width="90" x="313" y="30">
                <parameter key="main_criterion" value="youden"/>
                <parameter key="sensitivity" value="true"/>
                <parameter key="specificity" value="true"/>
                <parameter key="youden" value="true"/>
              </operator>
              <operator activated="true" class="log" expanded="true" height="76" name="Log" width="90" x="447" y="30">
                <parameter key="filename" value="log.txt"/>
                <list key="log">
                  <parameter key="features" value="operator.Optimize Selection (Brute Force).value.feature_names"/>
                  <parameter key="sensitivity" value="operator.Performance.value.sensitivity"/>
                  <parameter key="specificity" value="operator.Performance.value.specificity"/>
                </list>
              </operator>
              <connect from_port="example set" to_op="Naive Bayes" to_port="training set"/>
              <connect from_op="Naive Bayes" from_port="model" to_op="Apply Model" to_port="model"/>
              <connect from_op="Naive Bayes" from_port="exampleSet" to_op="Apply Model" to_port="unlabelled data"/>
              <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
              <connect from_op="Performance" from_port="performance" to_op="Log" to_port="through 1"/>
              <connect from_op="Log" from_port="through 1" to_port="performance"/>
              <portSpacing port="source_example set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_performance" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Optimize Selection (Brute Force)" to_port="example set in"/>
          <connect from_op="Optimize Selection (Brute Force)" from_port="example set out" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
  • haddock Member Posts: 849 Maven
    Hi there,

    Chero's right. You'll need to iterate over the attributes to build a macro and then log that, like this:
    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="target_function" value="two gaussians classification"/>
            <parameter key="number_of_attributes" value="10"/>
        </operator>
        <operator name="BruteForce" class="BruteForce" expanded="yes">
            <operator name="Reset" class="SingleMacroDefinition">
                <parameter key="macro" value="All"/>
                <parameter key="value" value="Using-"/>
            </operator>
            <operator name="FeatureIterator" class="FeatureIterator" expanded="no">
                <operator name="SingleMacroDefinition" class="SingleMacroDefinition">
                    <parameter key="macro" value="All"/>
                    <parameter key="value" value="%{All},%{loop_feature}"/>
                </operator>
            </operator>
            <operator name="NaiveBayes" class="NaiveBayes">
                <parameter key="keep_example_set" value="true"/>
            </operator>
            <operator name="ModelApplier" class="ModelApplier">
                <list key="application_parameters">
                </list>
            </operator>
            <operator name="BinominalClassificationPerformance" class="BinominalClassificationPerformance">
                <parameter key="main_criterion" value="youden"/>
                <parameter key="sensitivity" value="true"/>
                <parameter key="specificity" value="true"/>
                <parameter key="youden" value="true"/>
            </operator>
            <operator name="Macro2Log" class="Macro2Log">
                <parameter key="macro_name" value="All"/>
            </operator>
            <operator name="ProcessLog" class="ProcessLog">
                <list key="log">
                  <parameter key="features" value="operator.Macro2Log.value.macro_value"/>
                  <parameter key="sensitivity" value="operator.BinominalClassificationPerformance.value.sensitivity"/>
                  <parameter key="specificity" value="operator.BinominalClassificationPerformance.value.specificity"/>
                </list>
            </operator>
        </operator>
    </operator>
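
For readers without RapidMiner 4.5 at hand, the control flow of the process above can be mimicked in a few lines of Python. This is only a rough sketch: the function name `build_feature_macro` and the sample attribute names are made up for illustration, and it assumes that FeatureIterator visits exactly the attributes of the subset BruteForce is currently evaluating.

```python
def build_feature_macro(subset, prefix="Using-"):
    """Mimic the Reset + FeatureIterator steps of the process above.

    Start from the prefix, then append ',<attribute>' once for each
    attribute in the subset that is currently being evaluated.
    """
    macro = prefix                      # Reset: All = "Using-"
    for feature in subset:              # FeatureIterator loop
        macro = f"{macro},{feature}"    # All = "%{All},%{loop_feature}"
    return macro


# Hypothetical attribute names, chosen only for this example.
example = build_feature_macro(["att1", "att3", "att7"])
print(example)  # Using-,att1,att3,att7
```

Because the prefix ends in `-` and every appended piece starts with `,`, the logged string carries a leading `Using-,` before the first attribute name. The Reset step matters: without it, the macro would keep growing across subsets instead of describing only the current one.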
  • pengiepengie Member Posts: 21 Maven
    Dear Haddock,

    Thank you. It worked perfectly.  :)
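
Outside RapidMiner, the table requested in the original question (one row per attribute combination with its performance) can be sketched in plain Python. This is an illustration under stated assumptions, not a description of how the Brute Force operator is implemented: `all_subsets`, `log_subsets`, and `placeholder_score` are hypothetical names, the empty combination is assumed to be skipped, and the scoring function is a stand-in for actually training and testing a model.

```python
import csv
import io
from itertools import combinations


def all_subsets(features):
    """Yield every non-empty combination of the given features."""
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            yield subset


def log_subsets(features, evaluate, out):
    """Write one 'features, performance' row per subset to `out`.

    `out` is any file-like object; `evaluate` maps a subset to a score.
    Returns the number of subsets that were logged.
    """
    writer = csv.writer(out)
    writer.writerow(["features", "performance"])
    count = 0
    for subset in all_subsets(features):
        writer.writerow([",".join(subset), evaluate(subset)])
        count += 1
    return count


def placeholder_score(subset):
    # Placeholder only: stands in for training and testing a model
    # on the chosen attributes. It is not a real performance measure.
    return len(subset)


buffer = io.StringIO()
rows = log_subsets(["A", "B", "C"], placeholder_score, buffer)
print(rows)  # 7, i.e. 2**3 - 1 non-empty subsets of 3 features
```

With 10 attributes the same enumeration produces 2**10 - 1 = 1023 rows, so the log stays small enough to inspect by hand. To write to disk instead of a buffer, pass a file opened with `open("log.csv", "w", newline="")`.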