"Difference between Feature Selection and Feature Weighting"

Iwan Member Posts: 9 Contributor I
edited June 2019 in Help

Hi, I'm curious to know what the main difference is between feature selection and feature weighting.

Perhaps they are the same thing. Right now I'm working on a project with Optimize Weights (PSO). Is that the same as feature selection?

Could you explain the basic concepts behind Optimize Weights (PSO)? I don't have a clue about it.

Pointers to some good reading would be welcome.

 

Thanks.

Iwan

 

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Feature selection is usually driven by a performance measure. You try different combinations of your features and measure how well a model performs with each one. At the end of the iterations, the feature selection operator (there are a few) returns the combination of features with the best performance. This is typically a reduced set of features for your downstream processing.

     

    Feature weighting uses the selected algorithm (e.g. Weight by SVM) to estimate how much influence each feature has in a classification problem. You can then use a Select by Weights operator to keep the "top k" features for your downstream processing.
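
To make the contrast above concrete, here is a minimal Python sketch rather than anything RapidMiner does internally. The toy data, the class-separation score used as a weight, the nearest-centroid classifier and all function names are assumptions chosen for illustration. `weight_features` scores every feature independently, `top_k` imitates a select-by-weights step, and `forward_selection` searches feature subsets using a performance measure.

```python
import random
from statistics import mean, pstdev


def make_data(n=600, seed=0):
    """Toy data (assumption): features 0 and 1 carry signal, 2 and 3 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([label + rng.gauss(0, 0.5), label + rng.gauss(0, 1.0),
                  rng.gauss(0, 1.0), rng.gauss(0, 1.0)])
        y.append(label)
    return X, y


def weight_features(X, y):
    """Feature weighting: score each feature on its own, scale the best to 1.0."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        spread = (pstdev(a) + pstdev(b)) / 2 or 1.0
        scores.append(abs(mean(a) - mean(b)) / spread)
    top = max(scores) or 1.0
    return [s / top for s in scores]


def top_k(weights, k):
    """Keep the indices of the k features with the largest weights."""
    return sorted(range(len(weights)), key=lambda j: -weights[j])[:k]


def accuracy(features, X_tr, y_tr, X_te, y_te):
    """Hold-out accuracy of a nearest-centroid classifier on a feature subset."""
    centroids = {}
    for lab in (0, 1):
        rows = [r for r, l in zip(X_tr, y_tr) if l == lab]
        centroids[lab] = [mean(r[j] for r in rows) for j in features]
    hits = 0
    for r, l in zip(X_te, y_te):
        dist = {lab: sum((r[j] - c) ** 2 for j, c in zip(features, cen))
                for lab, cen in centroids.items()}
        hits += min(dist, key=dist.get) == l
    return hits / len(y_te)


def forward_selection(X_tr, y_tr, X_te, y_te):
    """Feature selection: grow the subset while the performance measure improves."""
    selected, best = [], 0.0
    remaining = list(range(len(X_tr[0])))
    while remaining:
        acc, j = max((accuracy(selected + [f], X_tr, y_tr, X_te, y_te), f)
                     for f in remaining)
        if acc <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = acc
    return selected, best


X, y = make_data()
X_tr, y_tr, X_te, y_te = X[:300], y[:300], X[300:], y[300:]

weights = weight_features(X_tr, y_tr)
print("weights:", [round(w, 2) for w in weights])
print("top 2 by weight:", top_k(weights, 2))
print("forward selection:", forward_selection(X_tr, y_tr, X_te, y_te))
```

Weighting returns one number per feature and leaves the cut-off to you, while selection returns a subset directly because the search itself decides what stays.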

  • Iwan Member Posts: 9 Contributor I

    Hi Thomas,


    Could you tell me more about Optimize Weights (PSO)?

    I know that its output is the feature weights. Does Optimize Weights (PSO) automatically pass the best weights on to the next step (perhaps the classification procedure), or do we have to use the "Select by Weights" operator?
    I really need the basic concepts behind Optimize Weights (PSO), and maybe an example.

    Many thanks for answering my question.

     

    Iwan

     

     

     

  • Thomas_OttThomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    The Optimize Weights (PSO) operator uses particle swarm optimization to generate the feature weights, and this type of method is used to reduce your attributes. It's typically a minimization technique. You can then export the selected features and their weights downstream in the process and use the weights to automatically select the same features from your scoring set. I do this quite a bit; see the attached example.

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.2.000">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.2.000" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="7.2.000" expanded="true" height="68" name="Retrieve Sonar" width="90" x="45" y="85">
    <parameter key="repository_entry" value="//Samples/data/Sonar"/>
    </operator>
    <operator activated="true" class="optimize_weights_pso" compatibility="7.2.000" expanded="true" height="103" name="Optimize Weights (PSO)" width="90" x="246" y="85">
    <process expanded="true">
    <operator activated="true" class="x_validation" compatibility="5.0.000" expanded="true" height="124" name="Validation (2)" width="90" x="45" y="34">
    <parameter key="sampling_type" value="2"/>
    <process expanded="true">
    <operator activated="true" class="parallel_decision_tree" compatibility="7.2.000" expanded="true" height="76" name="Decision Tree (2)" width="90" x="45" y="30"/>
    <connect from_port="training" to_op="Decision Tree (2)" to_port="training set"/>
    <connect from_op="Decision Tree (2)" from_port="model" to_port="model"/>
    <portSpacing port="source_training" spacing="0"/>
    <portSpacing port="sink_model" spacing="0"/>
    <portSpacing port="sink_through 1" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="apply_model" compatibility="5.0.000" expanded="true" height="76" name="Apply Model (3)" width="90" x="45" y="30">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="performance" compatibility="5.0.000" expanded="true" height="76" name="Performance (2)" width="90" x="179" y="30"/>
    <connect from_port="model" to_op="Apply Model (3)" to_port="model"/>
    <connect from_port="test set" to_op="Apply Model (3)" to_port="unlabelled data"/>
    <connect from_op="Apply Model (3)" from_port="labelled data" to_op="Performance (2)" to_port="labelled data"/>
    <connect from_op="Performance (2)" from_port="performance" to_port="averagable 1"/>
    <portSpacing port="source_model" spacing="0"/>
    <portSpacing port="source_test set" spacing="0"/>
    <portSpacing port="source_through 1" spacing="0"/>
    <portSpacing port="sink_averagable 1" spacing="0"/>
    <portSpacing port="sink_averagable 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">A cross-validation evaluating a decision tree model.</description>
    </operator>
    <operator activated="true" class="performance_attribute_count" compatibility="7.2.000" expanded="true" height="82" name="Performance (3)" width="90" x="179" y="34"/>
    <connect from_port="example set" to_op="Validation (2)" to_port="training"/>
    <connect from_op="Validation (2)" from_port="training" to_op="Performance (3)" to_port="example set"/>
    <connect from_op="Validation (2)" from_port="averagable 1" to_op="Performance (3)" to_port="performance"/>
    <connect from_op="Performance (3)" from_port="performance" to_port="performance"/>
    <portSpacing port="source_example set" spacing="0"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_performance" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="x_validation" compatibility="5.0.000" expanded="true" height="124" name="Validation" width="90" x="447" y="85">
    <parameter key="sampling_type" value="2"/>
    <process expanded="true">
    <operator activated="true" class="parallel_decision_tree" compatibility="7.2.000" expanded="true" height="76" name="Decision Tree" width="90" x="45" y="30"/>
    <connect from_port="training" to_op="Decision Tree" to_port="training set"/>
    <connect from_op="Decision Tree" from_port="model" to_port="model"/>
    <portSpacing port="source_training" spacing="0"/>
    <portSpacing port="sink_model" spacing="0"/>
    <portSpacing port="sink_through 1" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="apply_model" compatibility="5.0.000" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="performance" compatibility="5.0.000" expanded="true" height="76" name="Performance" width="90" x="179" y="30"/>
    <connect from_port="model" to_op="Apply Model" to_port="model"/>
    <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
    <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
    <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
    <portSpacing port="source_model" spacing="0"/>
    <portSpacing port="source_test set" spacing="0"/>
    <portSpacing port="source_through 1" spacing="0"/>
    <portSpacing port="sink_averagable 1" spacing="0"/>
    <portSpacing port="sink_averagable 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">A cross-validation evaluating a decision tree model.</description>
    </operator>
    <operator activated="true" class="retrieve" compatibility="7.2.000" expanded="true" height="68" name="Retrieve Sonar (2)" width="90" x="246" y="340">
    <parameter key="repository_entry" value="//Samples/data/Sonar"/>
    </operator>
    <operator activated="true" class="select_by_weights" compatibility="7.2.000" expanded="true" height="103" name="Select by Weights" width="90" x="514" y="340">
    <parameter key="weight" value="0.0"/>
    </operator>
    <operator activated="true" class="apply_model" compatibility="7.2.000" expanded="true" height="82" name="Apply Model (2)" width="90" x="715" y="340">
    <list key="application_parameters"/>
    </operator>
    <connect from_op="Retrieve Sonar" from_port="output" to_op="Optimize Weights (PSO)" to_port="example set"/>
    <connect from_op="Optimize Weights (PSO)" from_port="weights" to_op="Select by Weights" to_port="weights"/>
    <connect from_op="Optimize Weights (PSO)" from_port="example set" to_op="Validation" to_port="training"/>
    <connect from_op="Validation" from_port="model" to_op="Apply Model (2)" to_port="model"/>
    <connect from_op="Retrieve Sonar (2)" from_port="output" to_op="Select by Weights" to_port="example set input"/>
    <connect from_op="Select by Weights" from_port="example set output" to_op="Apply Model (2)" to_port="unlabelled data"/>
    <connect from_op="Apply Model (2)" from_port="labelled data" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
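
For readers who want the basic idea behind particle swarm optimization of feature weights, the following is a minimal textbook-style sketch in Python. It is not RapidMiner's implementation. The toy data, the weighted nearest-centroid fitness, the swarm parameters (`particles`, `iterations`, `inertia`, `c_personal`, `c_global`) and the 0.5 cut-off are all assumptions for illustration. Each particle is a candidate weight vector in [0, 1]; its fitness is the accuracy of a model that uses those weights, which plays the role of the inner validation subprocess in the process above.

```python
import random
from statistics import mean


def make_data(rng, n=400):
    """Toy data (assumption): features 0 and 1 carry signal, 2 and 3 are noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([label + rng.gauss(0, 0.5), label + rng.gauss(0, 1.0),
                  rng.gauss(0, 1.0), rng.gauss(0, 1.0)])
        y.append(label)
    return X, y


def make_fitness(X_tr, y_tr, X_te, y_te):
    """Return fitness(weights): hold-out accuracy of a weighted nearest-centroid model."""
    dim = len(X_tr[0])
    centroids = {}
    for lab in (0, 1):
        rows = [r for r, l in zip(X_tr, y_tr) if l == lab]
        centroids[lab] = [mean(r[j] for r in rows) for j in range(dim)]

    def fitness(weights):
        hits = 0
        for r, l in zip(X_te, y_te):
            dist = {lab: sum(w * (v - c) ** 2 for w, v, c in zip(weights, r, cen))
                    for lab, cen in centroids.items()}
            hits += min(dist, key=dist.get) == l
        return hits / len(y_te)

    return fitness


def pso_weights(fitness, dim, rng, particles=12, iterations=25,
                inertia=0.7, c_personal=1.5, c_global=1.5):
    """Maximise fitness over weight vectors in [0, 1]^dim with a basic swarm."""
    pos = [[rng.random() for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    p_best = [p[:] for p in pos]              # best position seen by each particle
    p_fit = [fitness(p) for p in pos]
    g = max(range(particles), key=p_fit.__getitem__)
    g_best, g_fit = p_best[g][:], p_fit[g]    # best position seen by the swarm
    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (p_best[i][d] - pos[i][d])
                             + c_global * r2 * (g_best[d] - pos[i][d]))
                # Clamp each weight to the [0, 1] range after the move.
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            f = fitness(pos[i])
            if f > p_fit[i]:
                p_fit[i], p_best[i] = f, pos[i][:]
            if f > g_fit:
                g_fit, g_best = f, pos[i][:]
    return g_best, g_fit


rng = random.Random(42)
X, y = make_data(rng)
fitness = make_fitness(X[:200], y[:200], X[200:], y[200:])
best_w, best_fit = pso_weights(fitness, dim=4, rng=rng)
keep = [j for j, w in enumerate(best_w) if w >= 0.5]  # hypothetical cut-off
print("best weights:", [round(w, 2) for w in best_w])
print("accuracy:", best_fit, "kept features:", keep)
```

The swarm only produces the weight vector. Turning it into a reduced attribute set is a separate thresholding step, which is why the process above wires the weights into Select by Weights before applying the model to the scoring data.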

     

     
