"Basic Feature Selection Process"

smackdown33 Member Posts: 19 Maven
edited May 2019 in Help
Hi, I've been trying to run the following XML process with various heap space settings, but nothing has worked. I keep getting GC errors or heap space errors. The dataset consists of 1300 examples with 7200 attributes, and after about 10 seconds these errors occur. Can someone please help?

<operator name="Root" class="Process" expanded="yes">
    <operator name="ArffExampleSource" class="ArffExampleSource">
        <parameter key="data_file" value="G:\Postgrad\Java\NetBean Projects\ImageUtil\lutImages\newPNGImages\10Colours\Features\features2.arff"/>
        <parameter key="datamanagement" value="short_sparse_array"/>
    </operator>
    <operator name="FS" class="FeatureSelection" expanded="yes">
        <operator name="FSChain" class="OperatorChain" expanded="yes">
            <operator name="XValidation" class="XValidation" expanded="yes">
                <operator name="Learner" class="LibSVMLearner">
                    <list key="class_weights">
                    </list>
                </operator>
                <operator name="ApplierChain" class="OperatorChain" expanded="yes">
                    <operator name="Applier" class="ModelApplier">
                        <list key="application_parameters">
                        </list>
                    </operator>
                    <operator name="Evaluator" class="Performance">
                    </operator>
                </operator>
            </operator>
            <operator name="ProcessLog" class="ProcessLog">
                <list key="log">
                  <parameter key="generation" value="operator.FS.value.generation"/>
                  <parameter key="performance" value="operator.FS.value.performance"/>
                </list>
            </operator>
        </operator>
    </operator>
</operator>

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    That's correct. Unfortunately, the current implementation of the feature selection needs way too much memory. Please take a look at this forum thread, where we already had the complete discussion: http://rapid-i.com/rapidforum/index.php/topic,1089.0.html

    Greetings,
      Sebastian
  • smackdown33 Member Posts: 19 Maven
    Hi Sebastian,

    Thanks for the response. I've read the other thread, but it didn't really get me anywhere. Do you or anyone else know the best way to select a reduced feature set from the 7200 attributes I have, and then train an SVM on that reduced set, given my computer's memory limitations?
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    Did you try the evolutionary Feature Selection?

    Greetings,
      Sebastian
  • smackdown33 Member Posts: 19 Maven
    Hi Sebastian,

    Is it the evolutionary feature aggregation that you are referring to? It's the only feature selection method I can find that is "evolutionary".

    Thanks again for your help.
  • smackdown33 Member Posts: 19 Maven
    Hi, below is the setup I most recently used, and I'm still not getting anything from it: it keeps running out of memory after at most 1 minute. If you can help me get this sorted out, you will be a lifesaver, Sebastian. Thanks for your help.
    <operator name="Root" class="Process" expanded="yes">
        <operator name="ArffExampleSource" class="ArffExampleSource">
            <parameter key="data_file" value="G:\Postgrad\Java\NetBean Projects\ImageUtil\lutImages\newPNGImages\10Colours\Features\features2.arff"/>
        </operator>
        <operator name="EvolutionaryFeatureAggregation" class="EvolutionaryFeatureAggregation" expanded="yes">
            <operator name="XValidation" class="XValidation" expanded="yes">
                <operator name="JMySVMLearner" class="JMySVMLearner">
                </operator>
                <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                    <operator name="ModelApplier" class="ModelApplier">
                        <list key="application_parameters">
                        </list>
                    </operator>
                    <operator name="Performance" class="Performance">
                    </operator>
                </operator>
            </operator>
            <operator name="ProcessLog" class="ProcessLog">
                <list key="log">
                </list>
            </operator>
        </operator>
    </operator>
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    Besides buying the mentioned plugin? Hmm. You could do an evolutionaryWeighting; this should switch off unimportant attributes by giving them weight 0. I hope that helps.

    Greetings,
      Sebastian
  • smackdown33 Member Posts: 19 Maven
    Hi Sebastian,

    Can you PM me the price of the plugin? Thanks.
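
Sebastian's evolutionaryWeighting suggestion could be set up roughly as below, reusing the data source, learner and validation chain from the process posted earlier in the thread. This is only a sketch under assumptions: the operator class name `EvolutionaryWeighting` is assumed from the spelling used in the reply, no parameters are set because none were given in the thread, and it has not been checked whether this fits into the available heap for 7200 attributes.

```xml
<operator name="Root" class="Process" expanded="yes">
    <operator name="ArffExampleSource" class="ArffExampleSource">
        <parameter key="data_file" value="G:\Postgrad\Java\NetBean Projects\ImageUtil\lutImages\newPNGImages\10Colours\Features\features2.arff"/>
    </operator>
    <!-- Assumption: class name taken from the reply above; all parameters left at their defaults. -->
    <operator name="EvolutionaryWeighting" class="EvolutionaryWeighting" expanded="yes">
        <!-- Inner chain copied from the earlier process: each candidate weighting is scored by cross-validating the SVM. -->
        <operator name="XValidation" class="XValidation" expanded="yes">
            <operator name="JMySVMLearner" class="JMySVMLearner">
            </operator>
            <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                <operator name="ModelApplier" class="ModelApplier">
                    <list key="application_parameters">
                    </list>
                </operator>
                <operator name="Performance" class="Performance">
                </operator>
            </operator>
        </operator>
    </operator>
</operator>
```

If this runs, the attributes that end up with weight 0 are the ones to drop before training the final SVM on the reduced set.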