Outlier Detection

annica Member Posts: 1 Contributor I
edited November 2018 in Help
Hello,
I am trying to use an outlier detection operator such as distance-based outlier detection.
I thought that if I apply this operator, the outliers would be ignored in further calculations.
But whether I apply the operator or not, I cannot see any difference in the results.
This is the relevant part of my XML:
<operator name="Root" class="Process" expanded="yes">
    <parameter key="logverbosity" value="warning"/>
    <operator name="Red Wine Example Data" class="ExampleSource">
        <parameter key="attributes" value="/home/annica/rm_workspace/projekt/wine.aml"/>
    </operator>
    <operator name="DistanceBasedOutlierDetection" class="DistanceBasedOutlierDetection">
        <parameter key="number_of_neighbors" value="2"/>
        <parameter key="number_of_outliers" value="14"/>
    </operator>
    <operator name="XValidation" class="XValidation" expanded="yes">
        <operator name="Training" class="OperatorChain" expanded="yes">
            <operator name="W-SMO" class="W-SMO">
            </operator>
            <operator name="ModelWriter" class="ModelWriter">
                <parameter key="model_file" value="/home/annica/rm_workspace/wineModel.mod"/>
            </operator>
        </operator>
        <operator name="Testing" class="OperatorChain" expanded="yes">
            <operator name="ModelApplier" class="ModelApplier">
                <list key="application_parameters">
                </list>
            </operator>
            <operator name="Performance" class="Performance">
            </operator>
        </operator>
    </operator>
</operator>

Am I doing something wrong?
Thanks for your help,
Annica

Answers

  • haddock Member Posts: 849 Maven
    Hi there,
    "I thought that if I apply this operator, the outliers would be ignored in further calculations."

    If you want to save time in RM, it is really important to read what little documentation is provided and not to make assumptions. This operator does not actually filter out the outliers; it only adds an attribute that indicates whether each example is an outlier, much as the info for the operator says:
    "The Operator takes an example set and passes it on with a boolean top-n D^k outlier status in a new boolean-valued special outlier attribute indicating true (outlier) and false (no outlier)."
    To see the point, just put a breakpoint after your outlier detection operator and you'll see the new column. I realise that the jargon may all seem a bit confusing, but it does get easier ;)
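
    As a minimal sketch of that breakpoint suggestion, the operator from the question could be marked as below. It assumes the `breakpoints="after"` attribute behaves on this operator the same way it is used on the ExampleFilter snippet in this thread; the parameter values are copied from the question, not recommendations.

    ```xml
    <!-- Pause after outlier detection so the example set, including the new outlier attribute, can be inspected -->
    <operator name="DistanceBasedOutlierDetection" class="DistanceBasedOutlierDetection" breakpoints="after">
        <parameter key="number_of_neighbors" value="2"/>
        <parameter key="number_of_outliers" value="14"/>
    </operator>
    ```

    When the process stops at the breakpoint, every example should still be present, with one extra column holding true or false.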

  • earmijo Member Posts: 270 Unicorn
    Just add an "ExampleFilter" operator after the outlier detection operator, so that only the examples not flagged as outliers are passed on:

        <operator name="ExampleFilter" class="ExampleFilter" breakpoints="after">
            <parameter key="condition_class" value="attribute_value_filter"/>
            <parameter key="parameter_string" value="Outlier=false"/>
        </operator>
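
    For orientation, here is a sketch of where the filter would sit in the process from the question: between the outlier detection and the cross-validation. It assumes the new attribute is named `Outlier`, as in the snippet above; if the breakpoint view shows a different name, use that one in `parameter_string`. The contents of XValidation are elided here and stay as in the question, and the XML comments are only for illustration.

    ```xml
    <operator name="DistanceBasedOutlierDetection" class="DistanceBasedOutlierDetection">
        <parameter key="number_of_neighbors" value="2"/>
        <parameter key="number_of_outliers" value="14"/>
    </operator>
    <!-- Keep only the examples whose outlier flag is false -->
    <operator name="ExampleFilter" class="ExampleFilter">
        <parameter key="condition_class" value="attribute_value_filter"/>
        <parameter key="parameter_string" value="Outlier=false"/>
    </operator>
    <operator name="XValidation" class="XValidation" expanded="yes">
        <!-- Training and Testing chains unchanged from the question -->
    </operator>
    ```

    With `number_of_outliers` set to 14, the example set reaching XValidation should be 14 examples smaller than the one loaded by ExampleSource.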