Outlier Detection

annica Member Posts: 1 Contributor I
edited November 2018 in Help
Hello,
I am trying to use an outlier detection operator, in this case distance-based outlier detection.
I thought that once I applied this operator, the outliers would be ignored in all further calculations.
But whether I apply it or not, I cannot see any difference.
This is the relevant part of my XML:
<operator name="Root" class="Process" expanded="yes">
    <parameter key="logverbosity" value="warning"/>
    <operator name="Red Wine Example Data" class="ExampleSource">
        <parameter key="attributes" value="/home/annica/rm_workspace/projekt/wine.aml"/>
    </operator>
    <operator name="DistanceBasedOutlierDetection" class="DistanceBasedOutlierDetection">
        <parameter key="number_of_neighbors" value="2"/>
        <parameter key="number_of_outliers" value="14"/>
    </operator>
    <operator name="XValidation" class="XValidation" expanded="yes">
        <operator name="Training" class="OperatorChain" expanded="yes">
            <operator name="W-SMO" class="W-SMO">
            </operator>
            <operator name="ModelWriter" class="ModelWriter">
                <parameter key="model_file" value="/home/annica/rm_workspace/wineModel.mod"/>
            </operator>
        </operator>
        <operator name="Testing" class="OperatorChain" expanded="yes">
            <operator name="ModelApplier" class="ModelApplier">
                <list key="application_parameters">
                </list>
            </operator>
            <operator name="Performance" class="Performance">
            </operator>
        </operator>
    </operator>
</operator>

Am I doing something wrong?
Thanks for your help,
Annica

Answers

  • haddock Member Posts: 849 Maven
    Hi there,
    "I thought that, if i apply this function this outliers are going to be ignored for further calculations."

    If you want to save time in RM, it is really important to read what little documentation is provided and not to make assumptions. This operator does not filter out the outliers; it just adds an attribute that indicates whether each example is an outlier, much as the operator's info text says:
    "The Operator takes an example set and passes it on with a boolean top-n D^k outlier status in a new boolean-valued special outlier attribute indicating true (outlier) and false (no outlier)."
    To see this, just put a breakpoint after your outlier detection operator, and you'll see the new column. I realise that the jargon may all seem a bit confusing, but it does get easier ;)
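
As a minimal sketch of that check, the outlier detection operator from the question could be given a breakpoint like this. The only change from the original process is the `breakpoints="after"` attribute, which is the same attribute used on the ExampleFilter operator elsewhere in this thread; the parameter values are simply copied from the question.

```xml
<!-- Same operator as in the question; the process pauses after it runs,
     so the example set with the added outlier attribute can be inspected. -->
<operator name="DistanceBasedOutlierDetection" class="DistanceBasedOutlierDetection" breakpoints="after">
    <parameter key="number_of_neighbors" value="2"/>
    <parameter key="number_of_outliers" value="14"/>
</operator>
```

Once the new column has been confirmed, the attribute can be removed again so the process runs through without stopping.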

  • earmijo Member Posts: 271 Unicorn
    Just add an "ExampleFilter" operator after the outlier detection operator:

        <operator name="ExampleFilter" class="ExampleFilter" breakpoints="after">
            <parameter key="condition_class" value="attribute_value_filter"/>
            <parameter key="parameter_string" value="Outlier=false"/>
        </operator>
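
Put together with the process from the question, the filter would sit between the outlier detection and the cross-validation, roughly as sketched below. This assumes the new attribute is named `Outlier`, as in the snippet above, and that the filter keeps only the examples matching the condition; it is worth confirming both with a breakpoint. The breakpoint on the filter is left out here so the process runs straight through, and everything else is copied from the original post.

```xml
<operator name="Root" class="Process" expanded="yes">
    <parameter key="logverbosity" value="warning"/>
    <operator name="Red Wine Example Data" class="ExampleSource">
        <parameter key="attributes" value="/home/annica/rm_workspace/projekt/wine.aml"/>
    </operator>
    <!-- Step 1: flag the top 14 outliers in a new attribute (nothing is removed yet) -->
    <operator name="DistanceBasedOutlierDetection" class="DistanceBasedOutlierDetection">
        <parameter key="number_of_neighbors" value="2"/>
        <parameter key="number_of_outliers" value="14"/>
    </operator>
    <!-- Step 2: keep only the examples that were not flagged
         (attribute name "Outlier" is assumed from the answer above) -->
    <operator name="ExampleFilter" class="ExampleFilter">
        <parameter key="condition_class" value="attribute_value_filter"/>
        <parameter key="parameter_string" value="Outlier=false"/>
    </operator>
    <!-- Step 3: cross-validation now only sees the filtered example set -->
    <operator name="XValidation" class="XValidation" expanded="yes">
        <operator name="Training" class="OperatorChain" expanded="yes">
            <operator name="W-SMO" class="W-SMO">
            </operator>
            <operator name="ModelWriter" class="ModelWriter">
                <parameter key="model_file" value="/home/annica/rm_workspace/wineModel.mod"/>
            </operator>
        </operator>
        <operator name="Testing" class="OperatorChain" expanded="yes">
            <operator name="ModelApplier" class="ModelApplier">
                <list key="application_parameters">
                </list>
            </operator>
            <operator name="Performance" class="Performance">
            </operator>
        </operator>
    </operator>
</operator>
```

With `number_of_outliers` set to 14, the example set reaching XValidation should be 14 examples smaller than the one loaded by ExampleSource, which is an easy way to check that the filter is actually taking effect.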