How to use outlier detection on different attributes (not at the same time)

dfischer Member Posts: 2 Contributor I
edited November 2018 in Help
Hi,

I'm using RapidMiner 4.6. In the SAMPLES directory, I found under the 03_Preprocessing folder a file called 18_OutlierDetection.xml.

I was wondering whether I could (1) run outlier detection on the first attribute only, (2) store the output in an attribute called outliers_att1 (only flag the rows, without filtering, because I want to preserve all rows for the next step), (3) move to the next attribute and run outlier detection on that column only, (4) store the output in an attribute called outliers_att2, and so on, and at the END filter the data to keep only the non-outliers.

Is that possible? Note that I don't want to run outlier detection on all attributes at the same time, and I don't want to filter at each step. The filter should be applied only at the end.

Thank you,

Fischer 

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    this is possible, but it needs a more complex process setup. You will need a FeatureIterator, which runs over all attributes and copies the name of each attribute into a macro. You can then use this macro in an AttributeSubsetSelector to calculate the outlierness of each example on that one attribute only. The outlier detection has to be placed inside the AttributeSubsetSelector.
    After this operator, you will have the complete example set with an additional outlier attribute. You can rename it using the macro, to something like attributeName_outlierness.
    I think you will get the idea if you read the documentation of all the mentioned operators.


    Greetings,
      Sebastian
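
For readers who want to see the logic of the workflow described above outside RapidMiner, here is a minimal Python sketch. It is not a RapidMiner process and does not reproduce any RapidMiner operator: the function names are hypothetical, the sample rows are made up for illustration, and a simple interquartile-range rule stands in for whichever outlier detection you actually use. It only shows the shape of the idea: loop over the attributes one at a time, write a flag column per attribute (outliers_att1, outliers_att2, ...), keep every row while doing so, and filter once at the end.

```python
from statistics import quantiles


def iqr_outlier_flags(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: the IQR rule is only a stand-in for the real detector.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v < low or v > high for v in values]


def flag_outliers_per_attribute(rows, attributes):
    """Add one 'outliers_<name>' column per attribute; no row is dropped."""
    flagged = [dict(row) for row in rows]  # copy so the input stays untouched
    for name in attributes:  # one attribute at a time, like the iterator
        flags = iqr_outlier_flags([row[name] for row in rows])
        for row, is_outlier in zip(flagged, flags):
            row[f"outliers_{name}"] = is_outlier
    return flagged


def keep_non_outliers(flagged_rows, attributes):
    """Final filter: keep rows not flagged on any of the attributes."""
    return [
        row
        for row in flagged_rows
        if not any(row[f"outliers_{name}"] for name in attributes)
    ]


# Made-up example data (hypothetical attribute names att1 and att2).
rows = [
    {"att1": 1.0, "att2": 10.0},
    {"att1": 1.2, "att2": 11.0},
    {"att1": 0.9, "att2": 95.0},
    {"att1": 1.1, "att2": 10.5},
    {"att1": 50.0, "att2": 9.5},
    {"att1": 1.0, "att2": 10.2},
]
attributes = ["att1", "att2"]

flagged = flag_outliers_per_attribute(rows, attributes)  # all rows preserved
kept = keep_non_outliers(flagged, attributes)            # filter only at the end

for row in kept:
    print(row)
```

Because each flag is computed from one column only, a row that is extreme in att2 does not affect the outlier decision for att1, which is the point of not running the detection on all attributes at once.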