# outlier in each class

Hi.

I have two questions about using LOF in RapidMiner:

1. Outlier methods are applied to all data regardless of the labels, while I want to apply them to each class separately. I mean that each time, LOF is calculated on just one class of the data.

2. How can I use the LOF result to filter outliers in RapidMiner?

2.1. The output of LOF is a number that has to be analyzed in order to remove outlier points. Points with a high LOF can be detected as outliers, provided that their relative distance to the other points differs drastically.

2.2. To reduce masking effects and find more potential outliers, LOF should be calculated iteratively after removing the potential outlier points.

To answer these questions I have written a program in R, but I wonder how I can do that in RapidMiner, because computing LOF is faster in RapidMiner than in R due to multithreading.

Regards

REZA


## Answers

I answered the first question in one of your other posts. It will work exactly the same way here.

To your second question: you could use an ExampleFilter to filter out all examples with an outlier factor beyond a threshold. The Iterating operator chain will allow you to execute it several times. You might need to store and retrieve the intermediate results between iterations.
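For readers who want to see the logic outside of RapidMiner, here is a minimal Python sketch of the same idea: score each class separately, drop everything above a threshold, and repeat until nothing more is flagged. It is only an illustration under assumptions. The function names (`lof_scores`, `iterative_filter`, `filter_per_class`), the neighbourhood size `k = 2` and the threshold `2.0` are made up for this example, and the LOF here is a simplified version that uses exactly k neighbours and ignores distance ties, so its numbers may differ from what the RapidMiner operator reports.

```python
import math
from collections import defaultdict

def lof_scores(points, k):
    """Simplified LOF: exactly k neighbours, distance ties ignored.
    Assumes more than k points and no group of k+1 identical points."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Indices of the k nearest neighbours of each point (the point itself excluded).
    neigh = [sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
             for i in range(n)]
    k_dist = [dist[i][neigh[i][-1]] for i in range(n)]
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neigh[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean density of the neighbours divided by the point's own density.
    return [sum(lrd[j] for j in neigh[i]) / (k * lrd[i]) for i in range(n)]

def iterative_filter(points, k, threshold, max_rounds=10):
    """Repeatedly remove points whose score exceeds the threshold."""
    kept, removed = list(points), []
    for _ in range(max_rounds):
        if len(kept) <= k:
            break  # too few points left to find k neighbours
        scores = lof_scores(kept, k)
        flagged = [p for p, s in zip(kept, scores) if s > threshold]
        if not flagged:
            break  # nothing new was flagged, so the result is stable
        removed += flagged
        kept = [p for p, s in zip(kept, scores) if s <= threshold]
    return kept, removed

def filter_per_class(rows, k=2, threshold=2.0):
    """Group (features, label) rows by label and filter each group on its own."""
    groups = defaultdict(list)
    for features, label in rows:
        groups[label].append(features)
    return {label: iterative_filter(pts, k, threshold) for label, pts in groups.items()}

# Toy data: two classes, each with one point far away from the rest of its class.
rows = [((x,), "a") for x in (0, 1, 2, 3, 4, 20)] + \
       [((x,), "b") for x in (100, 101, 102, 103, 104, 60)]
result = filter_per_class(rows)
kept_a, removed_a = result["a"]
print(removed_a, result["b"][1])
```

The threshold is fixed here to keep the sketch short. In practice it would probably have to be tuned per data set, or derived from the scores themselves.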

Greetings,

Sebastian

First of all, thanks.

The second part seems to need more clarification.

Suppose the LOF values are 5.3, 5.0, 4, 2, 1.95, 1.94, ..., 1. It is clear that the first three values (5.3, 5.0, 4) are by far further away than the other points (2, 1.95, 1.94, ..., 1),

so we can regard 5.3, 5.0 and 4 as outliers.

But to automate this:

1. Define a factor t as the relative-distance threshold for outliers (here t = 2) and a minimum LOF value mt (here mt = 2).

2. Calculate the distances between consecutive LOF values: d = 0.3, 1, 2, 0.05, 0.01, ..., 0.01.

If

d(i) >= t*d(i+1) and lof(i) >= mt (here, for i = 3: d(3) = 2 >= 2*d(4) = 0.1 and lof(3) = 4 >= 2)

d(i+1) < t*d(i+2) and lof(i+1) < 2

d(i+2) < t*d(i+3) and lof(i+2) < 2

......

then points 1:i are outliers.

I don't know how I can define the relative distance between neighboring points and compare them as above.

regards

REZA
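To make the gap idea concrete, here is a small Python sketch of a simplified variant of the rule. It is not exactly the rule from the post: instead of checking every later gap, it sorts the LOF values in descending order, cuts at the largest gap whose upper value is at least mt, and accepts the cut only if that gap is at least t times the next gap. The function name `gap_cut` and the default values are assumptions for illustration.

```python
def gap_cut(lof_values, t=2.0, mt=2.0):
    """Return the LOF values treated as outliers by a largest-gap rule."""
    ranked = sorted(lof_values, reverse=True)
    # gaps[i] is the distance between the i-th and (i+1)-th largest value.
    gaps = [a - b for a, b in zip(ranked, ranked[1:])]
    # A cut after position i is a candidate if the value above it is at
    # least mt and there is a following gap to compare against.
    candidates = [i for i in range(len(gaps) - 1) if ranked[i] >= mt]
    if not candidates:
        return []
    cut = max(candidates, key=lambda i: gaps[i])
    # Accept the cut only if the gap clearly dominates the next one.
    if gaps[cut] < t * gaps[cut + 1]:
        return []
    return ranked[:cut + 1]

# The example values from the post, deliberately given unsorted.
outliers = gap_cut([1.0, 5.3, 1.94, 4.0, 2.0, 5.0, 1.95])
print(outliers)
```

With the example values this returns the three largest values, which matches the intuition described in the post. The smallest returned value could then serve as the threshold in an example filter.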