
Outlier detection operators seem to work really slow with larger data sets

lanem Member, University Professor Posts: 29 Maven
edited November 2018 in Help

Hi

I have a data set of about 160,000 rows and 25 attributes. I am trying to detect outliers in the numeric variables using the Detect Outlier operators, but the process seems to take forever to run and sometimes simply runs out of memory.

Any advice on a more efficient way to identify outliers in a data set using RapidMiner Studio would be much appreciated.

Regards Michael

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Have you downloaded the Outlier Detection extension? Its operators are very fast, and it offers many more of them than core RapidMiner does.

  • lanem Member, University Professor Posts: 29 Maven

    Hi Thomas

    When I search the Marketplace for an Outlier Detection extension, it doesn't return any results. Am I using the wrong search term? I do have the Anomaly Detection extension installed.

    Regards Michael

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Ah, I meant the Anomaly Detection extension. OK, so you have it installed already. My guess is that the memory available to RapidMiner is not enough. How much memory do you have, and what is your license type? Community? Educational?

  • lanem Member, University Professor Posts: 29 Maven

    I have 16 GB of memory and am using an educational license of RapidMiner.

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Can you break it into subsets and iterate over those?
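
The subset idea can be sketched outside RapidMiner to show why it helps. This is not a RapidMiner process and not the extension's algorithm; it is a plain-Python illustration under stated assumptions. The function names `knn_scores` and `chunked_outlier_scores`, the chunk size, and the choice of a mean k-nearest-neighbour distance as the outlier score are all hypothetical. Scoring n rows in subsets of size m needs roughly n × m distance computations instead of n × n, and only one subset has to be held in working memory at a time.

```python
import math
import random


def knn_scores(points, k=5):
    """Score each point by its mean distance to its k nearest neighbours.

    Neighbours are searched only inside `points`, so the cost is
    quadratic in len(points), not in the size of the full data set.
    """
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        nearest = dists[:k]
        # A subset with a single row has no neighbours; give it a score of 0.0.
        scores.append(sum(nearest) / len(nearest) if nearest else 0.0)
    return scores


def chunked_outlier_scores(rows, chunk_size=1000, k=5, seed=0):
    """Shuffle the row indices, split them into subsets, score each subset alone.

    Returns one score per row, in the original row order.
    """
    order = list(range(len(rows)))
    # Shuffling (an assumption of this sketch) makes each subset a random
    # sample, so a subset is less likely to consist only of unusual rows.
    random.Random(seed).shuffle(order)
    scores = [0.0] * len(rows)
    for start in range(0, len(order), chunk_size):
        idx = order[start:start + chunk_size]
        subset = [rows[i] for i in idx]
        for i, s in zip(idx, knn_scores(subset, k)):
            scores[i] = s
    return scores


if __name__ == "__main__":
    rng = random.Random(42)
    # 1,999 synthetic "normal" rows plus one obvious outlier (made-up data).
    data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(1999)]
    data.append((50.0, 50.0))
    scores = chunked_outlier_scores(data, chunk_size=500)
    top = max(range(len(data)), key=scores.__getitem__)
    print("highest-scoring row index:", top)
```

One caveat of this design: each row is compared only with the other rows in its own subset, so the scores approximate what a full-data run would give. Larger subsets improve the approximation at the cost of more time and memory.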
