Slow Performance Issue with RapidMiner Outlier Detection

shredlegend88 Member Posts: 10 Contributor II
edited June 25 in Help

I have a recordset of just over 10,000 records with 8 columns. I tried using the outlier detection operator on it, and it is taking a very long time to run. I have tried the different outlier detection methods (LOF, COF, etc.), different numbers of neighbors, and other optional tweaks. I also allocated more RAM to the Java process and set the Java process to high priority, but nothing seems to have an impact. I wouldn't expect such a small dataset to take this long. I have the educational licensed version, if that helps.

If anyone has suggestions for improving the performance of this particular operator, it would be much appreciated.

Answers

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,131  RM Data Scientist

    Hey,

    did you use the outlier extension, and were dates included?


    ~Martin

    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • shredlegend88 Member Posts: 10 Contributor II

    I am not sure whether the base Studio product came with extensions included, but I just used the Outlier Detection operator. No dates were involved; the columns are mostly dummy variables plus a few continuous variables.

  • shredlegend88 Member Posts: 10 Contributor II

    I just let it run, and it took about 10 minutes or so. That is fine if I can walk away from it; I just thought it was weird for an enterprise data mining product to be so slow on such a small dataset.
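
For a rough sense of why 10,000 records is not necessarily "small" for this kind of operator: neighbor-based methods such as LOF need each record's nearest neighbors, and a brute-force search compares every record with every other one. Whether RapidMiner's operator works this way internally is an assumption here, not something confirmed in this thread. The sketch below is a plain-Python illustration, not RapidMiner code; the names `pairwise_count` and `lof_scores` are made up for the example, and it assumes no duplicate records (duplicates make the density undefined in this simple version).

```python
import math


def pairwise_count(n):
    """Distinct record pairs a brute-force neighbor search has to compare."""
    return n * (n - 1) // 2


def lof_scores(points, k):
    """Brute-force Local Outlier Factor for a small list of numeric tuples.

    Illustrative only: assumes len(points) > k and no duplicate points.
    """
    n = len(points)
    # Full distance matrix: this is the quadratic step that dominates runtime.
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbors = []  # indices of the k nearest neighbors of each point
    k_dist = []     # distance from each point to its k-th nearest neighbor
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(order[:k])
        k_dist.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = sum(max(k_dist[j], dist[i][j]) for j in neighbors[i])
        lrd.append(k / reach)

    # LOF: average neighbor density divided by the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]


if __name__ == "__main__":
    print(pairwise_count(10_000))  # 49995000 pairs for 10,000 records
    toy = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(lof_scores(toy, k=2))    # the last point gets the largest score
```

With roughly 50 million pairs over 8 columns, a run time of minutes rather than seconds is plausible for a non-indexed implementation, and extra RAM or process priority would not change the number of comparisons.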
