
FP-Growth takes excessively long with no results

mflorian Member Posts: 1 Contributor I
edited July 2019 in Help
I have a basic market basket analysis process, built by following the guidelines from the tutorial. When I run this analysis against my data set, the process can run for 8+ hours without completing. I am running it on a 64-bit Ubuntu box dedicated to data mining, with 4 GB of memory. My startup script sets the maximum Java memory to 2 GB, which I would have thought was sufficient for the job. My data contains only 100 rows of transaction data for 256 products (each of which is in its own column). This is a subset; the full data set is much larger.

When I let it run, memory usage climbs all the way to the maximum, but the process never completes. How can I optimize it so that it finishes? The XML for this process is pasted below.



Thanks for any and all help.

-Matt

Update: I added an attribute selector to the process so that it focuses only on positive binominal values, and that improved the performance.
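A rough illustration of why restricting the input to positive values might help, assuming the slowdown came from every column value (including the negative ones) being treated as an item. This is plain Python, not RapidMiner; the data is synthetic, and the names `to_transactions` and `max_subsets` are hypothetical helpers made up for this sketch.

```python
# Synthetic stand-in for the data described above:
# 100 transactions, 256 product columns, 3 products bought per transaction.
columns = [f"product_{i}" for i in range(256)]
rows = [[1 if i in (r, r + 1, r + 2) else 0 for i in range(256)]
        for r in range(100)]


def to_transactions(rows, columns, positive_only):
    """Convert one-hot rows into item sets.

    positive_only=True  keeps only the products that were bought.
    positive_only=False turns every cell into an item, e.g. "product_7=False".
    """
    transactions = []
    for row in rows:
        if positive_only:
            items = {col for col, value in zip(columns, row) if value}
        else:
            items = {f"{col}={bool(value)}" for col, value in zip(columns, row)}
        transactions.append(frozenset(items))
    return transactions


def max_subsets(transactions):
    """Number of non-empty item subsets contained in the longest transaction."""
    return 2 ** max(len(t) for t in transactions) - 1


positive = to_transactions(rows, columns, positive_only=True)
everything = to_transactions(rows, columns, positive_only=False)

print(len(positive[0]), max_subsets(positive))    # 3 items, 7 subsets
print(len(everything[0]), max_subsets(everything) == 2 ** 256 - 1)  # 256 items, True
```

With only positive values, each transaction holds 3 items. With every cell counted, each transaction holds 256 items, and the "not bought" items appear in almost every row, so they pass nearly any support threshold and the number of frequent itemsets can grow exponentially.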

Answers

  • TobiasMalbrecht Moderator, Employee, Member Posts: 295 RM Product Management
    Hi Matt,

    Could you please post the process XML (I do not see it anywhere) and attach some of your data so that we can try to verify your findings?

    Kind regards,
    Tobias