"Studio hanging on a large dataset with 3,000,000 rows"

MarlaBot Administrator, Moderator, Employee, Member Posts: 57 Community Manager
edited May 2019 in Help
A RapidMiner user wants to know the answer to this question: I am trying to build a model from the 311 Explorer data here https://connect.edmonton.ca/#!/view-data. My model gets stuck at about 5% and I think Studio may be crashing. Any ideas?

Answers

  • twentworth Member Posts: 8 Contributor II
    I tried the data in Auto Model and it seems to be hanging for me as well. 
  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    @twentworth I faced this issue earlier. Can you check how much RAM RapidMiner is using on your PC? Is it using all of the available memory?

    Thanks
    varun
    Regards,
    Varun
    https://www.varunmandalapu.com/

    Be Safe. Follow precautions and Maintain Social Distancing

  • twentworth Member Posts: 8 Contributor II
    @varunm1 CPU is pegged but memory is fine. I'm on a Mac. 
  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    @twentworth Can you tell me what you are trying to do in Auto Model? This works fine for me on Windows. It might be a specific error.

  • twentworth Member Posts: 8 Contributor II
    This error is from @taghaddo, maybe he can chime in?
  • taghaddo Member Posts: 6 Contributor I
    I am trying to use a prediction model, but it hangs at 13% progress for k-NN and at 5% for linear regression.

  • taghaddo Member Posts: 6 Contributor I
    It is using the full CPU.
  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    @taghaddo Can you post your XML code here? To copy the XML code, go to View --> Show Panel --> XML, then copy the whole code and paste it here. Also, I just want to confirm the size of the dataset: I downloaded it and it shows 360K samples, not 3 million. Am I correct?
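
One way to settle the row-count question outside of Studio is to count the records in the downloaded CSV directly. The following is only a minimal sketch: it assumes the export is a UTF-8 CSV with a single header row, and both the function name `summarize_csv` and the file name in the usage comment are hypothetical, not taken from the dataset page.

```python
import csv
import os


def summarize_csv(path):
    """Return (data_row_count, size_in_bytes) for a CSV file.

    Assumes the first record is a header row (an assumption about the export).
    """
    size_bytes = os.path.getsize(path)
    # newline="" lets the csv module handle quoted fields that contain line breaks,
    # so records are counted rather than raw text lines.
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header row, if any
        row_count = sum(1 for _ in reader)
    return row_count, size_bytes


# Usage (hypothetical file name):
# rows, size = summarize_csv("311_Explorer.csv")
# print(f"{rows:,} data rows, {size / 1_000_000:.1f} MB")
```

The file is streamed one record at a time, so this check should not need much memory even for a file of around 100MB.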

  • taghaddo Member Posts: 6 Contributor I
    yes, 3K.
  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager
    edited February 2019
    This is a 100MB CSV file, so I would expect a standard desktop/laptop to bog down here. Which feature are you trying to predict? I tried to predict "Service Code" (why not?) using Auto Model. Only Naive Bayes is recommended, as the others are too resource-intensive.

    @taghaddo can you please post your XML so we can see your process?

    Scott

    [EDIT: even NB will take a long time to run in RM 9.1, but it is very fast in RM 9.2 :wink: ]