large data sample handling issue

bingojosjtu Member Posts: 5 Contributor II
edited November 2018 in Help
Hi

I recently ran into a problem when using the outlier detection function (LOF, to be specific).

Condition:
My data set has about 178,000 samples and around 10-12 attributes.

My computer has 8 GB of RAM and an i7 2600 CPU. There is enough free hard disk space.

Scenario:
I let the program run overnight, but the next morning it reported that it could not complete the process because the computer's memory was too small for the task.
It stopped at the outlier detection step, which I know is a very slow process, but I did not expect it to refuse to complete because of memory size.

Question:
Q1: For a given sample size and number of attributes, how am I supposed to know the memory requirement (or upper limit) of a particular procedure beforehand?

Q2: Is there any way to solve this issue at this stage other than shrinking my data sample?

Q3: If I increase my RAM to 16 or 32 GB, will that solve the issue?
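
For reference, here is a rough back-of-envelope sketch of the arithmetic behind these questions. It rests on assumptions that are not confirmed for this LOF operator: that values are stored as 8-byte doubles, and that the worst case is an implementation that keeps the full pairwise distance matrix in memory. The function name, its parameters, and the neighbourhood size k = 20 are illustrative only.

```python
# Back-of-envelope memory estimate for a distance-based outlier detector.
# ASSUMPTIONS (not confirmed for any particular LOF implementation):
#   - every value is an 8-byte double
#   - worst case: the full n x n pairwise distance matrix is held in memory
#   - best case: only the k nearest-neighbour distances per sample are kept

GIB = 1024 ** 3  # bytes per GiB


def estimate_bytes(n_samples, n_attributes, k=20, bytes_per_value=8):
    """Return (data table, full distance matrix, k-neighbour table) sizes in bytes."""
    data_table = n_samples * n_attributes * bytes_per_value
    full_matrix = n_samples * n_samples * bytes_per_value
    knn_table = n_samples * k * bytes_per_value
    return data_table, full_matrix, knn_table


data_table, full_matrix, knn_table = estimate_bytes(178_000, 12)

print(f"data table:           {data_table / GIB:8.3f} GiB")
print(f"full distance matrix: {full_matrix / GIB:8.1f} GiB")
print(f"k-neighbour table:    {knn_table / GIB:8.3f} GiB")
```

Under these assumptions the data table itself is tiny (about 17 MB), while a full distance matrix for 178,000 samples would need roughly 236 GiB, far more than 8, 16, or 32 GB of RAM. If the operator only keeps the k nearest neighbours per sample, memory is not the bottleneck and the cost is mostly run time.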

BTW, I have submitted the job to the cloud server (32 GB version). I hope that with the help of your computing resources this issue can be solved.

Thank you!

RMer