Tackle large files

choose_username Member Posts: 33 Maven
Hello all,

I have a large data set (15 attributes and almost 50,000 records). The problem is: if I use the Detect Outlier operator, for example, RapidMiner needs a very long time to perform it. Is there a solution to this (I mean without using a different computer)? Or do I need to look for a new data set?

Thanks in advance



  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder

    Well, there is no general answer for this. There simply are some algorithms with long runtimes (like neural networks, relevance vector machines and, as it seems, also the outlier detection operator). In contrast to other data mining solutions, RapidMiner does not remove such algorithms, since they work quite well on smaller data sets (or faster machines  ;) ). Actually, there is not much you can do besides
    • using only a sample of the data
    • trying different schemes or approaches for your problem, in this case for outlier detection
    • checking whether the algorithm is available in a parallel working mode and using more than one CPU core
    • inspecting the source code and checking whether it can be optimized / parallelized, which we would then be happy to include in RapidMiner if you allow it
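    The first suggestion can be illustrated outside of RapidMiner. Distance-based outlier detection (the approach behind the Detect Outlier operator) compares every record against every other, which is O(n²) and is exactly what gets slow around 50,000 records. A common workaround is to score each record against a small random reference sample instead, cutting the cost to O(n·m) at some loss of precision. The sketch below is a hypothetical pure-Python illustration of that idea, not RapidMiner's actual implementation; all names and the synthetic 1-D data are made up for the example.

    ```python
    import heapq
    import random

    def knn_score(point, reference, k=5):
        """Mean distance from `point` to its k nearest neighbours in `reference`.
        Large scores indicate likely outliers (a simple distance-based score,
        similar in spirit to k-NN outlier detection)."""
        dists = heapq.nsmallest(k, (abs(point - q) for q in reference))
        return sum(dists) / k

    random.seed(0)
    # Synthetic 1-D data: 5,000 "normal" records plus two planted outliers.
    data = [random.gauss(0, 1) for _ in range(5_000)] + [25.0, -30.0]

    # Instead of comparing all n records pairwise (O(n^2)), score each record
    # against a random sample of m = 200 reference records (O(n * m)).
    reference = random.sample(data, 200)
    scores = [knn_score(p, reference) for p in data]

    # The two planted outliers receive by far the largest scores.
    top2 = sorted(range(len(data)), key=lambda i: scores[i], reverse=True)[:2]
    ```

    The trade-off: with a very small reference sample, rare records may look more (or less) unusual than they really are, so the sample size has to be tuned against the runtime you can afford.
    
    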
  • choose_username Member Posts: 33 Maven
    Thank you for your fast answer  :).  I think I will look for another data set.

