
What is the maximum amount of rows

Jeffersonjpa Member Posts: 5 Contributor I
What is the maximum number of rows you have already imported into RapidMiner? 10 million?

Best Answers

  • David_A Administrator, Moderator, Employee, RMResearcher, Member Posts: 297 RM Research
    Solution Accepted
    You mean, what's the largest data set you can work with?
    That depends heavily on your available hardware (storage space, RAM, ...), but other than that, there is no limit (assuming you don't hit your license limit). On my travel laptop with only 8 GB of RAM, I could easily create a test data set with 10 million rows of random data.
    But of course, once you actually start working with the data, the memory requirements and practical runtime limits become more complex.

    I hope that helps.
  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager
    Solution Accepted
    hi @Jeffersonjpa I don't think you're really going to get an answer to this question :smiley: Almost all of our customers use proprietary data, and hence we are not able to give you what you're looking for. I can, however, share this example of just how powerful the platform is - given enough resources. It is from an unnamed commercial customer running real data:

    Dataset: 1.5m examples (rows), 49 attributes (columns) of which 5 were nominal and 44 were numerical
    Hardware: cluster of 64 AMD Opteron 6380 processors (16 cores each, 2.5 GHz), 504GB RAM with 384GB swap

    Generalized Linear Model (GLM): runtime = 1 min 21 sec
    Deep Learning (H2O implementation): runtime = 7 min 29 sec

    User reported that all CPUs were "pegged" during this run with up to 180GB being consumed at times.

    Does this help? It's one example. Another data set with the same number of rows and columns could produce very different runtimes depending on what those rows and columns contain. All I'm trying to share is that RapidMiner will use pretty much whatever resources you throw at it.

    Scott
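As a rough illustration of the numbers in the example above, here is a back-of-envelope memory estimate for a tabular data set. This is my own sketch, not RapidMiner's actual internal storage model: it assumes every cell is stored as one 8-byte double-precision value and ignores object overhead, indexes, and nominal-value dictionaries, so real usage will be higher.

```python
def estimate_memory_gb(rows, numerical_cols, nominal_cols, bytes_per_value=8):
    """Rough lower bound on in-memory size: every cell as one 8-byte value.

    This is a simplification for illustration only; actual memory usage
    depends on the tool's internal representation.
    """
    total_cells = rows * (numerical_cols + nominal_cols)
    return total_cells * bytes_per_value / 1024**3  # bytes -> GiB


# The customer data set above: 1.5M rows, 44 numerical + 5 nominal columns
print(round(estimate_memory_gb(1_500_000, 44, 5), 2))  # ~0.55 GiB of raw values
```

The raw values are well under a gigabyte, which shows why the 180GB peak reported above is dominated by the algorithms (model state, intermediate copies, parallel workers) rather than by the data itself - and why row count alone says little about feasibility.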

Answers

  • David_A Administrator, Moderator, Employee, RMResearcher, Member Posts: 297 RM Research
    Hi,

    it depends on the type of license you are using.
    If you have a (30-day) trial or educational license, there is no row limit.
    The regular free license has a limit of 10k rows, and the commercial (paid) versions scale up from that limit, again up to unlimited rows.

    Best regards,
    David
  • Jeffersonjpa Member Posts: 5 Contributor I
    But what is the maximum number of rows you have already imported in production? I would like real examples.
  • David_A Administrator, Moderator, Employee, RMResearcher, Member Posts: 297 RM Research
    Do you need a single number (as in an extreme use case) or an average (as in a survey of typical data sizes)?
    As mentioned, a single maximum number (especially reduced to just the number of rows, without the number of columns or the applied algorithm) does not carry a lot of information.