Cache for ExampleSets?

harri678 · April 2010

Hi,

I have been wondering whether there is any way to cache ExampleSets between multiple runs. In my case, loading the sparse data files takes a lot of processing time on every run, even though the data files do not change, so some kind of caching would be great for speeding things up. Has this already been discussed, or is there another way to avoid reloading sparse files on every run besides SQL?

Greetings,
Harald

land · April 2010

Hi Harald,
Did you try saving it to the repository? That might speed things up a lot...
Caching is in fact an issue, but it is not planned for the client version of RapidMiner.

Greetings,
Sebastian

harri678 · April 2010

I ran a small benchmark, and "Read AML" on a sparse file is faster than store/retrieve via the repository.
Sparse file specs: 7200 examples, 155340 attributes (16 MB .dat, 11 MB .aml, approx. 90% sparse)

I used "Read AML" and "Store" to save the data into the repository, then ran several loading-only tests to rule out caching effects. These are the results:


          Retrieve Repo     Read AML (sparse)
1. run:   02:10             00:18
2. run:   02:03             00:19
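
For anyone who wants to experiment with the caching idea outside RapidMiner, here is a minimal sketch in Python. It is not RapidMiner code and does not read the AML format: the "index:value" line format, the function names `parse_sparse` and `load_cached`, and the `.cache.pkl` suffix are all assumptions made for illustration. The idea is simply to store the parsed result next to the source file and reuse it as long as the source's modification time and size are unchanged.

```python
import os
import pickle


def parse_sparse(path):
    """Parse a hypothetical 'index:value index:value ...' line format into
    one {attribute_index: value} dict per example. This is NOT the AML format."""
    rows = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            row = {}
            for pair in line.split():
                idx, val = pair.split(":")
                row[int(idx)] = float(val)
            rows.append(row)
    return rows


def load_cached(path, cache_path=None, parser=parse_sparse):
    """Return (data, cache_hit). Re-parse only when the source file changed."""
    cache_path = cache_path or path + ".cache.pkl"
    stat = os.stat(path)
    # Fingerprint of the source file: modification time plus size.
    fingerprint = (stat.st_mtime_ns, stat.st_size)
    try:
        with open(cache_path, "rb") as f:
            cached_fingerprint, data = pickle.load(f)
        if cached_fingerprint == fingerprint:
            return data, True  # cache hit: source unchanged since last parse
    except (OSError, pickle.PickleError, EOFError, ValueError, TypeError):
        pass  # missing or unreadable cache: fall through and re-parse
    data = parser(path)
    # Write to a temporary file first so an interrupted run never leaves
    # a half-written cache behind.
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump((fingerprint, data), f, protocol=pickle.HIGHEST_PROTOCOL)
    os.replace(tmp_path, cache_path)
    return data, False
```

Calling `load_cached("data.dat")` twice in a row should parse the file the first time and read the pickle the second time. Whether this is actually faster than re-parsing depends on the data, as the repository numbers above suggest, so it is worth timing both paths before relying on it.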
