RM 'PERFORMANCE' on 16 cores, 60GB RAM

Hi there

-Problem and setup:
I have roughly benchmarked RM on SUSE Linux (SLES) with 16 Xeon 3GHz cores and 60GB of RAM,
and compared it to my laptop running Win7 64-bit on a U9400 with 8GB of RAM. I used the learner RM sample #21 (leave-one-out),
with all parallelization either ON or all OFF.

There was only about a 75% speed improvement on the SUSE cluster over my 2-core ultra-low-voltage Intel system.   :(

The parallelization of the operators had no statistically significant effect on either platform.  :-\

a) Err... what is the parallelization function good for? There is absolutely no improvement over the regular run.
b) 32 Xeon 3GHz threads and 60GB of RAM versus 4 threads on a U9400 at 2.2GHz with 8GB, and the speedup is JUST 75%?!
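
To put rough numbers on (b), here is a back-of-envelope sketch in Python. It assumes "75% speed improvement" means the big machine ran about 1.75x as fast as the laptop, and it treats "threads x clock speed" as a crude stand-in for ideal scaling, which ignores memory, I/O, and how much of the process is actually parallel. The function names are made up for illustration only.

```python
def observed_speedup(improvement_percent):
    """Convert a 'percent faster' figure into a speedup factor (assumed reading)."""
    return 1.0 + improvement_percent / 100.0


def naive_ideal_speedup(threads_big, ghz_big, threads_small, ghz_small):
    """Crude upper bound: ratio of (threads x clock) between the two machines."""
    return (threads_big * ghz_big) / (threads_small * ghz_small)


# Figures taken from the post: 32 threads at 3GHz vs 4 threads at 2.2GHz.
observed = observed_speedup(75)
ideal = naive_ideal_speedup(32, 3.0, 4, 2.2)

print(f"observed speedup:    {observed:.2f}x")
print(f"naive ideal speedup: {ideal:.2f}x")
print(f"fraction of ideal:   {observed / ideal:.0%}")
```

Under those assumptions the measured gain is only a small fraction of what the raw thread and clock counts would suggest, which is the gap I am asking about.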

What is the benefit of task parallelization that does NOT work?

Why is RM showing a performance increase of <1 between two systems that are orders of magnitude apart in FLOPS?

