
"slow on high end machines"

the_duckman Member Posts: 6 Contributor II
edited May 2019 in Help
Hi,

I have a bunch of data to process. I was generally happy with performance on a quad-core system but still needed more grunt, so I purchased two high-end systems.

One is a 6-core AMD 1090T.
The other is the new 8-core Intel thing (cost nearly $1000 for the CPU alone).

Both machines have 8 GB of RAM and both are running Win 7 Ultimate. The problem is that they both tank when running RapidMiner (the old quad core is faster). The same problem occurs across the board (using the parallel plugin): I cannot get good CPU utilization for feature selection, x-validation, a-nn, k-NN, etc. The Intel is 12% max and the AMD is 18% max.

I never had this problem on the quad-core Vista box...
I feel like I have thrown away a lot of money as the machines stay almost idle.

Any ideas or advice would be much appreciated.
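
As a quick sanity check on the figures above, whole-machine utilization can be converted into "fully busy core equivalents". This is only a sketch: the function name is made up for illustration, and the core counts are taken as stated in the post (an assumption, since the exact CPU models are not all given).

```python
def busy_core_equivalents(total_utilization_pct: float, cores: int) -> float:
    """Convert whole-machine CPU utilization (%) into a number of fully busy cores."""
    return total_utilization_pct / 100.0 * cores


# Figures as reported in the post: (label, cores, peak utilization in %).
reports = [
    ("Intel, 8 cores", 8, 12.0),
    ("AMD 1090T, 6 cores", 6, 18.0),
]

for label, cores, pct in reports:
    equivalents = busy_core_equivalents(pct, cores)
    print(f"{label}: {pct}% is roughly {equivalents:.2f} busy cores")
```

Both machines come out at roughly one busy core, which is what a workload running on a single thread would look like, even if the operating system spreads that thread's time evenly across all cores.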

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    it seems you are just utilizing one core. Do you use the same process and Java version as on the quad core? Is the CPU utilization 100% there?

    You might think about subscribing to our RapidMiner Support to ensure that the cost of the hardware isn't wasted...

    Greetings,
      Sebastian
  • the_duckman Member Posts: 6 Contributor II
    It does sound like one core, but all cores report even(ish) utilization. So it looks more like 1 thread.

    I installed Ubuntu on the AMD box and ran the Java version of RapidMiner. This improved the utilization, but it is still only 34%.

    I don't know about paying for a remote support service. If a product is too difficult for me to maintain, I find something easier.

    What would have hit the spot is openly priced turnkey product solutions.
    E.g., purchase computing time on hosted RapidAnalytics servers, so I can just use the run-remotely option on a server elsewhere.
    Or a RapidMiner/RapidAnalytics virtual appliance that I can just buy online (shopping cart, not "enquire for more info") and run with VMware viewer, no setup, no messing around...
    That I would not have hesitated to buy.

    Anyway, it seems the common factor is going >4 cores, as I tried some other systems with < 4 cores. I know Java threading can be a nightmare to support in the wild. Sometimes a little option box to use alternate threading can let people sort out the issue for themselves. Anyway, I am about to try several JREs and will let you know the results.
  • the_duckman Member Posts: 6 Contributor II
    On limited testing, mainly in feature selection, I found the open JRE was faster than the Sun version by 20%. This is mainly attributable to the open JRE having better core utilization, as indicated by my monitoring utilities.


    Still no "good" performance in >4 core systems.
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    unfortunately I cannot go deeper into this matter right now, but of course our support would be able to help you. One could, for example, establish a WebEx session so that you can turn over control to us and we can run diagnostics.

    And of course we could offer you a hosted RapidAnalytics server ready for use. Just contact me for details or a quote. But there are only a few companies out there willing to ship their data over the public internet, even over HTTPS-secured connections.

    Greetings,
      Sebastian
  • dragoljub Member Posts: 241 Contributor II
    Why don't you spawn 100s of threads? It will easily max out any CPU. If you have enough RAM to keep, say, 32 learning operators in memory, by all means run 32 threads in parallel.

    I am using LSF (Load Sharing Facility) and can run 100s of threads no problem...

    -Gagi
  • dan_agape Member Posts: 106 Maven
    Hi Duckman,

    Can you please keep us posted once you have solved your problem?

    Thanks,
    Dan
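
Gagi's suggestion above ties the number of parallel threads to how many learning operators fit in RAM at once. A rough way to estimate that bound is sketched below. Everything here is hypothetical: the function name, the 1024 MB reserve for the OS and JVM overhead, and the 200 MB per-learner footprint are illustrative assumptions, not measured RapidMiner figures.

```python
def max_parallel_learners(ram_mb: int, per_learner_mb: int, reserve_mb: int = 1024) -> int:
    """Upper bound on learners that fit in memory at the same time.

    ram_mb:         total physical RAM in MB
    per_learner_mb: assumed memory footprint of one learner in MB (hypothetical)
    reserve_mb:     memory kept free for the OS and other overhead (hypothetical)
    """
    if per_learner_mb <= 0:
        raise ValueError("per_learner_mb must be positive")
    usable_mb = max(ram_mb - reserve_mb, 0)
    return usable_mb // per_learner_mb


# An 8 GB machine as in the original post, with an assumed 200 MB per learner.
bound = max_parallel_learners(ram_mb=8192, per_learner_mb=200)
print(f"Memory allows at most {bound} learners in parallel")
```

Under these assumed numbers the 8 GB machines could hold the 32 learners mentioned in the thread; with a larger per-learner footprint the bound drops quickly, so the real figure has to be measured on the actual process.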