Data Mining PC and Benchmarking

MBA_Data_Miner Member Posts: 21 Contributor II
Howdy folks,


I am interested in what kind of hardware different users are running data mining software on. I have been benchmarking different hardware and am very interested in comparing notes with other forum members. Additionally, do you have any recommendations for data mining computer specs? Are you finding that higher-end systems are substantially faster?
My results so far show a late-model MacBook Pro as the fastest machine out of the few notebooks tested. I am also testing a desktop for comparison.

Please advise and comment,

Best regards, J.

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,507 RM Data Scientist
    Hi,

    We consultants at RapidMiner mostly use fairly standard Lenovo ThinkPads. The only remarkable thing is that they have quite a lot of memory. If you really need to do something with a lot of CPU/RAM load, you simply switch over to RM Server or the cloud. Personally, I keep basically all of my processes on servers.

    Cheers,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • MBA_Data_Miner Member Posts: 21 Contributor II
    That does help. Alas, the probability of me having access to a server in the near future is low, so I will have to work with single systems. I have tested a couple of different computers now, and the MacBook has beaten everything so far. Second place goes to an Asus G751 gaming laptop. The Asus has 24 GB of RAM, which seems to help quite a bit.

    I have one last system to test, a circa 2012 custom gaming desktop. It has a great processor but less RAM than the Asus.

    The process being tested uses a balanced and binned dataset of 5,000 examples. A parameter optimization runs around a 10-fold cross-validation of a decision tree, and five decision tree parameters are optimized (grid, parallel).
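To give a feel for why this kind of process is CPU-heavy, here is a rough sketch that counts how many decision trees a grid search around cross-validation has to train. It assumes "10x" means 10 folds, and the parameter names and candidate values are purely illustrative assumptions, not the grid actually used in the process described above.

```python
from itertools import product

# Hypothetical grid: five decision tree parameters with three candidate
# values each. Names and values are illustrative assumptions only.
GRID = {
    "criterion": ["gain_ratio", "information_gain", "gini_index"],
    "maximal_depth": [5, 10, 20],
    "minimal_leaf_size": [1, 2, 5],
    "minimal_size_for_split": [2, 4, 8],
    "confidence": [0.1, 0.25, 0.5],
}
FOLDS = 10  # assumed: "10x cross validation" read as 10 folds


def count_fits(grid, folds):
    """Return (parameter combinations, trees trained across all folds)."""
    combos = sum(1 for _ in product(*grid.values()))
    return combos, combos * folds


if __name__ == "__main__":
    combos, fits = count_fits(GRID, FOLDS)
    print(f"{combos} combinations x {FOLDS} folds = {fits} tree fits")
```

With three values per parameter this already comes to 243 combinations and 2,430 tree fits, so the run time depends strongly on how many cores the parallel grid can use.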
  • MBA_Data_Miner Member Posts: 21 Contributor II
    Just a thought: it would be amazing if RapidMiner created a simple benchmark program for users to run, with a few datasets of different sizes and a few algorithms to test. Think along the lines of CPU benchmarking programs, but specifically for data mining. It would be great to be able to test and upload results for different hardware/OS configurations and share them with people around the world. I use benchmarks published online a lot for comparing PC performance.

    Any comments or thoughts on this?
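As a very rough illustration of the idea, a shareable benchmark could be as simple as the sketch below. This is not an existing RapidMiner tool. The `benchmark` function, the result fields, and `toy_workload` are all hypothetical names, and the toy workload is just a stand-in for a real data mining process.

```python
import platform
import statistics
import time


def benchmark(workload, repeats=5):
    """Time a callable several times and return a small result record."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return {
        # Basic hardware/OS description so results can be compared
        "system": platform.system(),
        "machine": platform.machine(),
        "python": platform.python_version(),
        "repeats": repeats,
        "median_s": statistics.median(timings),
        "min_s": min(timings),
    }


def toy_workload():
    # Placeholder CPU-bound task; a real benchmark would run a fixed
    # data mining process on a fixed dataset instead.
    sum(i * i for i in range(200_000))


if __name__ == "__main__":
    print(benchmark(toy_workload))
```

Reporting the median over several runs, rather than a single timing, makes results from different machines a bit easier to compare.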
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,507 RM Data Scientist
    Hi,

    Honestly, performance is not the point. Most users will have a server somewhere and use it when things get CPU-heavy.
    Regarding the decision tree: did you use RM version 6.3 or later? We optimized the Decision Tree in version 6.3.

    Your account is registered to a .edu address. Are you aware of our academic program? It gives you the opportunity to get a server license. A server can run on any machine that runs Java (Linux/Mac/Windows).

    Best,
    Martin
  • MBA_Data_Miner Member Posts: 21 Contributor II
    I wasn't sure whether I could even use the academic program because of my role: I am an employee (institutional analyst) at a higher education institution rather than a student or professor. I've been using the free community edition (5.3) of Studio.