Should I give up yet?

Legacy User Member Posts: 0 Newbie
edited November 2018 in Help
Hi everyone,

I'm very interested in datamining using RapidMiner, but I feel that I am ready to give up. I'm enjoying learning RapidMiner (I've seen some excellent blogs/videos), but I'm just not a stats guy or statistician by any means... I think I know the basics, but in no way do I feel that I can actually run/solve business scenarios and present them to the boss.

Many videos are great at showing "how to use" RapidMiner; however, I haven't seen much that explains the scenario results or the statistics behind arriving at them.

Q: How much of a statistician must one be to apply RapidMiner to business scenarios?  I've reviewed several stats books, but they just seem overwhelming.

Any suggestions, comments, directions would be much appreciated.  



    JEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 578 Unicorn
    "I'm just not a stats guy or statistician by any means." - Welcome to the club!    8)
    & certainly don't give up yet. 
    I believe what you're asking is: "How can I use datamining to solve business problems and present the results to management in a way that they understand, so they can make/save money and then heap praise & wealth upon me?" 
    - There are sadly no off-the-shelf guides for this yet.  (Perhaps the fabled RapidMiner Textbook will show this, Ingo? :))

    However, as a starting point I'd recommend studying CRISP-DM.  It's a simple methodology for approaching datamining projects which I find really helpful & which doesn't need any statistical background.  (Yay!)  Here's a link to a PDF with info on it, but there are plenty of other resources on it out there.  http://www.the-modeling-agency.com/crisp-dm.pdf
    When you are looking at the first phase, Business Understanding, think about how you want the result to look when it's deployed (presented).  You might find something different than you expect & need to change things as you go on, but if you have a framework to hang the datamining on, it's much less daunting than going blind into loads of data to 'find some insights'. 

    With many of the algorithms in modelling I find it's very difficult to explain to management how it works & they don't really want to know.  I'm studying as much as I can understand to learn for my own benefit, but in the main people want something that works for them.  Be it a better dataset for a marketing campaign, a new category of customer or a pretty graph showing how much revenue they can expect to achieve by the end of the year - most don't really care how it was created, as long as it works. 
    Pro tip: have some formulae books open on your desk & learn the Wikipedia explanations for some of the models you use, in case the boss does ask you for more detailed info on how it works. If you can make their eyes glaze over as you talk, then you've succeeded! 

    robert Member Posts: 14 Contributor II
    I took a quick look at the CRISP-DM PDF, and the material looks dry and methodology-like, and does not seem to contain intuitions and explanations (maybe I gave up too quickly). Indeed, it looks like what the introduction describes: an attempt in the year 2000 to formalize data processing steps. Hats off to those who find it useful, but it does not feel like an inspired piece of work.

    Also, if a domain like data mining requires knowledge of some statistics, it would not be optimal to ignore statistics. I would rather recommend lecture videos from Khan Academy and the like on statistics (obviously), probability calculations, linear algebra and calculus, one lesson at a time. There are video lectures from other sources too.  Do a lot of browsing on the Internet for the particular domain of your work, and you'll find not only research papers that are over your head, but also papers, guides etc. that give some domain-specific insight and best practices.

    Legacy User Member Posts: 0 Newbie

    While I'm not sure if there'll be any "heap praise & wealth" part to it, I do appreciate your thoughts and the CRISP-DM doc link.  I took a glance at it, and it seems to be a great resource for getting one's thought process organized. I appreciate you sharing your approach as well.

    Legacy User Member Posts: 0 Newbie

    Thanks for your reply...  points well taken.  No question, there is quite a learning curve.
