Gaussian Naive Bayes Formula on RapidMiner

ikayunida123 Member Posts: 17 Contributor II
edited September 2019 in Help

Hello everyone!

I know that RapidMiner uses a Gaussian distribution in Naive Bayes. But when I compare the result I calculated manually with the result from RapidMiner, they are very different. So I am wondering whether RapidMiner uses a different formula or whether I simply calculated it wrong.

I use this formula to calculate the mean: (1/n) * sum(xi), and this one to calculate the variance: (1/(n-1)) * sum((xi - mean)^2).
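
For anyone who wants to reproduce the manual calculation outside Excel, here is a minimal Python sketch of the two formulas above, plus the standard Gaussian density that a Gaussian Naive Bayes model evaluates for each numeric attribute. The sample values and the function names are hypothetical, and nothing here is taken from RapidMiner's source, so treat it as a reference calculation rather than as RapidMiner's implementation.

```python
import math


def mean(values):
    """Arithmetic mean: (1/n) * sum(xi)."""
    return sum(values) / len(values)


def sample_variance(values):
    """Variance with the n-1 denominator: (1/(n-1)) * sum((xi - mean)^2)."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)


def gaussian_density(x, mu, variance):
    """Standard normal density with mean mu and the given variance."""
    return math.exp(-((x - mu) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)


# Hypothetical values of one numeric attribute within one class.
temperatures = [64.0, 68.0, 70.0, 72.0]

mu = mean(temperatures)
var = sample_variance(temperatures)
density_at_66 = gaussian_density(66.0, mu, var)

print(mu, var, density_at_66)
```

If the mean and variance computed this way match your spreadsheet but not RapidMiner, the difference probably lies somewhere other than these two formulas.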

I want to know which formula RapidMiner uses for Gaussian NB. Is it the same as the formulas I use above?

Thank you.


Answers

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn

    Hi @ikayunida123,

     

    You can find an Excel file used to calculate the probabilities from the "Golf" dataset using NB formulas by following this link.

    If you obtain different results, it may be because RapidMiner calculates the probabilities with Laplace correction by default and you calculated them without Laplace correction.

     

    Regards,

     

    Lionel 
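
To illustrate what a Laplace correction does to the probabilities of a nominal attribute, here is a small Python sketch of generic add-one smoothing. The data, the category list and the `alpha` parameter are hypothetical, and the exact formula RapidMiner applies is not stated in this thread, so this only shows the general idea of why corrected and uncorrected results differ.

```python
from collections import Counter


def smoothed_probabilities(values, categories, alpha=1.0):
    """Estimate P(category) with additive smoothing.

    alpha=0 gives the plain relative frequency; alpha=1 is the classic
    add-one (Laplace) correction.
    """
    counts = Counter(values)
    total = len(values) + alpha * len(categories)
    return {c: (counts[c] + alpha) / total for c in categories}


# Hypothetical nominal attribute values observed within one class.
observed = ["overcast", "overcast", "rain", "sunny"]
categories = ["overcast", "rain", "sunny", "fog"]  # "fog" never occurs

plain = smoothed_probabilities(observed, categories, alpha=0.0)
laplace = smoothed_probabilities(observed, categories, alpha=1.0)

print(plain)
print(laplace)
```

Without the correction the unseen category gets probability 0, which would zero out the whole Naive Bayes product; with it, every category keeps a small non-zero probability and all the other probabilities shift slightly.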

  • ikayunida123 Member Posts: 17 Contributor II

    Hello @lionelderkrikor. It works nicely, thank you.

    But some values still differ. For example, RapidMiner displays some of the standard deviations as 0.001, but in MS Excel they come out as 0. I wonder whether RapidMiner and MS Excel calculate them differently?
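
One thing worth checking here: a standard deviation of exactly 0 (all values in a class identical) makes the Gaussian density undefined, because the formula divides by it. Implementations therefore sometimes clamp the standard deviation to a small positive floor. Whether RapidMiner does this is an assumption that is not confirmed anywhere in this thread, and the `min_sigma` parameter below is purely hypothetical. The Python sketch only shows why such a floor would make a tool report a small value where a spreadsheet reports 0.

```python
import math
import statistics


def gaussian_density(x, mu, sigma, min_sigma=1e-3):
    """Normal density with sigma clamped to a hypothetical floor.

    The floor avoids a division by zero when every value in a class is
    identical. The value 1e-3 is an illustrative assumption only.
    """
    sigma = max(sigma, min_sigma)
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# Hypothetical attribute whose values are all the same within one class.
constant_values = [5.0, 5.0, 5.0]

raw_sigma = statistics.stdev(constant_values)  # 0.0, as a spreadsheet would show
density = gaussian_density(5.0, 5.0, raw_sigma)

print(raw_sigma, density)
```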

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn

    Hi @ikayunida123,

     

    Have you set the number of digits after the decimal point to 3 or more in Excel?

     

    Regards,

     

    Lionel

  • ikayunida123 Member Posts: 17 Contributor II

    @lionelderkrikor Yes, I have set the cell format to Number and added several decimal places, but the result is still the same. I looked up the formula on other websites, and people say that MS Excel uses Bessel's correction to calculate the standard deviation.
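
For reference, Bessel's correction just means dividing by n-1 instead of n. In Excel, STDEV and STDEV.S use the n-1 version, while STDEV.P uses n. The short Python sketch below, using hypothetical values, shows how far apart the two results can be for a small sample, which may help in checking which version each tool is reporting.

```python
import math
import statistics

# Hypothetical sample of 8 values.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(values)

# Population standard deviation: divides by n (like Excel's STDEV.P).
population_sd = statistics.pstdev(values)

# Sample standard deviation: divides by n-1, i.e. Bessel's correction
# (like Excel's STDEV.S).
sample_sd = statistics.stdev(values)

# The two differ exactly by a factor of sqrt(n / (n - 1)).
ratio = sample_sd / population_sd

print(population_sd, sample_sd, ratio)
```

Note that both versions give 0 when all values are identical, so the choice of denominator alone would not turn a 0 into a non-zero value.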
