Anova and Anova matrix

Pennoca Member Posts: 6 Contributor II
edited November 2018 in Help
Hi All,



I was playing with the Grouped ANOVA and ANOVA Matrix operators. I cannot see where the differences are using these operators, only that a difference exists. How can I get RapidMiner to tell me where the differences are, and the p-values for those differences, in a data set containing both polynominal and numeric attributes?



Many thanks, Pennoca.

Answers

  • yyhuang Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 364 RM Data Scientist

    Hi Pennoca,

     

    Do you have any sample data for the ANOVA test?

    The built-in tutorial process for 'Grouped ANOVA' (screenshot: anova.PNG) gives an example of a two-sample t-test on the Golf data: it compares the mean of the numeric attribute (Humidity) across the groups defined by the factor Play (yes or no).

    Please note that the grouping attribute should be nominal and the ANOVA attribute should be numerical.
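To make the "nominal grouping attribute, numeric ANOVA attribute" setup concrete, here is a minimal Python sketch of the one-way ANOVA F statistic using only the standard library. The function name `one_way_anova_f` and the humidity values are hypothetical (they are not the Golf data), and this is not a description of how the RapidMiner operator is implemented.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a dict {group label: [values]}."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Variation of the observations around their own group mean
    ss_within = sum(
        (v - mean(values)) ** 2
        for values in groups.values()
        for v in values
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical toy values for illustration only (NOT the Golf data)
humidity_by_play = {
    "yes": [70.0, 80.0, 75.0, 65.0],
    "no": [85.0, 90.0, 95.0, 80.0],
}

f_stat, df_b, df_w = one_way_anova_f(humidity_by_play)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With only two groups, this F value equals the square of the pooled-variance two-sample t statistic. On its own, the F statistic only indicates whether some difference between group means exists, not which groups differ.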

     

    The difference between 'Grouped ANOVA' and 'ANOVA Matrix' is also explained in the help view (screenshot: help-anova.PNG).

    If you are more interested in other traditional statistical hypothesis tests, such as one-way ANOVA or MANOVA, I would suggest using 'R scripting' (install the R Scripting extension from the RapidMiner Marketplace for free) to combine the power of R with RapidMiner. Quick reference

     

     

  • Pennoca Member Posts: 6 Contributor II

    Hi yyhuang,

     

    Thank you for your answer.

    I have R installed, but I am still learning how to code, so I was hoping RapidMiner could be an alternative.

     

    I have included a data set. Basically, I want to compare the mean concentration across all possible pairs of groups within a metabolite. For example, Ala Control 0h vs Ala Control 24h, or Ala Control 24h vs Ala Treated 24h.

     

    I do not wish to compare across metabolites, for example Ala vs Cit.

     

    I know R can do this easily.

     

    Many thanks, Best regards, Pennoca.

     

    Metabolites  Treatment  Time points  Concentration
    Ala          Control    0h           1
    Ala          Control    0h           2
    Ala          Control    0h           3
    Ala          Treated    0h           1
    Ala          Treated    0h           1
    Ala          Treated    0h           2
    Ala          Control    24h          1
    Ala          Control    24h          1
    Ala          Control    24h          4
    Ala          Treated    24h          8
    Ala          Treated    24h          9
    Ala          Treated    24h          10
    Cit          Control    0h           1
    Cit          Control    0h           2
    Cit          Control    0h           3
    Cit          Treated    0h           1
    Cit          Treated    0h           1
    Cit          Treated    0h           2
    Cit          Control    24h          1
    Cit          Control    24h          1
    Cit          Control    24h          4
    Cit          Treated    24h          8
    Cit          Treated    24h          9
    Cit          Treated    24h          10
  • Pennoca Member Posts: 6 Contributor II

    One more question:

     

    Can we use the ggplot package in RapidMiner to make R visualizations?

     

    Thanks

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist

    Dear Pennoca,

     

    Sure. I am more of a Python guy and use Python's matplotlib for visualization, but you can do the same thing with R. Just install the R extension and use Execute R.
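For the pairwise comparisons asked about earlier in this thread, here is a rough Python sketch that could be adapted for a scripting operator. It uses only the standard library and the values from the posted data set. Note the assumptions: it swaps in an exact two-sided permutation test on the difference in means (so no distribution tables are needed), which is not what the ANOVA operators or R's usual post-hoc tests compute, and the helper name `exact_permutation_p` is made up for this example.

```python
from itertools import combinations
from statistics import mean

# Values copied from the data set posted in this thread,
# keyed by metabolite and then by (treatment, time point).
concentrations = {
    "Ala": {
        ("Control", "0h"): [1, 2, 3],
        ("Treated", "0h"): [1, 1, 2],
        ("Control", "24h"): [1, 1, 4],
        ("Treated", "24h"): [8, 9, 10],
    },
    "Cit": {
        ("Control", "0h"): [1, 2, 3],
        ("Treated", "0h"): [1, 1, 2],
        ("Control", "24h"): [1, 1, 4],
        ("Treated", "24h"): [8, 9, 10],
    },
}


def exact_permutation_p(a, b):
    """Two-sided exact permutation p-value for the difference in means."""
    pooled = list(a) + list(b)
    observed = abs(mean(a) - mean(b))
    indices = range(len(pooled))
    hits = total = 0
    # Enumerate every way of relabelling len(a) of the pooled values as group A
    for chosen in combinations(indices, len(a)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point ties
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            hits += 1
    return hits / total


# Compare every pair of groups, but only within the same metabolite
results = []
for metabolite, groups in concentrations.items():
    for (g1, v1), (g2, v2) in combinations(groups.items(), 2):
        diff = mean(v1) - mean(v2)
        results.append((metabolite, g1, g2, diff, exact_permutation_p(v1, v2)))

for metabolite, g1, g2, diff, p in results:
    print(f"{metabolite}: {' '.join(g1)} vs {' '.join(g2)}  "
          f"diff={diff:+.2f}  p={p:.2f}")
```

Two caveats: the p-values are not adjusted for multiple comparisons, and with three replicates per group there are only 20 possible relabellings, so this test cannot report a two-sided p-value below 0.1. A parametric post-hoc test in R or Python would give finer p-values at the cost of distributional assumptions.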

     

    The blog article I wrote last month on Hearthstone uses matplotlib. You can download the data and processes from the rapidminer.com blog.

     

    ~martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany