
T-Test

sudheendra Member Posts: 22 Maven
edited November 2018 in Help
Hi All,

How do I get started with the T-Test and Anova operators? I would like to use performance vectors for my sample dataset. Which one is better to start with?

Regards
Sekhar

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    you have to provide two performance vectors and then apply the T-Test or Anova operator to them. It will give you the probability that they are not significantly different.

    Greetings,
      Sebastian
  • crappy_viking Member Posts: 16 Maven
    Hi,

    About tests, would you have a look here? http://www.statsnetbase.com/ejournals/books/book_summary/summary.asp?id=1658
    The "Henry" normality test seems interesting because it can help in deciding whether to use neural models or linear models. If you know other normality tests, let me know...

    C.V.
  • AnneG Member Posts: 8 Contributor II
    Hello there,
    I guess this question fits in here ... For quite a long time I have been trying to figure out how to perform t-tests on my data. I hope you can give me a hint on this. I have three classes and I want to compare (in a pairwise manner) each attribute in one class with the same attribute in another class. So, is there a significant difference between class 1 and 2 concerning attribute A, and so on. Thus, this is actually not a comparison of performances, but I know that I need performance vectors as input for the t-test operator. However, I cannot think of a way of getting from my example set to such a performance vector.

    Greetings,
    Anne
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    we have the AnovaMatrix doing something similar. It calculates the significance of difference between the values of all numerical attributes, based upon a grouping defined by all nominal attributes.

    Might this help?

    Greetings,
      Sebastian
  • AnneG Member Posts: 8 Contributor II
    Hello,

    yes, ANOVA is alright. However, as soon as there are more than two classes to compare, ANOVA does not tell you WHERE the differences lie. Say we have attribute X, where classes 1 and 2 are different, but there is no difference between 1 and 3 or between 2 and 3. That is why one would need post hoc tests such as Scheffé or Bonferroni.
    But I see that the t-test and ANOVA operators in RapidMiner are not meant to be used like this. In any case, your tool is still great ;-)

    Kind Regards,
    Anne
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    feel free to implement such an operator. I will be happy to put it into the core and provide its functionality to all users. It shouldn't be too much of a problem if you know the algorithm.

    Greetings,
      Sebastian
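
For anyone who wants to try the pairwise comparison Anne describes outside RapidMiner, here is a minimal Python sketch of pairwise t-tests on one attribute with a Bonferroni correction. It is not a RapidMiner operator and says nothing about how AnovaMatrix works internally. Several things are assumptions made for illustration: the function names (`welch_t_test`, `pairwise_bonferroni`), the sample values, the choice of Welch's unequal-variance t-test, and the p-value computed by numerically integrating the t density so that only the standard library is needed. Each group must have at least two values and the two groups in a pair must not both have zero variance.

```python
import math
from itertools import combinations
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df):
    """Two-sided p-value via Simpson's rule on the density over [0, |t|]."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    steps = 2 * max(500, math.ceil(50 * upper))  # even count, step <= 0.01
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2.0 * total * h / 3.0  # P(-|t| < T < |t|)
    return max(0.0, 1.0 - central_mass)


def welch_t_test(a, b):
    """Welch's t statistic, its degrees of freedom, and the two-sided p-value."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df, two_sided_p(t, df)


def pairwise_bonferroni(groups):
    """Test every pair of groups; multiply each p-value by the number of pairs."""
    pairs = list(combinations(sorted(groups), 2))
    results = {}
    for g1, g2 in pairs:
        t, _, p = welch_t_test(groups[g1], groups[g2])
        results[(g1, g2)] = (t, min(1.0, p * len(pairs)))
    return results


# Made-up values of one attribute X, grouped by class label.
samples = {
    "class 1": [5.1, 4.9, 5.3, 5.0, 5.2],
    "class 2": [6.0, 6.2, 5.9, 6.1, 6.3],
    "class 3": [5.2, 6.1, 5.6, 5.0, 6.2],
}

for pair, (t, p_adj) in pairwise_bonferroni(samples).items():
    print(pair, f"t = {t:.3f}", f"Bonferroni-adjusted p = {p_adj:.4f}")
```

Bonferroni is the simplest of the post hoc corrections mentioned above and tends to be conservative; Scheffé's method would need a different calculation. To cover every attribute, the same function could be called once per numerical column.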
