correlation

Thiru Member Posts: 76  Guru
Hi all,

I'm working on a classification problem and would like to know whether there is any correlation between rows.

The application is classifying the fault status of a machine.

Rows with label 1 represent the fault-free state of the machine, and rows with label 2 represent the faulty state. If these rows are correlated, my classification algorithm (e.g. kNN) does not classify them well.

So I would like to calculate the correlation between them, to study why the selected algorithm performs poorly.

Thanks,

Thiru


Answers

  • David_A Administrator, Moderator, Employee, RMResearcher, Member Posts: 251  RM Research
    Hi @Thiru,

    Taking a look at the correlation is a very good starting point.
    There are several operators that can calculate different correlation measures, for example:
    • Correlation Matrix, to give you an overview of the correlation between all attributes
    • Weight by Correlation, to get the correlation between attributes and a label
    • AutoCorrelation, if you think there might be a time-dependent component in your data
    Also, AutoModel already takes care of correlations and might give you some hints about where to look.
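Outside of RapidMiner, the first two ideas in the list above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the attribute names, the toy values and the 0/1 label encoding are made up, and the numbers RapidMiner's operators report may be normalized differently from the raw Pearson coefficient used here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined for a constant sequence
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: one list per attribute (column).
columns = {
    "vibration": [0.1, 0.2, 0.15, 0.9, 1.1, 1.0],
    "temperature": [40.0, 42.0, 41.0, 60.0, 65.0, 63.0],
    "load": [5.0, 3.0, 4.0, 4.0, 5.0, 3.0],
}
# Assumed encoding: 0 = fault free, 1 = faulty.
label = [0, 0, 0, 1, 1, 1]


def correlation_matrix(cols):
    """Pairwise correlation between all attributes."""
    names = list(cols)
    return {a: {b: pearson(cols[a], cols[b]) for b in names} for a in names}


def label_correlations(cols, lab):
    """Correlation of each attribute with the (numerically encoded) label."""
    return {name: pearson(values, lab) for name, values in cols.items()}


matrix = correlation_matrix(columns)
weights = label_correlations(columns, label)

for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")
```

In this toy data, an attribute whose correlation with the label is close to zero (here `load`) carries little linear information about the fault state, which is the kind of hint such a table can give.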

    Best,
    David
  • Thiru Member Posts: 76  Guru
    David_A,

    Thanks for your reply. However, my query was with respect to rows.
    To explain further: how can I differentiate the values of rows in the faulty state from those of rows in the fault-free state?

    I suspect that in one of my cases the algorithm is not working
    because of the correlation between the attribute values for class 1 and
    the values of the same attributes for class 2. Is there any way to calculate this?

    Regards,
    Thiru


  • BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 474   Unicorn
    Hi Thiru,

    Correlation, the way we understand it, is a two-dimensional concept: attribute (column) A and attribute (column) B are said to be correlated if the values of A and B in the same rows vary together (for a positive correlation: if A is large, B tends to be large).

    It would be possible to flip the axes and calculate the correlation that way, but I'm not sure if this is what you mean. If it is, select the two examples (rows), e.g. with Filter Examples or Filter Example Range, and use Transpose to turn the rows into columns.
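For illustration, the "flip the axes" idea can be sketched in plain Python rather than as a RapidMiner process. The table values below are hypothetical, and `row_correlation` is a helper name made up for this sketch. After transposing, each former row becomes a column, and the usual column-wise Pearson correlation is applied to two of those columns.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined for a constant sequence
    return cov / math.sqrt(var_x * var_y)


def transpose(table):
    """Turn rows into columns (and columns into rows)."""
    return [list(col) for col in zip(*table)]


def row_correlation(table, i, j):
    """Correlation between examples (rows) i and j, computed via transposition."""
    transposed = transpose(table)            # one record per former attribute
    col_i = [record[i] for record in transposed]
    col_j = [record[j] for record in transposed]
    return pearson(col_i, col_j)


# Hypothetical examples: each inner list is one row of attribute values.
table = [
    [0.10, 0.40, 0.50, 0.20],  # assumed to be a fault-free example
    [0.90, 0.60, 0.40, 0.80],  # assumed to be a faulty example
]

print(round(row_correlation(table, 0, 1), 3))
```

One caveat with this approach: if the attributes are on very different scales, the largest one dominates the row-wise correlation, so normalizing the attributes first is probably sensible.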

    Regards,
    Balázs