
How to calculate multiple correlation matrices by looping through subset (based on feature values)

1234567 Member Posts: 1 Learner III
edited November 2018 in Help

Dear community,

 

I have a data set that contains student names, their marks in different courses, and the number of times they showed up for classes, for example:

 

Max Mustermann | Database Technology | 2.3 | 10

Max Mustermann | Computer Science Intro | 1.3 | 12

Maria Musterfrau | Computer Science Intro | 2.0 | 13

...

 

So Max showed up 10 times for Database Technology and got a final mark of 2.3.

 

Now, for each student in the data set, I want to calculate the correlation coefficient between mark and number of times attended.

 

The output should look like this:

 

Student name | Correlation coefficient between mark and number of times attended (considering all courses the respective student took)

Max Mustermann | 0,711

Maria Musterfrau | -0,312

...

 

Everything works fine, except that I don't know how best to do the looping and produce the final table (e.g. using "correlation matrix").

 

I would appreciate your help. Thanks in advance and kind regards.

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    So your loop calculates the average and the number of times attended for each student? Then take the resulting ExampleSet and connect a Correlation Matrix operator to it. Output the MAT port.
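
    For comparison, the per-student table asked for in the question can be sketched outside RapidMiner. The following is only a rough illustration in Python with pandas, not a RapidMiner process: the column names (student, course, mark, attended), the helper per_student_correlation, and all sample rows beyond the three quoted in the question are assumptions made up for the example. It groups the rows by student and computes the Pearson correlation between mark and attendance within each group.

```python
import pandas as pd

# Sample data in the shape described in the question.
# The first two Max rows and the first Maria row come from the post;
# the remaining rows and all column names are invented for illustration.
data = pd.DataFrame(
    {
        "student": [
            "Max Mustermann", "Max Mustermann", "Max Mustermann",
            "Maria Musterfrau", "Maria Musterfrau", "Maria Musterfrau",
        ],
        "course": [
            "Database Technology", "Computer Science Intro", "Course A",
            "Computer Science Intro", "Course A", "Course B",
        ],
        "mark": [2.3, 1.3, 1.7, 2.0, 3.0, 1.0],
        "attended": [10, 12, 11, 13, 9, 14],
    }
)


def per_student_correlation(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per student with the Pearson correlation
    between 'mark' and 'attended' across that student's courses."""
    rows = []
    # Loop over the subsets defined by the student name.
    for name, group in df.groupby("student", sort=False):
        r = group["mark"].corr(group["attended"])  # Pearson by default
        rows.append({"student": name, "correlation": r})
    return pd.DataFrame(rows)


result = per_student_correlation(data)
print(result)
```

    Each student needs at least two courses for the coefficient to be defined; with a single row, pandas is expected to report NaN rather than a number. The same loop-over-subsets idea is what a per-student loop in RapidMiner would have to reproduce before the results are appended into one table.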
