# Correlation Matrix

Hi,

I have a large data set with many attributes.

I would like to see how closely the attributes are correlated, but because of the sheer number of them I'm only interested in attributes that are correlated at about 40% or more.

Is there a way to do this, for example using a filter of some description? I know you can remove correlated attributes and select by weights, but those are not what I need, as I'm interested in the high correlations.

Thank you for your time

## Answers

458 Unicorn

There are options like "top k" and "top p%" in the Select by Weights operator that might help.

regards

Andrew

28 Contributor II

Thanks for the quick reply. I ran it this morning, but I don't think this is what I'm looking for.

What I need is the pairwise table, so I can specifically say there is a 50% correlation between Attribute A and B but a negative correlation between A and C.

Do you know if you can filter the actual matrix?

Thanks

28 Contributor II

Is there perhaps a method to export the pairwise table to a CSV file, or to generate a report based on it?

Has anyone tried it before?

If it were in a database, it would be a simple case of selecting the rows where the correlation is above a certain amount.

Thanks

458 Unicorn

A Groovy script would be able to do it. I could probably do that in return for beer or money ;D

Alternatively, I'm having a think about the possibility of calculating the correlation in a process without using the built-in operators. That would let you make an example set that could be filtered as you like.

regards

Andrew
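The idea of calculating the correlations yourself, so that the result is an ordinary table that can be filtered like a database query, can be sketched in Python with pandas. This is only an illustration under assumptions: it is not the Groovy script or the RapidMiner process discussed in this thread, the function name `pairwise_correlations`, the column names and the toy data are made up, and it assumes numeric attributes and Pearson correlation.

```python
import numpy as np
import pandas as pd


def pairwise_correlations(df: pd.DataFrame, threshold: float = 0.5) -> pd.DataFrame:
    """Return one row per attribute pair whose |correlation| >= threshold.

    Hypothetical helper for illustration; assumes all columns are numeric.
    """
    corr = df.corr()  # Pearson correlation matrix, attributes x attributes

    # Keep only the strict upper triangle so each pair appears once
    # and the trivial self-correlations on the diagonal are excluded.
    upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(upper).stack().reset_index()
    pairs.columns = ["attribute_1", "attribute_2", "correlation"]

    # Filter on the absolute value so strong negative correlations are kept too.
    # Undefined (NaN) correlations fail this comparison and drop out.
    strong = pairs[pairs["correlation"].abs() >= threshold]
    return strong.sort_values("correlation", ascending=False).reset_index(drop=True)


# Toy data: B rises with A, C falls with A, D is uncorrelated with the rest.
demo = pd.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "B": [2, 4, 6, 8, 10],
        "C": [5, 4, 3, 2, 1],
        "D": [1, -1, 0, -1, 1],
    }
)
strong = pairwise_correlations(demo, threshold=0.5)
print(strong)
```

The result is a long "pairwise" table with one row per attribute pair, so it can be written out with `strong.to_csv(...)` or filtered further in the same way as rows in a database.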

28 Contributor II

But unfortunately, it doesn't provide a pairwise table, and the matrix in question covers 5000 attributes, so exporting it to Excel means cutting off a good portion of it.

I'll keep the beer money in mind of course, as soon as the next pay check comes around.

1,869 Unicorn

Have a look at the process below:

28 Contributor II

This works.

However, I have one last problem in relation to this.

My pairwise table is going to generate roughly 25 million rows, which is not exportable using a report.

Is there any way to filter the matrix/pairwise table so that only attributes with a certain correlation are exported, for example returning only attributes with 50% or more correlation?

Thanks
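One way to avoid ever building the full 25-million-row table is to compute the correlations in blocks and write only the pairs that pass the threshold. The following is a minimal sketch in Python with NumPy, assuming the data can be exported from RapidMiner as a numeric table; the function name `export_strong_pairs`, its parameters and the output column names are hypothetical, and it is not the process posted above.

```python
import csv

import numpy as np


def export_strong_pairs(data, names, path, threshold=0.5, block=500):
    """Write attribute pairs with |Pearson correlation| >= threshold to a CSV.

    Hypothetical helper for illustration. `data` is a 2-D array with examples
    in rows and attributes in columns; `names` holds the attribute names.
    Returns the number of pairs written.
    """
    x = np.asarray(data, dtype=float)

    # Centre each column and scale it to unit length, so that the dot product
    # of two columns equals their Pearson correlation.
    x = x - x.mean(axis=0)
    norms = np.sqrt((x ** 2).sum(axis=0))
    norms[norms == 0] = np.nan  # constant attributes: correlation undefined
    z = x / norms

    n_attr = z.shape[1]
    written = 0
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["attribute_1", "attribute_2", "correlation"])
        for start in range(0, n_attr, block):
            stop = min(start + block, n_attr)
            # Correlations of attributes [start, stop) against all attributes.
            corr_block = z[:, start:stop].T @ z
            hits = np.nonzero(np.abs(corr_block) >= threshold)
            for i_local, j in zip(*hits):
                i = start + i_local
                if j > i:  # write each pair once, skip self-correlations
                    writer.writerow([names[i], names[j], float(corr_block[i_local, j])])
                    written += 1
    return written
```

A call such as `export_strong_pairs(values, names, "strong_pairs.csv", threshold=0.5)` would then produce a file containing only the strongly correlated pairs, positive or negative, with memory use governed by the `block` size rather than by the full pairwise table.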