RapidMiner

Find correlation between names and score

Newbie dbzyko


Hello guys,

 

I'm new to data mining as well as RapidMiner and hope you can help me with my task :)

 

First, some details about my input data:

I've got an Excel table with the following structure:

document_id,name_0,name_1,name_n,score

1234,0,0,1,50.1

1235,1,1,1,70.9

1236,0,0,0,20.5

 

The id is a unique number. Each name column indicates whether name_i occurs in the document (1) or not (0); the column label is the name of the person. The last column is the score of the document. As you can see, each row of the Excel file looks like a vector.

 

My goal is to find a correlation between the names (nominal attributes) and the score (numeric). In other words, I want to know whether the score of a document tends to be higher if name_0 or name_1 (or name_i) occurs in the document.

 

When I search RapidMiner for "correlation", the Correlation Matrix operator appears, but I'm not sure whether it is the right tool for this task.

 

Do you know of any established practices for handling this task correctly?

Thank you very much :)

 

1 REPLY
Community Manager

Re: Find correlation between names and score

Hello @dbzyko, I think Correlation Matrix is a good place to start. It will give you r (or r^2) values for each pair of features, including each name column paired with the score, which should give you a sense of things.
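To make the idea concrete outside RapidMiner, here is a minimal Python sketch of what a pairwise r between each 0/1 name column and the score looks like. It is not the Correlation Matrix operator itself; the helper names `pearson_r` and `name_score_correlations` are made up for illustration, and the rows are just the three sample rows from the question. For a 0/1 column, Pearson's r is the same quantity as the point-biserial correlation, so a positive r means documents containing that name tend to score higher.

```python
import math

# Sample rows from the question: (document_id, name_0, name_1, name_n, score)
rows = [
    (1234, 0, 0, 1, 50.1),
    (1235, 1, 1, 1, 70.9),
    (1236, 0, 0, 0, 20.5),
]
name_columns = ["name_0", "name_1", "name_n"]


def pearson_r(xs, ys):
    """Pearson correlation coefficient; None if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A name that occurs in every document (or in none) has no variance,
        # so r is undefined for it.
        return None
    return cov / math.sqrt(var_x * var_y)


def name_score_correlations(rows, name_columns):
    """Correlate each 0/1 name column with the score (the last field)."""
    scores = [row[-1] for row in rows]
    result = {}
    # Column 0 is document_id, so the name columns start at index 1.
    for index, name in enumerate(name_columns, start=1):
        flags = [row[index] for row in rows]
        result[name] = pearson_r(flags, scores)
    return result


correlations = name_score_correlations(rows, name_columns)
for name, r in correlations.items():
    print(name, r)
```

With only three rows the numbers are not meaningful; the sketch only shows the mechanics. On the real table, names that appear in very few documents will give unstable r values, so it is worth checking how often each name occurs before reading much into its correlation.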


Scott

 

Scott Genzer
Senior Community Manager
RapidMiner, Inc.