How can I use exploratory data analysis for maximum and minimum values, standard deviation, and variable selection?
I am confused about how to explore the key characteristics of each variable in the housing.csv data set, such as maximum and minimum values, average, standard deviation, most frequent values (mode), missing values, and invalid values. I also need to discuss the key results of the exploratory data analysis presented in a table and provide a rationale for selecting the top 5 variables for predicting median house value (medv), focusing in particular on the relationships of the independent variables with each other and with the dependent variable medv, drawing on the results of the EDA and the relevant literature on determinants of house prices.
Best Answer
hbajpai Member Posts: 102
Hey @u1125362,
You can use RapidMiner's Correlation Matrix operator to visualize the relationships among the attributes and between each attribute and the label. It looks like the figure below.
As far as selecting the top 5 variables is concerned, you can train a couple of different models and use the Explain Predictions operator to see model-specific dependencies on the attributes. Another way would be to use the Weight by Correlation operator, shown in the figure below. There are other weight-based operators you can experiment with in Studio.
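If you want to sanity-check the idea outside RapidMiner, here is a rough Python sketch of what weighting by correlation amounts to: compute the Pearson correlation of each attribute with the label and sort by absolute value. This is not the operator's actual implementation, only an illustration. The helper names `pearson` and `rank_by_correlation` are made up for this example, the column names other than medv are assumptions, and the numbers are toy values, not taken from housing.csv.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rank_by_correlation(columns, label, top_n=5):
    """Return the top_n (attribute, r) pairs sorted by |r| with the label."""
    weights = {
        name: pearson(values, columns[label])
        for name, values in columns.items()
        if name != label
    }
    ranked = sorted(weights.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:top_n]


# Toy values invented for illustration only (not real housing.csv data).
# "rm", "lstat" and "noise" are assumed column names.
toy = {
    "rm": [6.0, 6.5, 7.0, 5.5, 6.8],
    "lstat": [12.0, 9.0, 4.0, 18.0, 6.0],
    "noise": [1.0, 3.0, 2.0, 3.0, 1.0],
    "medv": [21.0, 24.0, 34.0, 16.0, 30.0],
}

for name, r in rank_by_correlation(toy, "medv", top_n=2):
    print(f"{name}: {r:+.3f}")
```

The same `pearson` helper can be applied to pairs of attributes: if two candidates are strongly correlated with each other, they carry overlapping information, which is worth mentioning when you justify your top 5.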
As far as summary statistics go, you can check them in the Statistics tab for your imported raw data.
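For reference, the per-column figures you are asked to report (minimum, maximum, average, standard deviation, mode, missing and invalid counts) can also be sketched in a few lines of Python. This is only an illustration under assumptions: `summarize` is a hypothetical helper, `None` stands in for a missing value, the validity rule (house values must be positive) is my assumption, and the numbers are toy values rather than real housing.csv data.

```python
import statistics


def summarize(values, valid=lambda v: True):
    """Summarize one numeric column; None marks a missing value."""
    present = [v for v in values if v is not None]
    good = [v for v in present if valid(v)]
    return {
        "min": min(good),
        "max": max(good),
        "mean": statistics.mean(good),
        "std": statistics.stdev(good),       # sample standard deviation
        "mode": statistics.multimode(good),  # a list, since ties are possible
        "missing": len(values) - len(present),
        "invalid": len(present) - len(good),
    }


# Toy column invented for illustration (not real housing.csv values).
# Assumed validity rule: a median house value must be positive.
medv = [21.0, 24.0, None, 24.0, -1.0, 30.0]
stats = summarize(medv, valid=lambda v: v > 0)

for key, value in stats.items():
    print(key, value)
```

Missing and invalid values are excluded before the other statistics are computed, so one bad entry does not distort the minimum or the mean.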
I hope this helps.
Best,
Harshit