Hello, I have a RapidMiner homework assignment. Can anybody help me?
You will use the following process:
1. Based on the training dataset, create a training sample and a validation sample by splitting the data into 2 groups. Steps 2-5 below will then be performed on the training and the validation data.
2. Set up the dependent variable.
3. Make a preliminary assessment of the relative importance of the explanatory variables using visualization tools and simple descriptive statistics.
4. Estimate the classification model using the training data, and interpret the results.
5. Assess the accuracy of classification with the validation sample, possibly repeating steps 2-5 a few times, changing the classifier in different ways to increase performance.
6. Finally, score each observation of the scoring dataset and determine the list of applicants with a good credit risk (probability equal to or higher than 0.80) that the marketing department will be able to contact.
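
To make the process concrete, here is how I understand it, written as a rough sketch in plain Python rather than RapidMiner. Everything in it is an assumption for illustration only: the two features (`income`, `debt`), the made-up data, the 70/30 split, the choice of logistic regression as the classifier, and all function names. The assignment itself does not specify any of these, and a real solution would use RapidMiner operators instead.

```python
import math
import random


def split(rows, train_fraction=0.7, seed=42):
    """Shuffle the labelled rows and split them into training and validation samples."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, features):
    """Probability of the 'good credit risk' class; weights[0] is the intercept."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return sigmoid(z)


def fit_logistic(rows, epochs=1000, lr=0.5):
    """Estimate a logistic regression by batch gradient descent."""
    n_features = len(rows[0][0])
    weights = [0.0] * (n_features + 1)
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for features, label in rows:
            error = predict_proba(weights, features) - label
            grad[0] += error
            for j, x in enumerate(features):
                grad[j + 1] += error * x
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def accuracy(weights, rows, cutoff=0.5):
    """Share of rows whose predicted class matches the true label."""
    hits = sum((predict_proba(weights, f) >= cutoff) == bool(y) for f, y in rows)
    return hits / len(rows)


def select_good_risks(weights, applicants, threshold=0.80):
    """Score each applicant and keep those with probability >= threshold."""
    scored = [(applicant_id, predict_proba(weights, f)) for applicant_id, f in applicants]
    return [(applicant_id, p) for applicant_id, p in scored if p >= threshold]


# Made-up labelled data: label 1 = good credit risk (the dependent variable).
rng = random.Random(0)
labelled = []
for _ in range(200):
    income, debt = rng.random(), rng.random()
    label = 1 if income - debt + rng.gauss(0, 0.1) > 0 else 0
    labelled.append(((income, debt), label))

train, valid = split(labelled)              # step 1
weights = fit_logistic(train)               # step 4
valid_accuracy = accuracy(weights, valid)   # step 5
print("validation accuracy:", round(valid_accuracy, 3))

# Made-up scoring dataset: (applicant id, features), no labels.
scoring = [("A1", (0.9, 0.1)), ("A2", (0.5, 0.5)), ("A3", (0.2, 0.8))]
contact_list = select_good_risks(weights, scoring)  # step 6
print(contact_list)
```

If I read the assignment correctly, the parts that matter are that the model is estimated only on the training sample, accuracy is checked only on the validation sample, and the 0.80 cutoff is applied to the predicted probability of the good-risk class in the scoring dataset.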