Generating new data out of modeling results
Hi,
I am a new user of RapidMiner and have little experience in data mining. I am looking for a way to select, let's say, the best 20% of a customer base, based on the results of a decision tree, a neural net, or a similar model.
Is there an operator that can write a new table based on the result of a previous modeling operator? Ideally, the operator would add something like a scoring attribute, so that I can generate a new table and manually select the top-rated rows.
I would be grateful for any help.
Answers
In cases like this, the KNIME decision tree gives me the option to append columns with the normalized class distribution to the table, so that I can pick my best 20% from it. In contrast, the decision tree operator of RapidMiner delivers only a tree, which doesn't solve my problem.
In RapidMiner we separate the steps of model creation (training) and model application. From your description, it seems that so far you have only done the training step, which results in a decision tree. Now you can apply that decision tree to new data.
The result will be an example set with three additional attributes: the prediction (e.g. true or false) and the so-called confidences, one for each of the two classes. A confidence measures how sure the model is that the input data belongs to a certain class.
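Once the confidences are in the table, picking the best 20% comes down to sorting by confidence and cutting off the list. The following is only a minimal sketch of that idea in plain Python, outside RapidMiner; the column names (customer_id, confidence_true) and the sample values are made up for illustration.

```python
def top_fraction(scored_rows, confidence_key, fraction=0.2):
    """Return the rows with the highest confidence, best first."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(scored_rows, key=lambda row: row[confidence_key], reverse=True)
    # Round down, but always keep at least one row.
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]

# Hypothetical scored example set: one row per customer.
confidences = [0.35, 0.91, 0.12, 0.78, 0.55, 0.97, 0.40, 0.08, 0.66, 0.23]
scored = [
    {"customer_id": i, "confidence_true": c}
    for i, c in enumerate(confidences, start=1)
]

best = top_fraction(scored, "confidence_true", fraction=0.2)
best_ids = [row["customer_id"] for row in best]
print(best_ids)
```

With ten customers and a fraction of 0.2, the two customers with the highest confidence for the class "true" are returned.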
Please have a look at the attached process for a basic example of model training and application. In addition to the aforementioned operators, the process uses the Split operator to divide the input data into a set for training and a set for application.
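For readers without the attachment, the split / train / apply flow can be sketched in plain Python. This is not the attached process and not RapidMiner's implementation: the "model" is a hypothetical one-level tree that only looks at a single attribute, and the attribute names (plan, buys) and data are invented for illustration.

```python
from collections import Counter, defaultdict

def train_stump(rows, attribute, label):
    """Training step: count the class labels seen for each value of one attribute."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[attribute]][row[label]] += 1
    return {"attribute": attribute, "counts": dict(counts)}

def apply_model(model, rows):
    """Application step: append a prediction and two confidences to every row."""
    scored = []
    for row in rows:
        leaf = model["counts"].get(row[model["attribute"]], Counter())
        total = sum(leaf.values())
        # Confidence = share of training rows in this leaf labeled "true";
        # fall back to 0.5 for attribute values never seen in training.
        conf_true = leaf["true"] / total if total else 0.5
        scored.append({
            **row,
            "prediction": "true" if conf_true >= 0.5 else "false",
            "confidence_true": conf_true,
            "confidence_false": 1.0 - conf_true,
        })
    return scored

# Hypothetical input data with a two-class label "buys".
data = [
    {"plan": "premium", "buys": "true"},
    {"plan": "premium", "buys": "true"},
    {"plan": "premium", "buys": "false"},
    {"plan": "basic", "buys": "false"},
    {"plan": "basic", "buys": "false"},
    {"plan": "basic", "buys": "true"},
    {"plan": "premium", "buys": "true"},
    {"plan": "basic", "buys": "false"},
]

# Split: the first six rows for training, the remaining two for application.
training, application = data[:6], data[6:]

model = train_stump(training, attribute="plan", label="buys")
scored = apply_model(model, application)
for row in scored:
    print(row)
```

Each scored row carries the three extra attributes described above: the prediction plus one confidence per class.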
For a deeper understanding of RapidMiner's concepts, I would like to direct your attention to our video tutorials and other documentation resources on our website at http://rapid-i.com . You'll find all the documentation in the Documentation menu at the top of the website.
Best regards,
Marius
Thank you very much.
I have another problem. In my example code I discretize attribute 1 into 10 bins. If I switch to the results view and open the bar chart with these 10 bins on the x-axis, the bars are listed in alphabetical order (range1, range10, range2, ...). If I use the replace operator to rename the values manually, e.g. range1 to 01, it also turns range10 into 010 :-[
Is there a way to put the bars into the correct order? Or better: is there a way to name the bins the way I want right from the start (as I can in KNIME ;D)?
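The range10 to 010 effect comes from matching "range1" as a substring. One way around it is to match the whole value and zero-pad the number, so that alphabetical order and numeric order agree. Below is a minimal sketch of that idea in plain Python, not a RapidMiner process; the helper name pad_bin_name is made up. If the replace operator accepts regular expressions, a pattern anchored to the whole value should express the same idea there, but that is an assumption not verified here.

```python
import re

def pad_bin_name(name, width=2):
    """Turn 'range1' into 'range01'; leave names of any other shape untouched."""
    match = re.fullmatch(r"range(\d+)", name)
    if match is None:
        return name
    return f"range{int(match.group(1)):0{width}d}"

bins = [f"range{i}" for i in range(1, 11)]

# Plain alphabetical sorting puts range10 directly after range1.
alphabetical = sorted(bins)
print(alphabetical)

# After zero-padding, alphabetical order matches numeric order.
padded = sorted(pad_bin_name(name) for name in bins)
print(padded)
```

Because the pattern has to match the complete value, range1 and range10 are handled separately and become range01 and range10.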