"Decision tree shows only the label"
Hi,
I have a B2B sales data set that I am trying to explore with a "Decision Tree". The attributes are: Country (Polynominal), State (Polynominal), DaysInSalePhase (integer), MonthlySales (integer), Deal (Binominal). I set the "Deal" attribute (a "Won" or "Lost" column) as the label. But the decision tree almost never shows attributes such as Country or State and focuses only on the integer values. If I want to see anything in the decision tree, I have to disable all the pruning options (which is not ideal, is it?), and most of the time the decision tree is just a single box showing the number of "Won" and "Lost".
Any idea what I am doing wrong? Is my data not good enough?
David
Answers
You are on the right track.
The decision tree only performs a split, i.e. inserts a node, if it can find an attribute that contributes and improves the quality of the tree. If it does not split by Country or State, it means that these attributes do not have a strong correlation to the label.
If there is only one node, then the tree did not find any useful attribute.
You can try to reduce the "minimal gain" parameter of the tree to allow for splits on less significant attributes.
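To make the effect of "minimal gain" concrete, here is a minimal sketch of how a split can be accepted or rejected. It uses plain information gain on a nominal attribute; the toy data, the `LongSalePhase` column and the threshold value are assumptions for illustration, and this is not RapidMiner's actual implementation (the operator offers several criteria).

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting `labels` on a nominal attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data, not taken from the original data set.
deal = ["Won", "Won", "Lost", "Lost", "Won", "Lost", "Won", "Lost"]
country = ["FR", "DE", "FR", "DE", "FR", "FR", "DE", "DE"]
long_sale_phase = ["no", "no", "yes", "yes", "no", "yes", "no", "yes"]

MINIMAL_GAIN = 0.1  # illustrative threshold, not a recommended value

for name, column in [("Country", country), ("LongSalePhase", long_sale_phase)]:
    gain = information_gain(column, deal)
    decision = "split" if gain >= MINIMAL_GAIN else "no split"
    print(f"{name}: gain={gain:.3f} -> {decision}")
```

In this toy data each country has the same Won/Lost mix as the whole set, so its gain is zero and no threshold would allow the split, while the hypothetical `LongSalePhase` column separates the classes perfectly.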
You should probably also try another learning algorithm that may be better suited for your data.
Did you validate your tree with a cross validation to see how well it performs?
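As a rough illustration of what a cross validation measures, the sketch below evaluates the simplest possible model: a single-node tree that always predicts the majority class of its training data. The label column is hypothetical, and the folds are contiguous and unshuffled to keep the example short; a real validation would normally shuffle or stratify the rows.

```python
from collections import Counter


def k_fold_indices(n_rows, k):
    """Yield (train, test) index lists for k contiguous folds."""
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_rows) if i < start or i >= start + size]
        yield train, test
        start += size


def majority_baseline_accuracy(labels, k=5):
    """Cross-validated accuracy of always predicting the training majority class."""
    correct = 0
    for train, test in k_fold_indices(len(labels), k):
        majority = Counter(labels[i] for i in train).most_common(1)[0][0]
        correct += sum(1 for i in test if labels[i] == majority)
    return correct / len(labels)


# Hypothetical label column: 7 "Lost" and 3 "Won" deals.
deal = ["Lost", "Lost", "Won", "Lost", "Lost", "Won", "Lost", "Lost", "Won", "Lost"]
print(majority_baseline_accuracy(deal, k=5))
```

This baseline is a useful yardstick: a tree with real splits is only worth keeping if its cross-validated accuracy clearly beats it.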
Best regards,
~Marius
DaysInSalePhase (integer),
MonthlySales (integer) - particularly this one.
Is it possible to return to the source data and calculate more attributes?
LastQuarterSales (integer)
LastMonthSales (integer)
MonthlySales (integer)
Try to increase the amount of information available to the model.
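A minimal sketch of deriving such attributes from the source data, assuming the raw sales are available per customer as a mapping from (year, month) to an amount. The data layout, the function names and the definition of "last quarter" as the three preceding calendar months are all assumptions for illustration.

```python
from datetime import date


def months_back(ref, n):
    """Return (year, month) for the month n months before `ref`."""
    index = ref.year * 12 + (ref.month - 1) - n
    return index // 12, index % 12 + 1


def derive_sales_attributes(monthly_sales, ref):
    """Build LastMonthSales and LastQuarterSales from a {(year, month): amount} map."""
    last_month = monthly_sales.get(months_back(ref, 1), 0)
    last_quarter = sum(monthly_sales.get(months_back(ref, n), 0) for n in (1, 2, 3))
    return {"LastMonthSales": last_month, "LastQuarterSales": last_quarter}


# Hypothetical sales history for one customer.
history = {(2023, 11): 120, (2023, 12): 80, (2024, 1): 150, (2024, 2): 60}
print(derive_sales_attributes(history, date(2024, 2, 15)))
```

The resulting columns can then be joined back onto the example set as additional integer attributes.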
You can add more parameters to optimize, but compute time grows quickly: each additional parameter multiplies the number of combinations that have to be evaluated. Experiment and record the settings and results until you get a tree you are satisfied with.
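To see how quickly the number of runs grows, here is a small sketch that enumerates a parameter grid. The parameter names follow common decision tree settings and the candidate values are assumptions chosen only for illustration.

```python
from itertools import product

# Illustrative candidate values, not recommendations.
grid = {
    "minimal_gain": [0.0, 0.01, 0.05, 0.1],
    "maximal_depth": [3, 5, 10],
    "minimal_leaf_size": [2, 5, 10],
}


def grid_combinations(grid):
    """Return every parameter combination as a list of dicts."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]


combos = grid_combinations(grid)
print(len(combos))  # 4 * 3 * 3 = 36 runs

# Adding one more parameter with 4 candidate values multiplies the run count by 4.
grid["confidence"] = [0.1, 0.25, 0.5, 0.75]
print(len(grid_combinations(grid)))  # 36 * 4 = 144 runs
```

Each dict in `combos` is one setting to try; storing it next to its validation result gives the record of experiments suggested above.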
J.