"Error using Adaboost"
inthewoods
Member Posts: 9 Contributor II
I get the following errors when I try and run a simulation with the Adaboost component:
Exception: com.rapidminer.example.AttributeTypeException
Message: Cannot map index of nominal attribute to nominal value: index 4 is out of bounds!
Stack trace:
com.rapidminer.example.table.PolynominalMapping.mapIndex(PolynominalMapping.java:137)
com.rapidminer.operator.learner.meta.AdaBoostModel.evaluateSpecialAttributes(AdaBoostModel.java:231)
com.rapidminer.operator.learner.meta.AdaBoostModel.performPrediction(AdaBoostModel.java:166)
com.rapidminer.operator.learner.PredictionModel.apply(PredictionModel.java:76)
com.rapidminer.operator.ModelApplier.doWork(ModelApplier.java:100)
com.rapidminer.operator.Operator.execute(Operator.java:771)
com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:709)
com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:368)
com.rapidminer.operator.Operator.execute(Operator.java:771)
com.rapidminer.Process.run(Process.java:899)
com.rapidminer.Process.run(Process.java:795)
com.rapidminer.Process.run(Process.java:790)
com.rapidminer.Process.run(Process.java:780)
com.rapidminer.gui.ProcessThread.run(ProcessThread.java:62)
I've submitted the bug, but I was wondering if anyone has any insight into what I'm doing wrong. Here's the setup:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.0.11" expanded="true" name="Process">
<process expanded="true" height="341" width="605">
<operator activated="true" class="retrieve" compatibility="5.0.11" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
<parameter key="repository_entry" value="//NewLocalRepository/SPY_test_data"/>
</operator>
<operator activated="true" class="retrieve" compatibility="5.0.11" expanded="true" height="60" name="Retrieve (2)" width="90" x="99" y="164">
<parameter key="repository_entry" value="//NewLocalRepository/SPY_apply_model"/>
</operator>
<operator activated="true" class="series:windowing" compatibility="5.0.2" expanded="true" height="76" name="Windowing" width="90" x="179" y="30">
<parameter key="horizon" value="1"/>
<parameter key="window_size" value="1"/>
<parameter key="create_label" value="true"/>
<parameter key="label_attribute" value="ROC-1"/>
</operator>
<operator activated="true" class="adaboost" compatibility="5.0.11" expanded="true" height="76" name="AdaBoost" width="90" x="357" y="35">
<process expanded="true" height="315" width="605">
<operator activated="true" class="parallel:decision_tree_weight_based_parallel" compatibility="5.0.1" expanded="true" height="60" name="DecisionTree (Weight-Based)" width="90" x="243" y="58">
<process expanded="true">
<portSpacing port="source_training set" spacing="0"/>
<portSpacing port="sink_weights" spacing="0"/>
</process>
</operator>
<connect from_port="training set" to_op="DecisionTree (Weight-Based)" to_port="training set"/>
<connect from_op="DecisionTree (Weight-Based)" from_port="model" to_port="model"/>
<portSpacing port="source_training set" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
</process>
</operator>
<operator activated="true" class="apply_model" compatibility="5.0.11" expanded="true" height="76" name="Apply Model (2)" width="90" x="380" y="210">
<list key="application_parameters"/>
</operator>
<connect from_op="Retrieve" from_port="output" to_op="Windowing" to_port="example set input"/>
<connect from_op="Retrieve (2)" from_port="output" to_op="Apply Model (2)" to_port="unlabelled data"/>
<connect from_op="Windowing" from_port="example set output" to_op="AdaBoost" to_port="training set"/>
<connect from_op="AdaBoost" from_port="model" to_op="Apply Model (2)" to_port="model"/>
<connect from_op="Apply Model (2)" from_port="labelled data" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
Thanks!
Answers
I guess you have different nominal values in your two data sets. I admit this shouldn't cause any problems, but if you first combine both data sets and then split them into a training set and a test set, this error won't happen.
Greetings,
Sebastian
Actually, nominal values have nothing to do with math: nominal values are non-numerical values such as words. What can happen is this:
You have a training data set that contains examples about things of two different colors, such as "red" and "green". But what happens if the color "blue" then appears in the test set? This value isn't known to any model, because the model simply can't know that it exists. This is a general problem, and all the model could (and definitely should) do is throw a better and more detailed error message.
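To make the color example concrete, here is a minimal Python sketch of how an index lookup can fail when the test data knows a value the training data never saw. The `NominalMapping` class and its methods are hypothetical stand-ins written for illustration; this is not RapidMiner's actual `PolynominalMapping` code, only a guess at the kind of mismatch the error message describes.

```python
class NominalMapping:
    """Hypothetical stand-in for a nominal attribute's index <-> value mapping."""

    def __init__(self, values):
        # Keep each distinct value once, in order of first appearance.
        self.values = []
        for value in values:
            if value not in self.values:
                self.values.append(value)

    def index_of(self, value):
        # Return the integer index assigned to a nominal value.
        return self.values.index(value)

    def map_index(self, index):
        # Return the nominal value for an index, or fail if it is unknown.
        if not 0 <= index < len(self.values):
            raise IndexError(f"index {index} is out of bounds")
        return self.values[index]


# The training data only ever saw two colors.
train_mapping = NominalMapping(["red", "green"])
# The test data contains a third one.
test_mapping = NominalMapping(["red", "green", "blue"])

# "blue" gets an index in the test mapping...
blue_index = test_mapping.index_of("blue")

# ...but that index does not exist in the mapping built from the training data.
try:
    train_mapping.map_index(blue_index)
    lookup_failed = False
except IndexError:
    lookup_failed = True
```

In this sketch the lookup fails for the same reason as in the color example: the index refers to a value that only one of the two mappings contains.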
To avoid this problem, append one data set to the other and split the result again. Both data sets then know which values exist in the combined data, and the model will cope with this.
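The append-then-split idea can be sketched in plain Python as follows. The variable names and the dictionary-based encoding are assumptions made for illustration; in RapidMiner the same effect would come from combining the two example sets and splitting them again, not from code like this.

```python
# Hypothetical nominal column from the two data sets.
train_colors = ["red", "green", "red"]
test_colors = ["green", "blue"]

# Step 1: append one data set to the other.
combined = train_colors + test_colors

# Step 2: build a single value -> index mapping from the combined data.
# dict.fromkeys keeps each distinct value once, in order of first appearance.
shared_values = list(dict.fromkeys(combined))
shared_index = {value: i for i, value in enumerate(shared_values)}

# Step 3: split again; both parts now use the same mapping.
split_point = len(train_colors)
train_encoded = [shared_index[v] for v in combined[:split_point]]
test_encoded = [shared_index[v] for v in combined[split_point:]]

# Every index used by the test part is valid in the shared mapping.
all_valid = all(0 <= i < len(shared_values) for i in test_encoded)
```

Note that the training examples still contain no "blue" rows, so a model has nothing to learn about that value; the shared mapping only guarantees that every index can be resolved.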
Anyway, I will look into the problem causing the crash right now.
Greetings,
Sebastian