Polynomial Logistic Regression weird convergence issue? Bug?
Hi!
I'm wondering if anyone could share some insight into this problem. I'm trying to run an evolutionary search of parameters for the mentioned operator, but it won't converge.
I've been trying to find out why, but ran into some trouble pinning it down. Take the sonar data example proposed by the logistic regression operator:
1) I simply changed the kernel type to polynomial and set the degree to 12. This runs instantly on my PC, but when I change the degree to 13, it doesn't converge even after 5 minutes.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.015">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.3.015" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="5.3.015" expanded="true" height="60" name="Sonar" width="90" x="380" y="120">
        <parameter key="repository_entry" value="//Samples/data/Sonar"/>
      </operator>
      <operator activated="true" class="split_validation" compatibility="5.3.015" expanded="true" height="112" name="Validation" width="90" x="514" y="120">
        <process expanded="true">
          <operator activated="true" class="logistic_regression" compatibility="5.3.015" expanded="true" height="94" name="Logistic Regression" width="90" x="112" y="30">
            <parameter key="kernel_type" value="polynomial"/>
            <parameter key="kernel_degree" value="12.0"/>
          </operator>
          <connect from_port="training" to_op="Logistic Regression" to_port="training set"/>
          <connect from_op="Logistic Regression" from_port="model" to_port="model"/>
          <portSpacing port="source_training" spacing="0"/>
          <portSpacing port="sink_model" spacing="0"/>
          <portSpacing port="sink_through 1" spacing="0"/>
        </process>
        <process expanded="true">
          <operator activated="true" class="apply_model" compatibility="5.3.015" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
            <list key="application_parameters"/>
          </operator>
          <operator activated="true" class="performance" compatibility="5.3.015" expanded="true" height="76" name="Performance" width="90" x="179" y="30"/>
          <connect from_port="model" to_op="Apply Model" to_port="model"/>
          <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
          <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
          <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
          <portSpacing port="source_model" spacing="0"/>
          <portSpacing port="source_test set" spacing="0"/>
          <portSpacing port="source_through 1" spacing="0"/>
          <portSpacing port="sink_averagable 1" spacing="0"/>
          <portSpacing port="sink_averagable 2" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Sonar" from_port="output" to_op="Validation" to_port="training"/>
      <connect from_op="Validation" from_port="model" to_port="result 1"/>
      <connect from_op="Validation" from_port="averagable 1" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="90"/>
      <portSpacing port="sink_result 2" spacing="18"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
2) Then I set max iterations to 1 and started increasing it. With 291 max iterations it runs fine, again almost instantly. With 292 max iterations it has been running for a minute already and won't converge.
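For anyone who wants to pin down a threshold like this without stepping up one value at a time, here is a minimal Python sketch of a bisection search. It is not part of RapidMiner. `first_failing` and `finishes_in_time` are hypothetical names, and the stand-in predicate only mirrors the 291/292 numbers reported above. In practice the predicate would run the process with the given max iterations and report whether it finished within some time budget. The sketch also assumes that once a value fails, every larger value fails too, which may not hold for this bug.

```python
from typing import Callable


def first_failing(lo: int, hi: int, finishes: Callable[[int], bool]) -> int:
    """Return the smallest n in (lo, hi] for which finishes(n) is False.

    Assumes finishes(lo) is True, finishes(hi) is False, and that every
    value above the first failing one fails as well.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if finishes(mid):
            lo = mid  # mid still finishes, so the threshold is above it
        else:
            hi = mid  # mid fails, so the threshold is mid or below
    return hi


def finishes_in_time(max_iterations: int) -> bool:
    # Hypothetical stand-in that mirrors the behaviour reported in this post.
    # A real check would run the process and enforce a time budget.
    return max_iterations <= 291


print(first_failing(1, 1000, finishes_in_time))  # prints 292
```

With a range of 1 to 1000 this needs about ten runs instead of a few hundred.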
I'm not a doctor in machine learning algorithms but this looks like a bug to me ;D...
Any ideas of how to cope with this while it gets fixed?
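One generic way to cope, sketched here under assumptions rather than as a RapidMiner feature, is to run each candidate as a separate operating-system process with a time budget, and treat a timeout as a failed candidate. `run_with_budget` is a hypothetical helper. The two demo commands are Python one-liners standing in for whatever command launches the process in batch mode, which is not shown here.

```python
import subprocess
import sys
from typing import Sequence


def run_with_budget(command: Sequence[str], seconds: float) -> bool:
    """Run command as a child process; return True only if it exits
    successfully within the given number of seconds."""
    try:
        result = subprocess.run(
            command,
            timeout=seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so a run that
        # never converges does not keep consuming CPU.
        return False
    return result.returncode == 0


# Stand-ins for a fast run and a run that never returns (assumptions).
quick = [sys.executable, "-c", "pass"]
stuck = [sys.executable, "-c", "import time; time.sleep(60)"]

print(run_with_budget(quick, 10.0))  # True
print(run_with_budget(stuck, 1.0))   # False
```

A parameter search driven this way can skip a stuck combination and move on instead of hanging.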
Thanks! I hope the bug report helps.
Answers
This looks weird indeed. I have created an internal ticket for the issue. Thank you for reporting it!
Regards,
Marco
An update on anova and epach: in a real-case scenario (1000/1000 cases, 17 predictors, evo search), anova also gets stuck immediately. Epach runs, but seems to be by far the slowest of all the log. reg. kernel types.
This anova runs slowly but at least it runs; RM gets a little unresponsive, though. Maybe something overloads something else, somewhere, somehow? Some combinations of parameters seem to make the problem grow exponentially, sometimes to a point of no return.
Oh and by the way, I think this is the first time I've seen this... C takes values beyond the '10' limit, maybe it's related, or maybe just another bug, I don't know.
I forgot to mention it earlier, but this has been fixed in RM Studio 6.0.005.
Regards,
Marco