[SOLVED] Model from Generate Script generates no values
Hi,
I create a PCAModel in Groovy (in an Execute Script operator). It seems to be OK in the model output, as I see the same values as in the normal PCA operator's model. But when I try to apply that model to the same dataset, I get nothing but missing values. I guess I did something obviously wrong, but I do not see where the problem is. Do you have any idea?
Here is my process (sorry, it is intentionally large for testing purposes):
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.008">
<context>
<input>
<location>//Samples/data/Polynomial</location>
</input>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
<process expanded="true" height="251" width="681">
<operator activated="true" class="multiply" compatibility="5.2.008" expanded="true" height="112" name="Multiply" width="90" x="45" y="30"/>
<operator activated="true" class="execute_script" compatibility="5.2.008" expanded="true" height="76" name="Execute Script" width="90" x="246" y="30">
<parameter key="script" value="import com.rapidminer.operator.features.transformation.PCAModel; /*String macroName = &quot;temp_path&quot;; Attribute macroValueAttribute = AttributeFactory.createAttribute(&quot;macroValue&quot;, com.rapidminer.tools.Ontology.NOMINAL); String macroValue = operator.getProcess().macroHandler.getMacro(macroName); macroValueAttribute.setMapping( new com.rapidminer.example.table.PolynominalMapping( //Collections.singletonMap(Integer.valueOf(0), macroValue) [0:macroValue] )); ExampleTable table = new MemoryExampleTable(macroValueAttribute); table.addDataRow(new IntArrayDataRow(0)); ExampleSet ret = new SimpleExampleSet(table); //String macroValue = operator.getProcess().macroHandler.getMacro(macroName); return [ret] as ExampleSet[];*/ ExampleSet exampleSet = input[0]; int dim = exampleSet.getAttributes().size(); /*double[] eigenValues = new double[dim]; double[][] eigenVectors = new double[dim][dim]; Random r = new Random(2); for (int i = dim; i-->0;) { eigenValues = 1.0;//11.0 - i; for (int j = dim; j-->0;) eigenVectors = 0.0;//r.nextDouble() - 0.5; eigenVectors[dim - i - 1] = 1; }*/ Jama.Matrix m = com.rapidminer.tools.math.matrix.CovarianceMatrix.getCovarianceMatrix(exampleSet); double[][] v = m.eig().getV().getArray(); Model model = new PCAModel(exampleSet, /*eigenValues*/m.eig().getRealEigenvalues(), /*eigenVectors*/v); model.setParameter(&quot;keep_attribues&quot;, &quot;true&quot;); model.setParameter(&quot;dimensionality_reduction&quot;, &quot;none&quot;); model.setParameter(&quot;number_of_components&quot;, Integer.toString(dim)); model.setParameter(&quot;variance_threshold&quot;, &quot;1.0&quot;); model.setNumberOfComponents(dim); return [model] as Model[];"/>
</operator>
<operator activated="true" class="principal_component_analysis" compatibility="5.2.008" expanded="true" height="94" name="PCA" width="90" x="246" y="120"/>
<operator activated="true" class="apply_model" compatibility="5.2.008" expanded="true" height="76" name="Apply Model (2)" width="90" x="380" y="165">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="apply_model" compatibility="5.2.008" expanded="true" height="76" name="Apply Model" width="90" x="380" y="30">
<list key="application_parameters"/>
</operator>
<connect from_port="input 1" to_op="Multiply" to_port="input"/>
<connect from_op="Multiply" from_port="output 1" to_op="Execute Script" to_port="input 1"/>
<connect from_op="Multiply" from_port="output 2" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Multiply" from_port="output 3" to_op="PCA" to_port="example set input"/>
<connect from_op="Execute Script" from_port="output 1" to_op="Apply Model" to_port="model"/>
<connect from_op="PCA" from_port="example set output" to_port="result 2"/>
<connect from_op="PCA" from_port="original" to_op="Apply Model (2)" to_port="unlabelled data"/>
<connect from_op="PCA" from_port="preprocessing model" to_op="Apply Model (2)" to_port="model"/>
<connect from_op="Apply Model (2)" from_port="labelled data" to_port="result 3"/>
<connect from_op="Apply Model (2)" from_port="model" to_port="result 5"/>
<connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
<connect from_op="Apply Model" from_port="model" to_port="result 4"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
<portSpacing port="sink_result 5" spacing="0"/>
<portSpacing port="sink_result 6" spacing="0"/>
</process>
</operator>
</process>
Thanks, gabor
PS: RM 5.2 Community Edition. I was not sure whether this is a Development topic or not, so no hard feelings if this gets moved.
Answers
It is interesting that those stats (attribute means) are not checked in the model constructor, just saved. (Maybe a check for NaNs would not take too long and could report errors. Also, some Javadoc would help a bit.)
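To illustrate the kind of check meant here, below is a minimal sketch in plain Java. It is not RapidMiner code: the class `MeansCheck` and its methods are hypothetical names made up for this example, and it assumes that applying the model centers each row with the stored means before projecting it onto a component, as PCA application usually does. Under that assumption a single NaN mean is enough to turn a projected value into NaN (a missing value), and a linear scan over the means would catch it when the model is built.

```java
/** Hypothetical helper, not part of RapidMiner: sketches a NaN check on stored attribute means. */
final class MeansCheck {

    /** Returns the index of the first NaN mean, or -1 if every mean is a number. */
    static int firstNaNIndex(double[] means) {
        for (int i = 0; i < means.length; i++) {
            if (Double.isNaN(means[i])) {
                return i;
            }
        }
        return -1;
    }

    /** Fails fast instead of silently saving unusable means; returns a defensive copy. */
    static double[] requireNoNaN(double[] means) {
        int bad = firstNaNIndex(means);
        if (bad >= 0) {
            throw new IllegalArgumentException(
                    "Mean of attribute " + bad + " is NaN; were the statistics calculated?");
        }
        return means.clone();
    }

    /** Centers a row with the means and projects it onto one component (assumed PCA application step). */
    static double project(double[] row, double[] means, double[] component) {
        double sum = 0.0;
        for (int i = 0; i < row.length; i++) {
            sum += (row[i] - means[i]) * component[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] row = {1.0, 2.0};
        double[] component = {1.0, 0.0};
        // Valid means: (1.0 - 0.5) * 1.0 + (2.0 - 1.5) * 0.0
        System.out.println(project(row, new double[] {0.5, 1.5}, component)); // prints 0.5
        // One NaN mean poisons the sum even though its weight is 0.0, because NaN * 0.0 is NaN
        System.out.println(project(row, new double[] {0.5, Double.NaN}, component)); // prints NaN
    }
}
```

The scan is one pass over the means, so calling something like `requireNoNaN` in a constructor would cost very little compared to computing the eigenvectors.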