Parameter classification in Material Science
Hello,
My task concerns object classification in the field of materials science. I am new to RapidMiner, so I would like to draw on the broad knowledge of the community. I would be grateful for any feedback on my problem.
My problem (preliminary work excluded):
I have built an Excel list structured as follows:
- First column: Object ID (running number) – approx. 4500 objects
- Following columns: Object parameters (e.g. Area, Perimeter, …) – approx. 26 parameters
- Last column: class membership (label)
Originally I had a separate Excel list for each label/class (12 in total). I merged the data from all of them into one Excel list.
My goals:
- To find a classification method (e.g. SVM), train a model with it, and apply that model to unknown/unclassified objects to predict their class.
- To find out which of the parameters are relevant for the model (Optimize Selection).
So far, I have imported the master Excel file with all objects into RapidMiner (version 5.3) as follows:
- Object ID (running number): Integer; ID
- Object parameters: Real; Attribute
- Class (total of 12): Text; Label
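To make the layout and the role assignment concrete, here is a minimal Python sketch. It is not part of my RapidMiner process; the header names, the sample values and the function name `load_objects` are made up for illustration. It only assumes one record per object, with the ID first, the parameters in between and the label last.

```python
import csv
import io

# Made-up sample in the layout described above: ID, parameters, label.
SAMPLE = """ObjectID,Area,Perimeter,Class
1,12.5,14.2,A
2,30.1,22.8,B
3,11.9,13.7,A
"""

def load_objects(text):
    """Split each record into (id, attribute dict, label)."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    attr_names = header[1:-1]  # everything between the ID and the label
    objects = []
    for row in reader:
        obj_id = int(row[0])  # role: ID (integer)
        # role: regular attributes (real values)
        attrs = {name: float(value) for name, value in zip(attr_names, row[1:-1])}
        label = row[-1]  # role: label (text)
        objects.append((obj_id, attrs, label))
    return attr_names, objects

attr_names, objects = load_objects(SAMPLE)
print(attr_names)
print(objects[0])
```

The point is only that the ID is kept out of the attributes, so that a learner never sees the running number as a feature.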
Based on my own research, I set up the process as follows (the process XML can be found further down):
- Main Process
o Retrieve (Excel data from the repository)
o Optimize Selection
- Evaluation Process (inside Optimize Selection)
o Validation
- Training (inside Validation)
o SVM (Linear)
- Testing (inside Validation)
o Apply Model
o Performance
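To show how I understand this nesting, here is a small Python sketch of the same structure: an outer attribute search, a cross-validation inside it, and training/testing inside each fold. It is only an illustration under loud assumptions: greedy forward selection stands in for the evolutionary search, a nearest-centroid classifier stands in for the SVM, and all function names and the synthetic data are made up.

```python
import random

def nearest_centroid_fit(rows, labels):
    """Training step (stand-in for the SVM): one mean vector per class."""
    sums, counts = {}, {}
    for row, lab in zip(rows, labels):
        acc = sums.setdefault(lab, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def nearest_centroid_predict(model, row):
    """Apply Model step: pick the class whose centroid is closest."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(model, key=lambda lab: dist(model[lab]))

def cv_accuracy(rows, labels, cols, k=5, seed=0):
    """Validation step: k-fold cross-validated accuracy on the chosen columns."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for fold in range(k):
        test = idx[fold::k]
        test_set = set(test)
        train = [i for i in idx if i not in test_set]
        model = nearest_centroid_fit(
            [[rows[i][c] for c in cols] for i in train],
            [labels[i] for i in train],
        )
        for i in test:
            pred = nearest_centroid_predict(model, [rows[i][c] for c in cols])
            correct += pred == labels[i]
    return correct / len(rows)

def forward_selection(rows, labels, k=5):
    """Outer loop (stand-in for Optimize Selection): greedy forward search."""
    remaining = list(range(len(rows[0])))
    chosen, best = [], 0.0
    while remaining:
        score, col = max(
            (cv_accuracy(rows, labels, chosen + [c], k), c) for c in remaining
        )
        if score <= best:
            break  # no remaining column improves the validated accuracy
        chosen.append(col)
        remaining.remove(col)
        best = score
    return chosen, best

# Synthetic data: column 0 separates the three classes, column 1 is pure noise.
rng = random.Random(1)
rows, labels = [], []
for lab, centre in (("A", 0.0), ("B", 5.0), ("C", 10.0)):
    for _ in range(30):
        rows.append([centre + rng.gauss(0, 0.5), rng.uniform(0, 10)])
        labels.append(lab)

chosen, score = forward_selection(rows, labels)
print(chosen, round(score, 3))
```

Every candidate attribute subset is scored by a full cross-validation, which is what placing Validation inside Optimize Selection amounts to.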
Is my approach correct? How would you structure the process to solve this problem?
If more information is needed I will provide it.
Thanks for any help and responses.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.015">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.3.015" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="5.3.015" expanded="true" height="60" name="Retrieve MasterExcel" width="90" x="45" y="75">
<parameter key="repository_entry" value="../MasterExcel/MasterExcel"/>
</operator>
<operator activated="true" class="optimize_selection_evolutionary" compatibility="5.3.015" expanded="true" height="94" name="Optimize Selection (Evolutionary)" width="90" x="246" y="75">
<process expanded="true">
<operator activated="true" class="x_validation" compatibility="5.3.015" expanded="true" height="112" name="Validation" width="90" x="45" y="30">
<process expanded="true">
<operator activated="true" class="support_vector_machine_linear" compatibility="5.3.015" expanded="true" height="76" name="SVM (Linear)" width="90" x="45" y="30"/>
<connect from_port="training" to_op="SVM (Linear)" to_port="training set"/>
<connect from_op="SVM (Linear)" from_port="model" to_port="model"/>
<portSpacing port="source_training" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" class="apply_model" compatibility="5.3.015" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance" compatibility="5.3.015" expanded="true" height="76" name="Performance" width="90" x="180" y="30"/>
<connect from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_averagable 1" spacing="0"/>
<portSpacing port="sink_averagable 2" spacing="0"/>
</process>
</operator>
<connect from_port="example set" to_op="Validation" to_port="training"/>
<connect from_op="Validation" from_port="averagable 1" to_port="performance"/>
<portSpacing port="source_example set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
</process>
</operator>
<connect from_op="Retrieve MasterExcel" from_port="output" to_op="Optimize Selection (Evolutionary)" to_port="example set in"/>
<connect from_op="Optimize Selection (Evolutionary)" from_port="weights" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>