How to use local features such as pixel values in an image classification setup?
Dear All,
I'm using the awesome RapidMiner image plugin (IMMI).
http://splab.cz/en/research/data-mining/articles
I'm wondering how to use local features such as pixel values in an image classification setup.
See my project setup below.
If needed I can also provide the data that I'm using.
Best regards,
Wessel
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.008">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
<process expanded="true" height="667" width="570">
<operator activated="true" class="imageprocessing:multiple_color_image_opener" compatibility="1.4.001" expanded="true" height="60" name="MCIO" width="90" x="45" y="30">
<list key="images">
<parameter key="6" value="C:\Users\wessel\Desktop\data\6"/>
<parameter key="7" value="C:\Users\wessel\Desktop\data\7"/>
<parameter key="8" value="C:\Users\wessel\Desktop\data\8"/>
</list>
<parameter key="assign_label" value="true"/>
<process expanded="true" height="649" width="433">
<operator activated="true" class="imageprocessing:global_feature_extraction" compatibility="1.4.001" expanded="true" height="60" name="Global Feature Extractor from a Single Image" width="90" x="246" y="30">
<process expanded="true" height="667" width="433">
<operator activated="false" class="imageprocessing:statistics" compatibility="1.4.001" expanded="true" height="60" name="Global statistics" width="90" x="45" y="30">
<parameter key="Center of Mass" value="true"/>
<parameter key="Thickness" value="196"/>
</operator>
<operator activated="false" class="imageprocessing:BIC" compatibility="1.4.001" expanded="true" height="76" name="BIC" width="90" x="45" y="120"/>
<operator activated="true" class="imageprocessing:histogram" compatibility="1.4.001" expanded="true" height="60" name="histogram" width="90" x="45" y="210">
<parameter key="Bins" value="232"/>
</operator>
<operator activated="false" class="imageprocessing:dLog_distance" compatibility="1.4.001" expanded="true" height="60" name="dLog" width="90" x="179" y="120"/>
<operator activated="true" class="imageprocessing:color_to_grayscale" compatibility="1.4.001" expanded="true" height="60" name="Color to grayscale" width="90" x="45" y="300"/>
<operator activated="true" class="imageprocessing:skeletonize" compatibility="1.4.001" expanded="true" height="60" name="Skeletonize" width="90" x="163" y="300"/>
<operator activated="true" class="imageprocessing:obcf" compatibility="1.4.001" expanded="true" height="60" name="OBCF" width="90" x="313" y="300"/>
<connect from_port="color image plus 1" to_op="histogram" to_port="color image plus"/>
<connect from_port="color image plus 2" to_op="Color to grayscale" to_port="color image plus"/>
<connect from_op="BIC" from_port="grayscale image plus" to_op="dLog" to_port="grayscale image plus Hist"/>
<connect from_op="histogram" from_port="features" to_port="feature 1"/>
<connect from_op="Color to grayscale" from_port="grayscale image" to_op="Skeletonize" to_port="grayscale image plus"/>
<connect from_op="Skeletonize" from_port="grayscale image plus" to_op="OBCF" to_port="grayscale image plus"/>
<connect from_op="OBCF" from_port="features" to_port="feature 2"/>
<portSpacing port="source_color image plus 1" spacing="0"/>
<portSpacing port="source_color image plus 2" spacing="0"/>
<portSpacing port="source_color image plus 3" spacing="0"/>
<portSpacing port="sink_feature 1" spacing="162"/>
<portSpacing port="sink_feature 2" spacing="72"/>
<portSpacing port="sink_feature 3" spacing="0"/>
</process>
</operator>
<connect from_port="color image plus" to_op="Global Feature Extractor from a Single Image" to_port="color image plus"/>
<connect from_op="Global Feature Extractor from a Single Image" from_port="example set" to_port="Example set"/>
<portSpacing port="source_color image plus" spacing="0"/>
<portSpacing port="sink_Example set" spacing="0"/>
</process>
</operator>
<operator activated="true" class="normalize" compatibility="5.2.008" expanded="true" height="94" name="Normalize" width="90" x="180" y="30"/>
<operator activated="true" class="optimize_selection_forward" compatibility="5.2.008" expanded="true" height="94" name="Forward Selection" width="90" x="315" y="30">
<process expanded="true" height="667" width="300">
<operator activated="true" class="x_validation" compatibility="5.2.008" expanded="true" height="112" name="InnerValidation" width="90" x="45" y="30">
<process expanded="true" height="667" width="219">
<operator activated="true" class="weka:W-KStar" compatibility="5.1.001" expanded="true" height="76" name="W-KStar (2)" width="90" x="99" y="30"/>
<connect from_port="training" to_op="W-KStar (2)" to_port="training set"/>
<connect from_op="W-KStar (2)" from_port="model" to_port="model"/>
<portSpacing port="source_training" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true" height="667" width="287">
<operator activated="true" class="apply_model" compatibility="5.2.008" expanded="true" height="76" name="Apply Model (2)" width="90" x="45" y="30">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance_classification" compatibility="5.2.008" expanded="true" height="76" name="Performance (2)" width="90" x="167" y="30">
<list key="class_weights"/>
</operator>
<connect from_port="model" to_op="Apply Model (2)" to_port="model"/>
<connect from_port="test set" to_op="Apply Model (2)" to_port="unlabelled data"/>
<connect from_op="Apply Model (2)" from_port="labelled data" to_op="Performance (2)" to_port="labelled data"/>
<connect from_op="Performance (2)" from_port="performance" to_port="averagable 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_averagable 1" spacing="0"/>
<portSpacing port="sink_averagable 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="log" compatibility="5.2.008" expanded="true" height="76" name="Log" width="90" x="180" y="30">
<list key="log">
<parameter key="p" value="operator.InnerValidation.value.performance"/>
<parameter key="f" value="operator.Forward Selection.value.feature_names"/>
</list>
</operator>
<connect from_port="example set" to_op="InnerValidation" to_port="training"/>
<connect from_op="InnerValidation" from_port="averagable 1" to_op="Log" to_port="through 1"/>
<connect from_op="Log" from_port="through 1" to_port="performance"/>
<portSpacing port="source_example set" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
</process>
</operator>
<operator activated="true" class="select_by_weights" compatibility="5.2.008" expanded="true" height="94" name="Select by Weights" width="90" x="450" y="30"/>
<operator activated="true" class="x_validation" compatibility="5.2.008" expanded="true" height="112" name="Validation" width="90" x="45" y="120">
<process expanded="true" height="667" width="212">
<operator activated="true" class="weka:W-KStar" compatibility="5.1.001" expanded="true" height="76" name="W-KStar" width="90" x="112" y="30"/>
<connect from_port="training" to_op="W-KStar" to_port="training set"/>
<connect from_op="W-KStar" from_port="model" to_port="model"/>
<portSpacing port="source_training" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true" height="667" width="300">
<operator activated="true" class="apply_model" compatibility="5.2.008" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance_classification" compatibility="5.2.008" expanded="true" height="76" name="Performance" width="90" x="180" y="30">
<list key="class_weights"/>
</operator>
<connect from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_averagable 1" spacing="0"/>
<portSpacing port="sink_averagable 2" spacing="0"/>
</process>
</operator>
<connect from_op="MCIO" from_port="example set" to_op="Normalize" to_port="example set input"/>
<connect from_op="Normalize" from_port="example set output" to_op="Forward Selection" to_port="example set"/>
<connect from_op="Forward Selection" from_port="example set" to_op="Select by Weights" to_port="example set input"/>
<connect from_op="Forward Selection" from_port="attribute weights" to_op="Select by Weights" to_port="weights"/>
<connect from_op="Select by Weights" from_port="example set output" to_op="Validation" to_port="training"/>
<connect from_op="Validation" from_port="averagable 1" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="126"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
Answers
IMMI is not designed to use local features for classification, because each local feature becomes one example. I tried to add local features to your process, but I didn't test it. It tries to convert an example set with 100 examples into a single example and then combine that with the global features. I think you can design a better solution than mine.
Best,
Václav
P.S.
I deleted some parts because a forum message can have "only" 20 000 characters.
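The conversion described above (many local-feature examples collapsed into one example per image, then joined with the global features) can be sketched outside RapidMiner as follows. This is only an illustration in Python with NumPy, not the IMMI implementation. The function name `flatten_local_features`, the three attributes per local feature, and the five global features are assumptions made for the example; only the count of 100 local examples comes from the post.

```python
import numpy as np

def flatten_local_features(local_rows, global_features):
    """Turn one image's local-feature rows into a single example.

    local_rows: array of shape (n_local, n_attrs), one row per local feature.
    global_features: 1-D array of global features for the same image.
    Returns a 1-D vector: all local rows laid end to end, then the globals.
    """
    local_rows = np.asarray(local_rows, dtype=float)
    global_features = np.asarray(global_features, dtype=float)
    return np.concatenate([local_rows.ravel(), global_features])

# Hypothetical data: 100 local examples with 3 attributes each, 5 global features.
rng = np.random.default_rng(0)
local_rows = rng.random((100, 3))
global_features = rng.random(5)

example = flatten_local_features(local_rows, global_features)
print(example.shape)  # (305,)
```

For the resulting columns to be comparable across images, every image has to yield the same number of local features in the same order, for example by sampling pixels on a fixed grid.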
I uploaded my data-set here: http://www.few.vu.nl/~wln320/data.zip
This dataset is generated using a CAPTCHA script.
The task is to figure out the number of symbols in the CAPTCHA image (labels: 6-symbols, 7-symbols, 8-symbols).
This data-set is synthetic and free of noise, so a kappa score of 100% should be possible.
Any suggestions on what approach would be good for this task?
Using the very simple setup pasted below I managed to get a kappa score of around 0.4 (which corresponds to about 60% accuracy).
I tried an approach similar to eigenfaces which gets a kappa of around 0.5.
It should be possible to get a kappa of at least 0.9.
Best regards,
Wessel
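For readers unfamiliar with the eigenfaces idea mentioned above: it amounts to using the raw pixel values of each image as one long feature vector and then reducing that vector with PCA. The sketch below shows this in Python with NumPy. It is not the process that produced the kappa of 0.5; the image size (20 x 60), the number of images, the number of components, and the helper names `pixel_features` and `pca_project` are all assumptions, and the images are random noise standing in for the CAPTCHA data.

```python
import numpy as np

def pixel_features(images):
    """Flatten equally sized grayscale images (n, h, w) into (n, h*w) pixel vectors."""
    images = np.asarray(images, dtype=float)
    return images.reshape(images.shape[0], -1)

def pca_project(X, n_components):
    """Project the rows of X onto the top principal components."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    return Xc @ components.T, components, mean

# Stand-in data: 30 random "images" of 20 x 60 pixels (hypothetical size).
rng = np.random.default_rng(0)
images = rng.random((30, 20, 60))

X = pixel_features(images)                      # one row of pixel values per image
Z, components, mean = pca_project(X, n_components=10)
print(X.shape, Z.shape)
```

When this is used inside a cross-validation, the mean and the components should be computed on the training fold only and then applied to the test fold; otherwise the performance estimate is optimistic.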
I tried some processes but still had no luck. The problem is that some letters are open, so preprocessing such as filling holes cannot be used. My results are:
kappa: 0.405 +/- 0.189 (mikro: 0.406)
                true 6    true 7    true 8    class precision
pred. 6             43        16         4    68.25%
pred. 7              3         5         4    41.67%
pred. 8              7        26        46    58.23%
class recall    81.13%    10.64%    85.19%
If you have some tips for new feature extractors or preprocessing operators that are suitable for this task, I can add them.
Best,
Václav
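The kappa and accuracy figures quoted in this thread can be checked directly from the confusion matrix above. The short Python/NumPy sketch below recomputes them using the standard definition of Cohen's kappa; the function name `cohen_kappa` is just a label for this example. Rows are predicted classes and columns are true classes, as in the table.

```python
import numpy as np

# Rows = predicted class (6, 7, 8), columns = true class (6, 7, 8).
cm = np.array([[43, 16,  4],
               [ 3,  5,  4],
               [ 7, 26, 46]])

def cohen_kappa(cm):
    """Cohen's kappa: agreement beyond what the row/column totals give by chance."""
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    p_observed = np.trace(cm) / n
    p_expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2
    return (p_observed - p_expected) / (1 - p_expected)

accuracy = np.trace(cm) / cm.sum()
kappa = cohen_kappa(cm)
recall = np.diag(cm) / cm.sum(axis=0)      # per true class (column totals)
precision = np.diag(cm) / cm.sum(axis=1)   # per predicted class (row totals)

print(f"accuracy={accuracy:.3f} kappa={kappa:.3f}")  # accuracy=0.610 kappa=0.406
```

The pooled matrix reproduces the "mikro" kappa of 0.406 and the roughly 60% accuracy mentioned earlier. It also shows where the errors concentrate: only 5 of the 47 seven-symbol images are recognized, and 26 of them are predicted as eight-symbol images.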