Storing Files as Blobs in Example Sets

JEdward (RapidMiner Certified Analyst, RapidMiner Certified Expert, Member; Posts: 574; Unicorn)
edited November 2018 in Help
Hi guys,

I want to store images as blobs in a database table for easy access, but I couldn't work out a straightforward way of doing it.
Basically, I want to work with a table of products like this:

ID | Product | Image
1 | Clothes | [blob]
2 | Boots | [blob]
3 | Motorcycle | [blob]

The only way I came up with was to use the Cryptography extension to encrypt the file with Base64 output and store the result as text, then decrypt it the same way to get the file back.
Whilst I can see this being quite useful for some purposes (for example, if I'm doing facial recognition or document analysis of images and want those files to be encrypted for extra security), I'm sure there must be a simpler way.
See below for an example process.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="6.4.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="6.4.000" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="open_file" compatibility="6.4.000" expanded="true" height="60" name="Open File" width="90" x="45" y="75">
        <parameter key="filename" value="/home/john/Desktop/test.jpeg"/>
      </operator>
      <operator activated="true" class="cryptography:pbe_encrypt_file" compatibility="1.1.000" expanded="true" height="60" name="Encrypt File (Password)" width="90" x="179" y="120">
        <parameter key="password" value="sPmh12yrKw0="/>
        <parameter key="base64" value="true"/>
      </operator>
      <operator activated="true" class="text:read_document" compatibility="6.4.001" expanded="true" height="60" name="Read Document" width="90" x="179" y="30">
        <parameter key="extract_text_only" value="false"/>
      </operator>
      <operator activated="true" breakpoints="after" class="text:documents_to_data" compatibility="6.4.001" expanded="true" height="76" name="Documents to Data" width="90" x="313" y="30">
        <parameter key="text_attribute" value="imagedata"/>
      </operator>
      <operator activated="true" class="extract_macro" compatibility="6.4.000" expanded="true" height="60" name="Extract Macro" width="90" x="380" y="120">
        <parameter key="macro" value="imagedata"/>
        <parameter key="macro_type" value="data_value"/>
        <parameter key="attribute_name" value="imagedata"/>
        <parameter key="example_index" value="1"/>
        <list key="additional_macros"/>
      </operator>
      <operator activated="true" class="text:create_document" compatibility="6.4.001" expanded="true" height="60" name="Create Document" width="90" x="246" y="300">
        <parameter key="text" value="%{imagedata}"/>
      </operator>
      <operator activated="true" class="text:write_document" compatibility="6.4.001" expanded="true" height="76" name="Write Document" width="90" x="380" y="300"/>
      <operator activated="true" class="cryptography:pbe_decrypt_file" compatibility="1.1.000" expanded="true" height="60" name="Decrypt File (Password)" width="90" x="514" y="255">
        <parameter key="password" value="sPmh12yrKw0="/>
        <parameter key="base64" value="true"/>
      </operator>
      <connect from_op="Open File" from_port="file" to_op="Encrypt File (Password)" to_port="file input"/>
      <connect from_op="Encrypt File (Password)" from_port="file output" to_op="Read Document" to_port="file"/>
      <connect from_op="Read Document" from_port="output" to_op="Documents to Data" to_port="documents 1"/>
      <connect from_op="Documents to Data" from_port="example set" to_op="Extract Macro" to_port="example set"/>
      <connect from_op="Create Document" from_port="output" to_op="Write Document" to_port="document"/>
      <connect from_op="Write Document" from_port="file" to_op="Decrypt File (Password)" to_port="file input"/>
      <connect from_op="Decrypt File (Password)" from_port="file output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
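
For comparison, here is the Base64 part of that round trip without the encryption step, as a minimal Python sketch outside RapidMiner. It is only an illustration of the idea: the helper names `file_bytes_to_text` and `text_to_file_bytes` are made up for this example, and the image bytes are a stand-in rather than a real JPEG.

```python
import base64


def file_bytes_to_text(data: bytes) -> str:
    """Encode raw file bytes as ASCII Base64 text that fits in a text column."""
    return base64.b64encode(data).decode("ascii")


def text_to_file_bytes(text: str) -> bytes:
    """Decode Base64 text back into the original file bytes."""
    return base64.b64decode(text)


# Stand-in for the contents of an image file (hypothetical data, not a real JPEG).
image_bytes = bytes([0xFF, 0xD8, 0xFF, 0xE0]) + b"placeholder image payload"

# One row of the product table, with the image held as plain text.
row = {"ID": 1, "Product": "Clothes", "Image": file_bytes_to_text(image_bytes)}

# Reading the row back and decoding restores the exact bytes.
restored = text_to_file_bytes(row["Image"])
assert restored == image_bytes
```

Base64 uses four text characters for every three bytes, so the stored text is roughly a third larger than the original file.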

Answers

  • mschmitz (Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor; Posts: 3,127; RM Data Scientist)
    Hi,

    Have you thought about storing a collection of file objects? Then you would have two "tables":

    1. A collection of pictures
    2. The rest of the table

    You could then join them when needed.

    I am not sure whether this solves your problem, but it is at least an idea.
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
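
The two-"tables" idea from the answer above can be pictured with a small Python sketch. This is a plain Python stand-in, not RapidMiner operators: the function name `join_on_id`, the dictionary layout, and the byte strings are all assumptions made for illustration.

```python
# "The rest of the table": ordinary product rows without any image data.
products = [
    {"ID": 1, "Product": "Clothes"},
    {"ID": 2, "Product": "Boots"},
    {"ID": 3, "Product": "Motorcycle"},
]

# "A collection of pictures": file contents kept separately, keyed by product ID.
# The byte strings are placeholders, not real image data.
images = {
    1: b"clothes-bytes",
    2: b"boots-bytes",
    3: b"motorcycle-bytes",
}


def join_on_id(rows, files_by_id):
    """Attach each row's file by ID; rows without a file get None."""
    return [{**row, "Image": files_by_id.get(row["ID"])} for row in rows]


joined = join_on_id(products, images)
for row in joined:
    print(row["ID"], row["Product"], len(row["Image"]), "bytes")
```

Keeping the files apart means the product rows stay small, and the images are only attached when a step actually needs them.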