Lookup data

felix_laksana Member Posts: 12 Contributor II
edited June 2019 in Help
Hi there.


I'm new to RapidMiner and I'm having difficulty with two things:
1. Looking up a value from another data set.
Is there a lookup operator like the one in MS Excel or SSIS (MSSQL)? I have already searched for such an operator but didn't find one.

2. Is there an operator or some other way to write the data to an XML file?


Thanks,
Felix
Answers

  • homburg Moderator, Employee, Member Posts: 114 RM Data Scientist
    Hi,

    If you want to find out whether there is a certain pattern in your data, you first have to load it into RapidMiner. Once this is done, you can use a filter to match a value or a regular expression. After that, there are many ways to store your findings or to use them in further processing steps.
    What does your data look like, and what do you want to do with your findings?

    Cheers,
    Helge
  • felix_laksana Member Posts: 12 Contributor II
    Hi,

    Thanks for your help. I have actually already solved the lookup by using a left join.
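For readers who want to see what a lookup via left join amounts to outside of RapidMiner, here is a minimal Python sketch. It is not RapidMiner's Join operator; the function name `left_join_lookup` and the sample rows are hypothetical, and it assumes each key appears at most once in the lookup table.

```python
def left_join_lookup(rows, lookup_rows, key, default=None):
    """Left join: keep every row of `rows` and add the matching lookup columns."""
    # Index the lookup table by key (assumption: keys are unique in lookup_rows;
    # if they are not, the last duplicate wins).
    index = {r[key]: r for r in lookup_rows}
    lookup_columns = sorted({c for r in lookup_rows for c in r if c != key})
    joined = []
    for row in rows:
        match = index.get(row[key], {})
        merged = dict(row)
        for col in lookup_columns:
            # Rows without a match keep the default, as in a left join.
            merged[col] = match.get(col, default)
        joined.append(merged)
    return joined


# Hypothetical sample data, loosely named after the fields in this thread.
contracts = [
    {"contract_id": 1, "party_id": 10},
    {"contract_id": 2, "party_id": 99},  # no matching party
]
parties = [{"party_id": 10, "name": "Alice"}]

result = left_join_lookup(contracts, parties, key="party_id")
```

Every contract is kept; the one without a matching party simply gets the default value in the looked-up column, which is the behaviour a spreadsheet-style lookup usually needs.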

    This is what the XML data looks like:
    <?xml version="1.0" encoding="utf-8"?>
    <data>
          <contract_id>int</contract_id>
          <amt_financed>decimal</amt_financed>
          <term>int</term>
          <customer>
                <party_id>int</party_id>
                <name>string</name>
                <sex>string</sex>
                <occupation>string</occupation>
                <salary>decimal</salary>
          </customer>
          <supplier>
                <party_id>int</party_id>
                <name>string</name>
                <supplier_segment>string</supplier_segment>
          </supplier>
    </data>
    I need the XML format to support nested object types like in the example (customer might contain multiple values).
    I also need the XML to match this format exactly, because I have to upload it to a system that only accepts this format.

    Thanks :)
    Felix
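As an illustration of the nested format described in the post above, the following Python sketch builds the same element structure with the standard library's `xml.etree.ElementTree`. It is not a RapidMiner operator. The helper names and all sample values are hypothetical, and treating a list as repeated elements is only one reading of "multiple"; whether the target system accepts repeated elements or requires particular whitespace is not known from this thread.

```python
import xml.etree.ElementTree as ET


def add_children(parent, values):
    """Append one child element per key; dicts become nested elements."""
    for tag, value in values.items():
        # Assumption: a list means "repeat this element once per item".
        items = value if isinstance(value, list) else [value]
        for item in items:
            child = ET.SubElement(parent, tag)
            if isinstance(item, dict):
                add_children(child, item)
            else:
                child.text = str(item)


def build_contract_xml(contract):
    """Return the XML text for one contract, wrapped in a <data> root."""
    root = ET.Element("data")
    add_children(root, contract)
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="utf-8"?>\n' + body


# Hypothetical sample values; only the element names come from the post.
contract = {
    "contract_id": 1001,
    "amt_financed": "25000.00",
    "term": 36,
    "customer": {
        "party_id": 10,
        "name": "Alice",
        "sex": "F",
        "occupation": "Engineer",
        "salary": "5000.00",
    },
    "supplier": {"party_id": 20, "name": "Acme", "supplier_segment": "Retail"},
}

xml_text = build_contract_xml(contract)
```

The output is not indented; if the receiving system is sensitive to layout, the text would need to be formatted separately.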
  • homburg Moderator, Employee, Member Posts: 114 RM Data Scientist
    Hi Felix,

    There is no fully customizable XML output generator. Nevertheless, you can build a process that does what you need. Please install the text mining extension for RapidMiner (Help -> Updates and Extensions, search for "text mining") and open the following process, which shows what yours might look like:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.0.008">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="6.0.008" expanded="true" name="Process">
       <process expanded="true">
         <operator activated="true" class="retrieve" compatibility="6.0.008" expanded="true" height="60" name="Retrieve Golf" width="90" x="45" y="210">
           <parameter key="repository_entry" value="//Samples/data/Golf"/>
         </operator>
         <operator activated="true" class="loop_examples" compatibility="6.0.008" expanded="true" height="94" name="Loop Examples" width="90" x="246" y="210">
           <process expanded="true">
             <operator activated="true" class="loop_attributes" compatibility="6.0.008" expanded="true" height="94" name="Loop Attributes" width="90" x="246" y="120">
               <process expanded="true">
                 <operator activated="true" class="text:extract_document" compatibility="5.3.002" expanded="true" height="76" name="Extract Document" width="90" x="246" y="165">
                   <parameter key="attribute_name" value="%{loop_attribute}"/>
                   <parameter key="example_index" value="%{example}"/>
                 </operator>
                 <operator activated="true" class="text:create_document" compatibility="5.3.002" expanded="true" height="60" name="CLose Tag (2)" width="90" x="246" y="615">
                   <parameter key="text" value="&lt;/%{loop_attribute}&gt;&#10;"/>
                 </operator>
                 <operator activated="true" class="text:create_document" compatibility="5.3.002" expanded="true" height="60" name="CLose Tag (3)" width="90" x="514" y="660">
                   <parameter key="text" value="&lt;/%{loop_attribute}&gt;&#10;"/>
                 </operator>
                 <operator activated="true" class="text:create_document" compatibility="5.3.002" expanded="true" height="60" name="Open Tag" width="90" x="246" y="75">
                   <parameter key="text" value="      &lt;%{loop_attribute}&gt;"/>
                 </operator>
                 <operator activated="true" class="text:create_document" compatibility="5.3.002" expanded="true" height="60" name="CLose Tag" width="90" x="246" y="255">
                   <parameter key="text" value="&lt;/%{loop_attribute}&gt;&#10;"/>
                 </operator>
                 <operator activated="true" class="text:combine_documents" compatibility="5.3.002" expanded="true" height="112" name="Combine Documents" width="90" x="447" y="165"/>
                 <connect from_port="example set" to_op="Extract Document" to_port="example set"/>
                 <connect from_op="Extract Document" from_port="document" to_op="Combine Documents" to_port="documents 2"/>
                 <connect from_op="Open Tag" from_port="output" to_op="Combine Documents" to_port="documents 1"/>
                 <connect from_op="CLose Tag" from_port="output" to_op="Combine Documents" to_port="documents 3"/>
                 <connect from_op="Combine Documents" from_port="document" to_port="result 1"/>
                 <portSpacing port="source_example set" spacing="0"/>
                 <portSpacing port="sink_example set" spacing="0"/>
                 <portSpacing port="sink_result 1" spacing="0"/>
                 <portSpacing port="sink_result 2" spacing="0"/>
               </process>
             </operator>
             <operator activated="true" class="text:create_document" compatibility="5.3.002" expanded="true" height="60" name="Close XML" width="90" x="246" y="255">
               <parameter key="text" value="&lt;/data&gt;"/>
             </operator>
             <operator activated="true" class="text:create_document" compatibility="5.3.002" expanded="true" height="60" name="Begin XML" width="90" x="246" y="30">
               <parameter key="text" value="&lt;?xml version=&quot;1.0&quot; encoding=&quot;utf-8&quot;?&gt;&#10;&lt;data&gt;&#10;"/>
             </operator>
             <operator activated="true" class="text:combine_documents" compatibility="5.3.002" expanded="true" height="112" name="Combine Documents (2)" width="90" x="447" y="120"/>
             <operator activated="true" class="text:write_document" compatibility="5.3.002" expanded="true" height="76" name="Write Document" width="90" x="581" y="120">
               <parameter key="file" value="C:\Users\hhomburg\Documents\out_%{example}.xml"/>
             </operator>
             <connect from_port="example set" to_op="Loop Attributes" to_port="example set"/>
             <connect from_op="Loop Attributes" from_port="result 1" to_op="Combine Documents (2)" to_port="documents 2"/>
             <connect from_op="Close XML" from_port="output" to_op="Combine Documents (2)" to_port="documents 3"/>
             <connect from_op="Begin XML" from_port="output" to_op="Combine Documents (2)" to_port="documents 1"/>
             <connect from_op="Combine Documents (2)" from_port="document" to_op="Write Document" to_port="document"/>
             <connect from_op="Write Document" from_port="document" to_port="output 1"/>
             <portSpacing port="source_example set" spacing="0"/>
             <portSpacing port="sink_example set" spacing="0"/>
             <portSpacing port="sink_output 1" spacing="0"/>
             <portSpacing port="sink_output 2" spacing="0"/>
           </process>
         </operator>
         <connect from_op="Retrieve Golf" from_port="output" to_op="Loop Examples" to_port="example set"/>
         <connect from_op="Loop Examples" from_port="output 1" to_port="result 1"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
       </process>
     </operator>
    </process>
    Cheers,
    Helge
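The logic of the process above (loop over each example, wrap every attribute value in an opening and closing tag, surround the result with the XML header and a `<data>` root, then write one `out_<n>.xml` file per example) can be sketched in Python roughly as follows. This is a reading of the process, not an export of it: the function names and sample rows are hypothetical, numbering files from 1 is an assumption about the `%{example}` macro, and the value escaping is an addition the process itself does not perform.

```python
import tempfile
from pathlib import Path
from xml.sax.saxutils import escape


def row_to_xml(row):
    """Wrap each attribute value of one row in a tag named after the attribute."""
    lines = ['<?xml version="1.0" encoding="utf-8"?>', "<data>"]
    for attribute, value in row.items():
        # Six spaces of indentation mirror the "Open Tag" text in the process.
        lines.append(f"      <{attribute}>{escape(str(value))}</{attribute}>")
    lines.append("</data>")
    return "\n".join(lines)


def write_rows(rows, out_dir):
    """Write one XML file per row and return the paths that were written."""
    paths = []
    for index, row in enumerate(rows, start=1):  # assumption: numbering starts at 1
        path = Path(out_dir) / f"out_{index}.xml"
        path.write_text(row_to_xml(row), encoding="utf-8")
        paths.append(path)
    return paths


# Hypothetical rows standing in for the sample data set.
rows = [
    {"Outlook": "sunny", "Temperature": 85, "Play": "no"},
    {"Outlook": "overcast", "Temperature": 83, "Play": "yes"},
]
out_dir = tempfile.mkdtemp()
written = write_rows(rows, out_dir)
```

Like the process, this produces a flat list of elements per file; the nested customer and supplier blocks from the earlier post would need an extra grouping step.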