"RM cloud dropbox connect and loop files"

IljaDeCoster Member Posts: 7 Contributor II
edited June 2019 in Help
Hi all,

I have some processes using a loop_files operator that work fine locally in Studio.

I'm now moving them to the RM Cloud, connected to a Dropbox account.

How do I combine the Read Dropbox operator with loop_files (inside which I have a Read CSV)?

Thanks!
Ilja

Answers

  • JEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 578 Unicorn
    Hi Ilja,

    I'm not sure they can work together in their current form; a dedicated 'loop dropbox files' operator would need to be written.

    However, all is not lost!
    Dropbox has a REST API, which means you can use the GetPage operator to download a list of the files in the directory you are targeting, and then use LoopExamples or LoopValues with the ReadDropBox operator nested inside, passing it the filename of each example as a macro.

    Sorry, I can't provide a working sample process for using GetPage in this way, as Dropbox.com is blocked where I am. Hopefully you can get the basic idea from the process below.
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.4.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="6.4.000" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="web:get_webpage" compatibility="5.3.002" expanded="true" height="60" name="Get Page" width="90" x="45" y="75">
            <parameter key="url" value="dropboxapi"/>
            <list key="query_parameters"/>
            <list key="request_properties">
              <parameter key="auth" value="some authentication string"/>
              <parameter key="directory" value="I guess you feed it the directory info"/>
            </list>
            <description align="center" color="transparent" colored="false" width="126">You'd need a layout similar to this.</description>
          </operator>
          <operator activated="true" class="text:json_to_data" compatibility="6.4.000" expanded="true" height="76" name="JSON To Data" width="90" x="179" y="255">
            <description align="center" color="transparent" colored="false" width="126">The data is probably returned in JSON. If it's returned in XML use the ReadXML operator.</description>
          </operator>
          <operator activated="true" class="loop_values" compatibility="6.4.000" expanded="true" height="76" name="Loop Values" width="90" x="313" y="75">
            <parameter key="attribute" value="filename"/>
            <parameter key="iteration_macro" value="loop_value"/>
            <process expanded="true">
              <operator activated="true" class="cloud_connectivity:read_dropbox" compatibility="6.3.000" expanded="true" height="60" name="Read Dropbox" width="90" x="246" y="120">
                <parameter key="connection" value="your_dropbox_connection"/>
                <parameter key="file" value="%{loop_value}"/>
                <description align="center" color="transparent" colored="false" width="126">Here is the read dropbox operator.</description>
              </operator>
              <connect from_op="Read Dropbox" from_port="file" to_port="out 1"/>
              <portSpacing port="source_example set" spacing="0"/>
              <portSpacing port="sink_out 1" spacing="0"/>
              <portSpacing port="sink_out 2" spacing="0"/>
              <description align="center" color="yellow" colored="false" height="107" resized="false" width="180" x="10" y="10">No need to connect the examples to the Dropbox operator; you have the filename of each example in the &quot;loop_value&quot; macro.</description>
            </process>
            <description align="center" color="transparent" colored="false" width="126">Use LoopValues to provide the inner operators with the Filename as a macro.</description>
          </operator>
          <connect from_op="Get Page" from_port="output" to_op="JSON To Data" to_port="documents 1"/>
          <connect from_op="JSON To Data" from_port="example set" to_op="Loop Values" to_port="example set"/>
          <connect from_op="Loop Values" from_port="out 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
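
For readers who want to see the list-then-loop idea outside RapidMiner, here is a minimal sketch in Python. It is not part of the process above. It assumes the Dropbox HTTP API v2 `files/list_folder` endpoint and an OAuth access token you already hold. The function names are made up for this example, and pagination (`has_more`) is ignored.

```python
import json
import urllib.request

# Dropbox HTTP API v2 endpoint for listing a folder (POST with a JSON body).
LIST_FOLDER_URL = "https://api.dropboxapi.com/2/files/list_folder"


def build_list_folder_request(access_token, folder_path):
    """Build (but do not send) the request that lists one Dropbox folder."""
    body = json.dumps({"path": folder_path}).encode("utf-8")
    return urllib.request.Request(
        LIST_FOLDER_URL,
        data=body,
        headers={
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def csv_paths(listing):
    """Return the display paths of all .csv files in a parsed list_folder response."""
    return [
        entry.get("path_display")
        for entry in listing.get("entries", [])
        if entry.get(".tag") == "file" and entry.get("name", "").lower().endswith(".csv")
    ]


def list_csv_files(access_token, folder_path):
    """Send the request and return the CSV paths (needs network access and a valid token)."""
    request = build_list_folder_request(access_token, folder_path)
    with urllib.request.urlopen(request) as response:
        return csv_paths(json.load(response))
```

Each path returned by `csv_paths` plays the role of one `loop_value` in the process: it is the filename you would hand to the Read Dropbox operator on each iteration.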
  • IljaDeCoster Member Posts: 7 Contributor II
    Hi, thanks for this reply.

    I'll have a look at it soon.

    I'll keep you posted ;-)

    Greetings,
    Ilja
  • IljaDeCoster Member Posts: 7 Contributor II
    Hi again,

    Thanks for the input again.

    It does indeed seem possible to solve my problem this way. I did an initial trial, but ran into some issues finding the right API URLs.

    As I'm not a programmer (I use RapidMiner because it is code-free :-), I quickly hit my limits in coding and in finding my way through the Dropbox API documentation.

    I'll play with it a little more, but I can't put too much time into it.

    So I would really suggest that my friends at RapidMiner add a code-free operator to do loop_files with the Dropbox connector :-)

    Greetings
    Ilja
  • JEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 578 Unicorn
    If you're able to send me a PM with the API documentation that you're having trouble with I might be able to resolve it. 
    I used to use Twitter connections a lot (before there was the code-free option) and I'm pretty sure this should be similar. 
  • IljaDeCoster Member Posts: 7 Contributor II
    Thanks JEdward,

    That would be really helpful if you would like to do this.

    The dropbox api documentation is at: https://www.dropbox.com/developers/core

    Apart from that, to use the Dropbox API you need an access token created via OAuth. It seems easy from what I read, but I have never done it before and it may take some time.
    https://www.dropbox.com/developers/reference/oauthguide
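
For orientation only, the two OAuth steps can be sketched in Python as follows. This assumes Dropbox's OAuth 2 authorization-code flow without a redirect URI, with the app key and secret taken from your own Dropbox app. The helper names are hypothetical, and nothing here is sent over the network.

```python
import urllib.parse

# OAuth 2 endpoints as documented by Dropbox (assumed current).
AUTHORIZE_URL = "https://www.dropbox.com/oauth2/authorize"
TOKEN_URL = "https://api.dropboxapi.com/oauth2/token"


def authorization_url(app_key):
    """Step 1: URL to open in a browser; after approval Dropbox shows an authorization code."""
    query = urllib.parse.urlencode({"client_id": app_key, "response_type": "code"})
    return AUTHORIZE_URL + "?" + query


def token_request_body(app_key, app_secret, code):
    """Step 2: form-encoded body to POST to TOKEN_URL to exchange the code for an access token."""
    return urllib.parse.urlencode({
        "code": code,
        "grant_type": "authorization_code",
        "client_id": app_key,
        "client_secret": app_secret,
    }).encode("ascii")
```

The JSON reply to the second step carries an `access_token` field, which is the value that goes into the `Authorization: Bearer ...` header of later API calls.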

    By the way: I have now found a code-free variant. Instead of using CSV files in my Dropbox with loop_files, I now push those CSVs into the repository as datasets and then use loop_repository.
    It works for what I'm doing now, but I'll still need the Dropbox solution as it offers more possibilities.

    Thanks again for your help.
    Ilja