Retrieve ZIP from URL

kavuch Member Posts: 6 Contributor I
edited November 2018 in Help
Is it possible to import a ZIP file from a URL?
There is the Loop Zip-File Entries operator from the Web Mining package, but it seems to require a file from a local directory.
Can I store a ZIP from a URL on my local machine and process it afterwards?

Answers

  • Marco_Boeck Team Lead Software Engineering Moderator, Employee, Member, University Professor Posts: 1,806   RM Engineering
    Hi,

    Yes, you can :) See the example process below:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.6.000-SNAPSHOT">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="6.6.000-SNAPSHOT" expanded="true" name="Process">
       <process expanded="true">
         <operator activated="true" class="open_file" compatibility="6.6.000-SNAPSHOT" expanded="true" height="68" name="Open File" width="90" x="45" y="34">
           <parameter key="resource_type" value="URL"/>
           <parameter key="url" value="http://URL_TO_ZIP"/>
         </operator>
         <operator activated="true" class="loop_zipfile_entries" compatibility="6.6.000-SNAPSHOT" expanded="true" height="82" name="Loop Zip-File Entries" width="90" x="246" y="34">
           <process expanded="true">
             <operator activated="true" class="write_file" compatibility="6.6.000-SNAPSHOT" expanded="true" height="68" name="Write File" width="90" x="45" y="34">
               <parameter key="filename" value="C:\Users\USER_NAME\Desktop\Test\%{file_name}"/>
             </operator>
             <connect from_port="file object" to_op="Write File" to_port="file"/>
             <connect from_op="Write File" from_port="file" to_port="out 1"/>
             <portSpacing port="source_file object" spacing="0"/>
             <portSpacing port="source_in 1" spacing="0"/>
             <portSpacing port="sink_out 1" spacing="0"/>
             <portSpacing port="sink_out 2" spacing="0"/>
           </process>
         </operator>
         <connect from_op="Open File" from_port="file" to_op="Loop Zip-File Entries" to_port="file"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
       </process>
     </operator>
    </process>
    Regards,
    Marco
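
For readers who want to see the same three steps outside RapidMiner, here is a minimal Python sketch, not part of Marco's process. It assumes only the standard library; the function names `extract_zip_entries` and `fetch_and_extract` are hypothetical, chosen for illustration. It mirrors the operator chain above: open the URL (Open File), iterate over the archive entries (Loop Zip-File Entries), and write each entry to a local directory (Write File).

```python
import io
import urllib.request
import zipfile
from pathlib import Path
from typing import List


def extract_zip_entries(zip_bytes: bytes, target_dir: str) -> List[str]:
    """Write every file entry of an in-memory ZIP archive into target_dir.

    Hypothetical helper: plays the role of Loop Zip-File Entries + Write File.
    Returns the entry names that were written, in archive order.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries carry no data of their own
            archive.extract(info, str(target))
            written.append(info.filename)
    return written


def fetch_and_extract(url: str, target_dir: str) -> List[str]:
    """Download the ZIP at url into memory, then extract it.

    Hypothetical helper: plays the role of Open File with resource_type=URL.
    """
    with urllib.request.urlopen(url) as response:
        return extract_zip_entries(response.read(), target_dir)
```

A call such as `fetch_and_extract("http://URL_TO_ZIP", "Test")` would use the same placeholder URL as the process above; the archive is held in memory, so this sketch suits archives that fit comfortably in RAM.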
  • kc8g15 Member Posts: 1 Contributor I

    How would you do this using operators, without coding?

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,132  RM Data Scientist

    Hi!

    Did you have a look at Marco's process? It uses only operators (Open File, Loop Zip-File Entries, and Write File), so in my view it already solves this.

    ~martin

    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany