Loop Files

dragoljub Member Posts: 241  Maven
edited June 2019 in Help
Hey Guys,

I would like to use Loop Files to loop over a bunch of tab-delimited files and append them all together. It seems like this should be easily achievable with Remember and Recall, but I can't figure out how to set it up so that the first file is not appended more than once. Maybe I should just cat the files; however, I like that RM checks the attributes prior to appending.

Basically, since only one file is accessed during each loop iteration, I somehow need to start appending in the second iteration.

Answers

  • haddock Member Posts: 849  Maven
    Read one file and discard its examples so that only the attribute structure is left, then loop over all the files and append each one, like this:

    http://rapid-i.com/rapidforum/index.php/topic,2282.msg8989.html#msg8989
  • dragoljub Member Posts: 241  Maven
    Hey Haddock,

    I think I see what you mean; however, your link connects back to this page :P. What example did you have in mind?

    -Gagi
  • haddock Member Posts: 849  Maven
    Hi,

    Oops. Just changed it.
  • dragoljub Member Posts: 241  Maven
    As I understand it, the idea is to use a Branch or Select Subprocess operator to detect when the first file is being read.

    The problem I am running into is extracting the Loop Files iteration value. If I could extract apply_count or loop_count from the Loop Files operator, I could branch on the first iteration to remember the first file, and then in the following iterations append and remember the next files.

    How can I extract the loop iteration from loop_files? I feel like I have tried every combination of Extract Macro operators.

    -Gagi
  • haddock Member Posts: 849  Maven
    Hi there,

    Try this...

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.0.0" expanded="true" name="Process">
        <process expanded="true" height="632" width="1044">
          <operator activated="true" class="loop_files" compatibility="5.0.0" expanded="true" height="76" name="Loop Files" width="90" x="108" y="92">
            <parameter key="directory" value="C:\Documents and Settings\Alien\My Documents\rm_workspace"/>
            <parameter key="filter" value=".*csv"/>
            <parameter key="iterate_over_subdirs" value="true"/>
            <process expanded="true" height="296" width="705">
              <operator activated="true" class="set_macro" compatibility="5.0.8" expanded="true" height="76" name="Set Macro" width="90" x="26" y="28">
                <parameter key="macro" value="iteration"/>
                <parameter key="value" value="%{a}"/>
              </operator>
              <operator activated="true" class="provide_macro_as_log_value" compatibility="5.0.8" expanded="true" height="76" name="Provide Macro as Log Value" width="90" x="179" y="30">
                <parameter key="macro_name" value="file_path"/>
              </operator>
              <operator activated="true" class="provide_macro_as_log_value" compatibility="5.0.8" expanded="true" height="76" name="Provide Macro as Log Value (2)" width="90" x="313" y="30">
                <parameter key="macro_name" value="file_name"/>
              </operator>
              <operator activated="true" class="provide_macro_as_log_value" compatibility="5.0.8" expanded="true" height="76" name="Provide Macro as Log Value (3)" width="90" x="447" y="30">
                <parameter key="macro_name" value="iteration"/>
              </operator>
              <operator activated="true" class="log" compatibility="5.0.8" expanded="true" height="76" name="Log" width="90" x="585" y="30">
                <list key="log">
                  <parameter key="path" value="operator.Provide Macro as Log Value.value.macro_value"/>
                  <parameter key="name" value="operator.Provide Macro as Log Value (2).value.macro_value"/>
                  <parameter key="iteration" value="operator.Provide Macro as Log Value (3).value.macro_value"/>
                </list>
              </operator>
              <connect from_port="in 1" to_op="Set Macro" to_port="through 1"/>
              <connect from_op="Set Macro" from_port="through 1" to_op="Provide Macro as Log Value" to_port="through 1"/>
              <connect from_op="Provide Macro as Log Value" from_port="through 1" to_op="Provide Macro as Log Value (2)" to_port="through 1"/>
              <connect from_op="Provide Macro as Log Value (2)" from_port="through 1" to_op="Provide Macro as Log Value (3)" to_port="through 1"/>
              <connect from_op="Provide Macro as Log Value (3)" from_port="through 1" to_op="Log" to_port="through 1"/>
              <portSpacing port="source_in 1" spacing="0"/>
              <portSpacing port="source_in 2" spacing="0"/>
            </process>
          </operator>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
        </process>
      </operator>
    </process>
  • dragoljub Member Posts: 241  Maven
    I knew I had seen that apply count macro somewhere before. Turns out it's on page 50 of the RM 4.5 manual!

    Thanks again! I solved the problem by making my own count macro and incrementing it in the loop :) Your process is just a bit less cumbersome.

    I did notice that it would not iterate through my files without another operator, such as Read CSV, inside Loop Files. I wonder why it does not log anything with the process as it is?

    Thanks,
    -Gagi
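
For readers who want to see the logic outside RapidMiner, the two approaches discussed in this thread can be sketched in Python. This is only an illustration under stated assumptions, not a RapidMiner process: the function names (`read_tsv`, `append_with_empty_seed`, `append_with_counter`) are hypothetical, the files are assumed to be tab-delimited with a header row, and the "attribute check" is simplified to comparing header rows.

```python
import csv


def read_tsv(path):
    """Return (header, rows) for one tab-delimited file with a header row."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        return header, list(reader)


def append_with_empty_seed(paths):
    """Keep only the first file's header as an empty seed, then append every file."""
    paths = list(paths)
    if not paths:
        return [], []
    header, _ = read_tsv(paths[0])  # read one file, discard its rows
    combined = []
    for path in paths:  # the first file is appended here, exactly once
        file_header, rows = read_tsv(path)
        if file_header != header:  # simplified attribute check before appending
            raise ValueError(f"{path}: attributes differ from {paths[0]}")
        combined.extend(rows)
    return header, combined


def append_with_counter(paths):
    """Maintain an iteration counter by hand and branch on the first iteration."""
    header, combined = [], []
    iteration = 0
    for path in paths:
        iteration += 1  # hand-made counter, incremented inside the loop
        file_header, rows = read_tsv(path)
        if iteration == 1:
            # First iteration: just remember the first file.
            header, combined = file_header, rows
        else:
            # Later iterations: check attributes, then append.
            if file_header != header:
                raise ValueError(f"{path}: attributes differ from the first file")
            combined.extend(rows)
    return header, combined
```

Both functions return the same header and rows for the same list of files. The empty-seed version avoids the special case for the first iteration at the cost of reading the first file twice.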