Process multiple dataset files

mksaad Member Posts: 42  Guru
edited November 2018 in Help
Hello,

I need to apply a decision tree to 24 different dataset files. Can I use a loop to achieve that?

Thanks,
--Motaz

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,525   Unicorn
    Hi Motaz,
    Yes, of course you can. Take a look at the Loop Files operator. On each iteration it stores the name of the current file or subdirectory in a macro that you can use for loading the file. If you name the macro fileName, then enter %{fileName} as the filename parameter of the reader; the macro is replaced by the actual filename when the process runs.

    Greetings,
      Sebastian
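
    As a rough illustration of the macro mechanism described above, the fragment below names the macro on Loop Files and logs its value on every iteration. It is an untested sketch: the parameter key file_name_macro and the directory C:\data are assumptions, while Provide Macro as Log Value and Log are wired the same way as in the process haddock posts further down this thread.

    ```xml
    <operator activated="true" class="loop_files" expanded="true" name="Loop Files">
      <!-- assumed example directory -->
      <parameter key="directory" value="C:\data"/>
      <!-- assumed parameter key: names the macro that receives the current file name -->
      <parameter key="file_name_macro" value="fileName"/>
      <process expanded="true">
        <!-- exposes the current value of the macro so that Log can record it -->
        <operator activated="true" class="provide_macro_as_log_value" expanded="true" name="Provide Macro as Log Value">
          <parameter key="macro_name" value="fileName"/>
        </operator>
        <operator activated="true" class="log" expanded="true" name="Log">
          <list key="log">
            <parameter key="current_file" value="operator.Provide Macro as Log Value.value.macro_value"/>
          </list>
        </operator>
        <connect from_op="Provide Macro as Log Value" from_port="through 1" to_op="Log" to_port="through 1"/>
      </process>
    </operator>
    ```

    Any parameter inside the loop can then refer to the current value as %{fileName}.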
  • mksaad Member Posts: 42  Guru
    Hi Sebastian,

    I would appreciate it if you could give us an example process for that.


    My process does not work  :-\
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input>
          <location/>
        </input>
        <output>
          <location/>
        </output>
        <macros/>
      </context>
      <operator activated="true" class="process" expanded="true" name="Process">
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true" height="386" width="614">
          <operator activated="true" class="loop_files" expanded="true" height="60" name="Loop Files" width="90" x="112" y="75">
            <parameter key="directory" value="C:\Program Files (x86)\Weka-3-7\data"/>
            <process expanded="true" height="416" width="746">
              <operator activated="true" class="read_arff" expanded="true" height="60" name="Read ARFF" width="90" x="45" y="75">
                <parameter key="data_file" value="%{file_name}"/>
              </operator>
              <portSpacing port="source_in 1" spacing="0"/>
            </process>
          </operator>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
        </process>
      </operator>
    </process>
    Thanks,
    --Motaz
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,525   Unicorn
    Hi Motaz,
    What exactly does not work? It seems to me that your process is rather empty. Did you set a breakpoint after the Read ARFF operator to check whether it loads the data?

    Greetings,
      Sebastian
  • mksaad Member Posts: 42  Guru
    Hi Sebastian,

    I have not completed the process yet, but I found the error below in the inner Read ARFF operator:
    Error transforming meta data transformation: java.lang.NullPointerException
    So basically I could not load the files.

    Below is my process.
    Thanks for your help,
    --Motaz
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input>
          <location/>
        </input>
        <output>
          <location/>
        </output>
        <macros/>
      </context>
      <operator activated="true" class="process" expanded="true" name="Process">
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true" height="416" width="746">
          <operator activated="true" class="loop_files" expanded="true" height="60" name="Loop Files" width="90" x="112" y="75">
            <parameter key="directory" value="C:\Program Files (x86)\Weka-3-7\data"/>
            <process expanded="true" height="416" width="746">
              <operator activated="true" class="read_arff" expanded="true" height="60" name="Read ARFF" width="90" x="45" y="75">
                <parameter key="data_file" value="%{file_name}"/>
              </operator>
              <operator activated="true" class="decision_tree" expanded="true" height="76" name="Decision Tree" width="90" x="220" y="70"/>
              <connect from_op="Read ARFF" from_port="output" to_op="Decision Tree" to_port="training set"/>
              <portSpacing port="source_in 1" spacing="0"/>
            </process>
          </operator>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
        </process>
      </operator>
    </process>
  • haddock Member Posts: 849  Guru
    Hi There Motaz,

    I think the problem is that you are using the file_name macro rather than the file_path macro. The following process, which loops through some parameter files, works fine with file_path but fails if I use file_name instead.
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input>
          <location/>
        </input>
        <output>
          <location/>
        </output>
        <macros/>
      </context>
      <operator activated="true" class="process" expanded="true" name="Process">
        <process expanded="true" height="296" width="915">
          <operator activated="true" class="loop_files" expanded="true" height="60" name="Loop Files" width="90" x="112" y="75">
            <parameter key="directory" value="C:\Documents and Settings\Alien\My Documents\rm_workspace"/>
            <parameter key="filter" value="SVM.*8\.par"/>
            <process expanded="true" height="296" width="915">
              <operator activated="true" class="provide_macro_as_log_value" expanded="true" height="76" name="Provide Macro as Log Value" width="90" x="45" y="75">
                <parameter key="macro_name" value="file_path"/>
              </operator>
              <operator activated="true" class="log" expanded="true" height="76" name="Log" width="90" x="180" y="30">
                <list key="log">
                  <parameter key="file_name" value="operator.Provide Macro as Log Value.value.macro_value"/>
                </list>
              </operator>
              <operator activated="true" class="read_parameters" expanded="true" height="60" name="Read Parameters" width="90" x="313" y="165">
                <parameter key="parameter_file" value="%{file_path}"/>
              </operator>
              <connect from_op="Provide Macro as Log Value" from_port="through 1" to_op="Log" to_port="through 1"/>
              <portSpacing port="source_in 1" spacing="0"/>
            </process>
          </operator>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
        </process>
      </operator>
    </process>
    Onward through the fog ...
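
    Applied to Motaz's process, that suggestion would look roughly like the sketch below. The data_file value changes from %{file_name} to %{file_path}; the filter value .*\.arff is an added assumption to skip non-ARFF files in the directory. This is untested, and the Decision Tree will still need an attribute with the label role, which this sketch does not set.

    ```xml
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <operator activated="true" class="process" expanded="true" name="Process">
        <process expanded="true" height="416" width="746">
          <operator activated="true" class="loop_files" expanded="true" height="60" name="Loop Files" width="90" x="112" y="75">
            <parameter key="directory" value="C:\Program Files (x86)\Weka-3-7\data"/>
            <!-- assumption: restrict the loop to files ending in .arff -->
            <parameter key="filter" value=".*\.arff"/>
            <process expanded="true" height="416" width="746">
              <operator activated="true" class="read_arff" expanded="true" height="60" name="Read ARFF" width="90" x="45" y="75">
                <!-- file_path instead of file_name, as suggested above -->
                <parameter key="data_file" value="%{file_path}"/>
              </operator>
              <operator activated="true" class="decision_tree" expanded="true" height="76" name="Decision Tree" width="90" x="220" y="70"/>
              <connect from_op="Read ARFF" from_port="output" to_op="Decision Tree" to_port="training set"/>
              <portSpacing port="source_in 1" spacing="0"/>
            </process>
          </operator>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
        </process>
      </operator>
    </process>
    ```

    As in the original, the Decision Tree output is not connected to anything, so the models are discarded after each iteration; the sketch only addresses loading the files.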

  • ripkars Member Posts: 4 Contributor I
    I'm experiencing the same problem.
    It seems I can't correctly read the file_path macro value inside the Loop Files operator.
    Any help?
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,525   Unicorn
    Hi,
    What's your exact problem in reading the macro?

    Greetings,
      Sebastian
  • mksaad Member Posts: 42  Guru
    Hi Sebastian,


    I also tried to make it work, but I failed. The problem is that there is no sample process for this. I think providing a sample process would simplify our lives  :)

    Greetings,
    --Motaz
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,643  RM Founder
    Hi,

    "I think providing a sample process would simplify our lives"
    Could be. But RM, together with the extensions I know well enough to estimate the number of operators they offer, provides about 1000 different operators which can be combined in almost every way you can imagine. Not to mention the usage of macros, different parameter settings etc. In order to capture all the side effects of the more or less complex processes you can build out of those building blocks, we would have to provide (hundreds of) thousands of examples. A lot of work, don't you think? And think of this: we already offer many samples, processes, videos, documentation and so on. But no matter how much we offer, there will always be somebody who is missing a sample for his or her concrete problem.

    But now you guys come into the game:
    • Install the RM Community Extension!
    • Create processes and samples!
    • Upload them again and share them with others!
    You got the idea? More information about the Community Extension can be found in the signature of my posts.

    All the best,
    Ingo

    P.S. (and beginning of advertisement at the same time): And now for all people who are still reading this and think "Yeah, but I need a solution for my problem NOW and not after somebody incidentally uploaded a sample process which might give me an idea of how to proceed...": you should really stop lamenting and become a customer of the RM Enterprise Edition and let the Rapid-I pros help you. End of advertisement.
  • haddock Member Posts: 849  Guru

    Greets, O most Pointy!!

    Part of the problem is that you guys are so polite; no matter how lazy, ignorant, rude, or aggressive the questioner you reply courteously and informatively. Posting code won't stop such people, they don't even read the help tab! Get the help finished, and we can all pitch in with *TF*  ;D
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,643  RM Founder
    Hey Captain,

    I already imagine how we will sing aloud those four letters to the tunes of YMCA and perform some weird dancing moves (especially for the letter "M")  8)

    But you are of course right: sometimes we definitely are too polite and helpful. From my point of view, there is nothing wrong with politeness, but we will definitely not be able to keep up this level of support for free here in the forum. If community members help each other: that's fine and we are happy to provide the infrastructure. So we still see our efforts here as an attempt to build a stronger community which in the future does not have to rely on Rapid-I employees for every single and sometimes rather small question.

    By the way: we have already finished optimizing the help texts of the most frequently used operators (thanks to the people who sent us their operator usage statistics!). In terms of usage, the help texts are actually finished for more than 80% of all used operators counted over all processes. The thing is that this of course still does not help those people who are still searching for the operator to use - so we have some more work to finish  :D

    Cheers,
    Ingo
  • haddock Member Posts: 849  Guru

    I've suggested before that there should be some fee-based privileges for those that don't want to be seen as freeloaders, but don't need the full deal of a commercial license. I'd pay.

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,643  RM Founder
    I know - but there are not many people as noble as you. Do you want to know how many requests we get every week asking for a 100% "discount" on the white paper, which costs only 40 Euro? You wouldn't believe it...

    However, in addition to our core business, which concentrates on full service subscriptions for our key accounts, we will also add those smaller fee-based privileges in the near future. Right now we offer those privileges only to customers of the full Enterprise Edition. They currently encompass access to additional documentation, application white papers, and process packages. We are always open to additional ideas ;)

    One of the more exciting announcements I can make here is a new marketplace for Rapid-X products and extensions which will be started soon. Here developers can offer their own extensions, operators, scripts, or processes and Rapid-I will provide all the technical and commercial infrastructure for the necessary transactions.

    Cheers,
    Ingo
  • haddock Member Posts: 849  Guru
    Ingoids!,

    That mind reading software of yours works amazingly well over TCP, I need more details of the following yesterday....
    "One of the more exciting announcements I can make here is a new marketplace for Rapid-X products and extensions which will be started soon. Here developers can offer their own extensions, operators, scripts, or processes and Rapid-I will provide all the technical and commercial infrastructure for the necessary transactions."
    On the loot front, it is often said that if you pay peanuts you get monkeys; it can also be observed that monkeys rarely pay even peanuts..

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,643  RM Founder

    "That mind reading software of yours works amazingly well over TCP, I need more details of the following yesterday...."
    Yes, it's amazing, isn't it?  ;D

    There is not much I can tell you right now (and I of course want to generate some hype first  ;) ). We are currently sorting out some legal issues and finalizing the technical aspects in parallel.

    Personally I think that such a marketplace is a great opportunity for those who work a lot with RapidMiner and want to introduce their results / development / ... to a greater audience. Everything is available in a central place and can be easily accessed. In addition, Rapid-I is providing all the marketing around those products and you can all benefit from that. That's all the information for now, but we are going to write some blog posts in the near future to keep you updated on what to expect. So stay tuned, fire up your favorite IDE, and create some cool new extensions which can then be offered on our marketplace.

    Cheers,
    Ingo
  • haddock Member Posts: 849  Guru
    Excellent, excellent.

    PS. Well done on the footie, our lot were s**T, and were well beaten!
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,643  RM Founder
    Thanks! Things might have been different at 2:2 - but looking at the game as a whole, I think the end result was more or less justified. At least I liked this match the most of all the matches I have seen during this World Cup - and I am not sure whether anything better is coming...