setting macros within a loop

shoshan Member Posts: 3 Newbie
edited June 2019 in Help
Hi all,
I'm trying to extract macros within a loop and name them individually using the %{loop_value} macro.
I then need to create an additional macro based on those macros. I'm using the Extract Macro operator for all of them.
Whenever I run my process and break it during the loop, the macros are created correctly, but if I remove all breakpoints, the process fails and claims the loop macros were not created.
Can anyone help?

Answers

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Hi @shoshan,

    So that we can understand what you're trying to do, could you share your process and your data?

    Regards,

    Lionel
  • shoshan Member Posts: 3 Newbie
    I have a population with 3 possible values for a specific attribute: First, Second and Third. I need to find the size of each sub-population and set the smallest of them as a macro. I started by using the "Loop Values" operator on the attribute I need. In each loop iteration I filter the examples that equal the loop value and then try to create the macro Size_%{loop_value}.
    By the end of the loop I expect to have 3 macros: Size_First, Size_Second and Size_Third.
    I then attempt to create another macro:
    Min_Size = min(eval(%{Size_First}), eval(%{Size_Second}), eval(%{Size_Third}))

    And here's the problem - if I use a breakpoint within the loop, all 3 macros are formed correctly and eventually Min_Size macro is exactly as I expect it to be.
    Once I remove all breakpoints though, the process fails and the error message reads "Size_First" is unknown. I can see in the macros view that indeed only Size_Third was created.
    How can that be?!?
  • tftemme Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, RMResearcher, Member Posts: 164 RM Research
    Hi @shoshan.

    My first guess is that this is a problem with the parallel execution of the Loop Values operator. When you use breakpoints, the execution is forced to run sequentially, so the macros are created in a defined order. When you remove the breakpoints, the operator runs in parallel and the order of the iterations is not defined. Probably (by random effects) your third macro is created before your first one, so the process fails to access the "Size_First" macro. Deactivate the 'enable parallel execution' parameter (you may need to click 'show advanced parameters' if you do not see it). This should solve the issue. If not, please post the XML of your process (and ideally your data or a sample of it, if possible). Then it is much easier for all of us to have a look and find the issue.

    Another point: you may want to look at the Extract Statistics operator from the Operator Toolbox extension (install it via the Marketplace). It directly gives you the 'Least' occurring value of nominal attributes. If I understand your problem correctly, that is what you are after, right?

    Best regards, and I hope this helps,
    Fabian
  • shoshan Member Posts: 3 Newbie
    Hi @tftemme
    Thank you, that solved the problem.
    Regarding your assumption about the origin of the problem: why would the macros be overwritten when they are named differently?
    I create the final macro (Min_Size) outside the loop, so the order should not matter either.
    So the mystery remains :)
    Disabling parallel execution worked anyway.
    I also found a solution via the statistics option of Extract Macro: I just used the count option once per value, 3 times in total. I will download Extract Statistics from the Marketplace as well.
    Thanks again!
  • tftemme Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, RMResearcher, Member Posts: 164 RM Research
    Hi, 

    Imagine you are using a Loop operator in parallel with one macro (for example the iteration macro). When the first four iterations start in parallel (in four different threads), they cannot all share one general macro. Every thread needs its own copy of the macro (called iteration in that thread) with the proper value set. So the macros set within one iteration can only exist within that iteration. Only the macros set in the main thread of the process (which of course also performs some of the loop's iterations) can be accessed after the loop has finished.
    You could still use a parallel loop if you "initialize" the three macros with default values (e.g. 0) before the loop. This should also work.

    Oh yeah, I forgot that Extract Macro can also extract statistics of attributes. That is even easier than the Extract Statistics operator from the toolbox. Nevertheless, the Operator Toolbox has several operators that make life and work with RapidMiner easier. It is always worth downloading (OK, I shouldn't say anything else, because I implemented a bunch of them ;-)

    Best regards,
    Fabian
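
The thread-scoped macro behaviour described in the last answer can be illustrated with a small Python simulation. This is only a sketch of an assumed model, not RapidMiner's actual implementation: the data in VALUES and the helpers run_iteration, loop_values and min_size are hypothetical names made up for illustration. It assumes that each worker thread gets a private copy of the macros and that the main thread happens to run the last iteration, which would match the observation that only Size_Third survived. It does not model the "initialize the macros before the loop" workaround.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: one attribute value per example.
VALUES = ["First"] * 5 + ["Second"] * 3 + ["Third"] * 7


def run_iteration(loop_value, macros):
    """One loop iteration: count matching examples, set Size_<loop_value>."""
    macros[f"Size_{loop_value}"] = sum(1 for v in VALUES if v == loop_value)


def loop_values(process_macros, parallel):
    """Simulate Loop Values over the distinct attribute values."""
    distinct = sorted(set(VALUES))  # ['First', 'Second', 'Third']
    if not parallel:
        # Sequential run: every iteration writes to the process-level macros.
        for value in distinct:
            run_iteration(value, process_macros)
        return
    # Assumed model of a parallel run: workers write to a private copy that
    # is discarded; only the main thread writes to the process-level macros.
    *worker_values, main_value = distinct
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [
            pool.submit(run_iteration, value, dict(process_macros))
            for value in worker_values
        ]
        run_iteration(main_value, process_macros)
        for future in futures:
            future.result()  # re-raise any worker exception


def min_size(macros):
    """Equivalent of the Min_Size expression; fails if a macro is missing."""
    names = ("Size_First", "Size_Second", "Size_Third")
    return min(int(macros[name]) for name in names)


sequential = {}
loop_values(sequential, parallel=False)
print(sequential, min_size(sequential))

parallel = {}
loop_values(parallel, parallel=True)
print(parallel)  # {'Size_Third': 7}
```

In this model, calling min_size(parallel) raises a KeyError for Size_First, which mirrors the '"Size_First" is unknown' error from the original process, while the sequential run yields all three macros.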