Performance: Loop Values vs Loop Examples

CharlieFirpo Member Posts: 48 Contributor II
edited July 2019 in Help
Dear All!

I have to process an ExampleSet that has 100,000 examples. To do this, I use a Loop operator and process the examples inside it: the first iteration handles the first example, the second iteration the second example, and so on. To pick the right example in each iteration, I use a Filter Examples operator inside the loop, with the loop operator's macro in the filter condition.
If I use Loop Values, the macro holds a value, so the Filter Examples condition compares text values (about 20-50 characters long). If I use Loop Examples, the loop macro holds an index, so the Filter Examples comparison is between integers (of course, I need an ID attribute for this).

So I think Loop Examples is more efficient. Am I right? Are there any tests, manuals, or tutorials that show whether there is a performance difference between the Loop Values and Loop Examples operators?
Of course I can run some tests myself, but an official reference would be appreciated!

Thank you!!!

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    If you want to loop over every single example, use Loop Examples in combination with a Filter Example Range operator (not Filter Examples). That operator is even faster since it does not compare anything; it just extracts examples based on their position in the data, so it does not even need an ID.

    Loop Values, on the other hand, should be used if you want to loop over different values of an attribute, e.g. over the different classes stored in a label attribute.

    Best regards,
    Marius
  • CharlieFirpo Member Posts: 48 Contributor II
    Thank you!
    Where should I use Filter Example Range? Within Loop Examples?

    Within the loop I have several operators, and all of them are needed in every iteration. The loop's input is an ExampleSet that has 100,000 examples (rows), and I have to process them one by one, so each iteration should process only one example. With Filter Example Range, how can I select which example is processed in a given iteration?
    If I don't use Filter Examples in the loop, then all examples of the ExampleSet are processed in every iteration, and because I have 100,000 examples, I will have 100,000 iterations.
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi,

    Yes, place it into Loop Examples, and for both values (the first and the last example of the range) enter %{example} (or whatever you named the iteration_macro of Loop Examples).

    Best regards,
    Marius
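
To illustrate the difference discussed in this thread, here is a rough Python analogy (not RapidMiner code). The names rows, filter_by_id, and filter_by_range are hypothetical, and the sketch assumes 1-based positions and an iteration counter that starts at 1. It only shows why extracting by position avoids the per-row comparisons that a condition-based filter performs in every iteration; it is not a benchmark of the actual operators.

```python
# Illustrative analogy only; this is not how RapidMiner is implemented.
# All names here (rows, filter_by_id, filter_by_range) are hypothetical.

def filter_by_id(rows, wanted_id):
    """Condition-based filter: tests the condition against every row."""
    return [row for row in rows if row["id"] == wanted_id]


def filter_by_range(rows, first, last):
    """Position-based filter: takes rows first..last (1-based, inclusive)
    without comparing any values, so no ID attribute is required."""
    return rows[first - 1:last]


# A small stand-in for the ExampleSet (the thread talks about 100,000 rows).
rows = [{"id": i, "text": f"value-{i}"} for i in range(1, 1001)]

# `example` plays the role of the loop's iteration macro, e.g. %{example}.
for example in range(1, len(rows) + 1):
    by_id = filter_by_id(rows, example)                 # scans all rows each time
    by_range = filter_by_range(rows, example, example)  # direct positional access
    assert by_id == by_range                            # same single row either way
```

With n rows, the condition-based variant performs roughly n comparisons in each of the n iterations, while the positional variant does a constant amount of work per iteration. Passing the same value for both bounds selects exactly one row, which mirrors entering the iteration macro for both values as Marius suggests.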