Iterating through example subsets

darkobodnaruk Member Posts: 4 Contributor I
edited November 2018 in Help
Hi, is there a way to create a process to do the following:

- start with a dataset of 5,000 examples
- iterate through subsets of 1,000 examples (sequentially: examples 0-999, then 1000-1999, and so on) and run the same classification algorithm on each subset
- write the results and performance of each subset to a file
- (if possible, average performance over all subsets)

I know about ExampleRangeFilter. I'm guessing the solution involves macros, which let you define variable parameters, but I don't know how to set up a loop/iteration.

I'm experimenting with ParameterIteration now, but if I vary two parameters of ExampleRangeFilter, first_example (0, 1000, 2000...) and last_example (999, 1999...), every combination of the two is run, so I get 5x5=25 iterations instead of only 5.
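The 25-versus-5 problem comes from treating the two bounds as independent, when last_example is fully determined by first_example. Here is a minimal sketch of that idea in plain Python (not RapidMiner; the function name `subset_ranges` is hypothetical and only for illustration). It derives both bounds from a single loop variable and contrasts that with the full grid:

```python
from itertools import product


def subset_ranges(n_examples, subset_size):
    """Yield inclusive (first, last) index pairs for consecutive subsets."""
    for first in range(0, n_examples, subset_size):
        # The upper bound follows from the lower one, so only one
        # variable is actually being iterated.
        last = min(first + subset_size, n_examples) - 1
        yield first, last


# One loop variable: one range per subset.
ranges = list(subset_ranges(5000, 1000))

# Two independently varied parameters: every combination is visited.
firsts = [0, 1000, 2000, 3000, 4000]
lasts = [999, 1999, 2999, 3999, 4999]
grid = list(product(firsts, lasts))

print(len(ranges), len(grid))
```

So the loop should run over one value only (the subset index or the first example), with the second bound computed from it.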

regards,
darko

Answers

  • haddock Member Posts: 849 Maven
    Hi,

    What you describe is essentially validation. For example, SlidingWindowValidation can be set up like this:
    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="target_function" value="random classification"/>
            <parameter key="number_examples" value="5000"/>
        </operator>
        <operator name="SlidingWindowValidation" class="SlidingWindowValidation" expanded="yes">
            <operator name="LibSVMLearner" class="LibSVMLearner">
                <list key="class_weights">
                </list>
            </operator>
            <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                <operator name="ModelApplier" class="ModelApplier">
                    <list key="application_parameters">
                    </list>
                </operator>
                <operator name="ClassificationPerformance" class="ClassificationPerformance" breakpoints="after">
                    <parameter key="accuracy" value="true"/>
                    <list key="class_weights">
                    </list>
                </operator>
            </operator>
        </operator>
    </operator>
    There are lots of examples under Help -> RapidMiner Tutorial.
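For readers who want the steps from the question spelled out, here is a rough sketch in plain Python rather than RapidMiner. It is not what SlidingWindowValidation does internally. It assumes randomly generated labels, a placeholder majority-class "classifier", and a train/test split at the midpoint of each subset, and the names `majority_label` and `evaluate_subsets` are hypothetical:

```python
import csv
import io
import random
from collections import Counter
from statistics import mean


def majority_label(labels):
    """Placeholder classifier: predict the most frequent training label."""
    return Counter(labels).most_common(1)[0][0]


def evaluate_subsets(labels, subset_size, out):
    """Evaluate each consecutive subset and write one CSV row per subset.

    Assumes every subset holds at least two examples, so that both the
    training half and the test half are non-empty.
    """
    writer = csv.writer(out)
    writer.writerow(["first_example", "last_example", "accuracy"])
    accuracies = []
    for first in range(0, len(labels), subset_size):
        chunk = labels[first:first + subset_size]
        split = len(chunk) // 2
        train, test = chunk[:split], chunk[split:]
        predicted = majority_label(train)
        accuracy = sum(1 for y in test if y == predicted) / len(test)
        writer.writerow([first, first + len(chunk) - 1, f"{accuracy:.4f}"])
        accuracies.append(accuracy)
    # Average performance over all subsets, written as a final row.
    average = mean(accuracies)
    writer.writerow(["average", "", f"{average:.4f}"])
    return accuracies, average


random.seed(0)
labels = [random.choice(["positive", "negative"]) for _ in range(5000)]
buffer = io.StringIO()  # stand-in for a real file opened with newline=""
accuracies, average = evaluate_subsets(labels, 1000, buffer)
print(buffer.getvalue())
```

Replacing `buffer` with a file opened for writing stores the per-subset results and the overall average on disk.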

  • darkobodnaruk Member Posts: 4 Contributor I
    Now that you put it like that, it IS validation. Not sure why I tried to make it more complicated. :)

    But I didn't know about SlidingWindowValidation before, thanks a lot!