About the parameter "local_random_seed" in XValidation Operator

liga Member Posts: 4 Contributor I

   RM is a fantastic tool for data mining research and applications. Thanks for your good work. I have a problem with the 04_XValidation_Nominal.xml sample. In theory, when the "sampling_type" parameter is set to "stratified sampling" and "local_random_seed" is set to -1, the results should differ slightly from run to run, since the training and test samples in each validation fold are different each time. In my tests, however, the result does not change. I tried several other example sources and the result still does not change. Could anybody tell me what the problem is? Thanks again.


  • steffen Member Posts: 347 Maven
    Hello liga

    Setting "local_random_seed" to -1 means: use the global random seed. The global random seed is initialized to the same value every time you start the process. This is necessary because otherwise you would not be able to reproduce your results. However, running XValidation twice within the SAME process produces different results.

    See this setup (simply copy and paste it into the XML tab):
    <operator name="Root" class="Process" expanded="yes">
        <description text="#ylt#p#ygt#This experiment is very similar to the experiment #yquot#03_XValidation_Numerical.xml#yquot#. The basic experiment setup is exactly the same, i.e. the first inner operator must produce a model from the given training data set and the second inner operator must be able to handle this model and the test data and must provide a PerformanceVector. #ylt#/p#ygt# In contrast to the previous experiment we now use a classification learner (J48) which is evaluated by several nominal performance criteria.#ylt#/p#ygt#  #ylt#p#ygt# The cross validation building block is very common for many (more complex) RapidMiner experiments. However, there are several more validation schemes available in RapidMiner which will be dicussed in the next sample experiments. #ylt#/p#ygt#"/>
        <operator name="ExampleSource" class="ExampleSource">
            <parameter key="attributes" value="..\data\labor-negotiations.aml"/>
        </operator>
        <operator name="MissingValueReplenishment" class="MissingValueReplenishment">
            <list key="columns">
            </list>
        </operator>
        <!-- Runs the cross validation twice within the same process -->
        <operator name="IteratingOperatorChain" class="IteratingOperatorChain" expanded="yes">
            <parameter key="iterations" value="2"/>
            <operator name="XValidation" class="XValidation" expanded="yes">
                <parameter key="keep_example_set" value="true"/>
                <parameter key="number_of_validations" value="5"/>
                <operator name="NearestNeighbors" class="NearestNeighbors">
                </operator>
                <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                    <operator name="ModelApplier" class="ModelApplier">
                        <list key="application_parameters">
                        </list>
                    </operator>
                    <operator name="ClassificationPerformance" class="ClassificationPerformance">
                        <list key="class_weights">
                        </list>
                        <parameter key="classification_error" value="true"/>
                    </operator>
                </operator>
            </operator>
        </operator>
        <!-- Drops the example set so that only the two performance results remain -->
        <operator name="IOConsumer" class="IOConsumer">
            <parameter key="io_object" value="ExampleSet"/>
        </operator>
    </operator>
    Hope this was helpful.

  • liga Member Posts: 4 Contributor I
    Hi, steffen

        Thanks for your instant reply. Your solution did help.
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder

    Just an additional side note: you could also set the global random seed of the root operator to -1, which means that a different seed is used for every new run.
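
    A rough sketch of how that might look in the XML tab, assuming the global seed parameter of the root operator is keyed "random_seed" (the key name is an assumption; please check the parameter list of the root operator in your version):

    ```xml
    <operator name="Root" class="Process" expanded="yes">
        <!-- Assumed parameter key: -1 asks for a new global seed on every run -->
        <parameter key="random_seed" value="-1"/>
        <!-- The inner operators of the process stay unchanged -->
    </operator>
    ```

    With this setting, results are no longer reproducible between runs, so it is probably best used only when run-to-run variation is what you want to observe.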

  • liga Member Posts: 4 Contributor I
    Hi, Ingo,

      Thank you for another solution to my problem. This one also clears up all my confusion.

