Duplicate attributes in value series pre-processing

awchisholm, RapidMiner Certified Expert, Member
edited December 2019 in Help
I'm trying to learn how value series work, and I'm having trouble with one process: instead of producing the output I want, it fails with a "duplicate attributes" error.

The attached example shows the problem. It tries to find the index of the maximum frequency, and the amplitude at that index, for a number of windows within each example.

<operator name="Root" class="Process" expanded="yes">
   <parameter key="random_seed" value="-1"/>
   <operator name="GenerateSeriesIOObjects" class="OperatorChain" expanded="no">
       <operator name="Generate a sine wave SeriesIOObject" class="SinusGenerator">
           <parameter key="number_of_values" value="2000"/>
           <list key="frequency">
             <parameter key="101" value="1.0"/>
           </list>
       </operator>
       <operator name="Visualizer (2)" class="Visualizer" activated="no">
       </operator>
       <operator name="Convert into an ExampleSet" class="SeriesObject2ExampleSet">
       </operator>
       <operator name="Window into examples" class="MultivariateSeries2WindowExamples">
           <parameter key="window_size" value="1002"/>
       </operator>
       <operator name="Add a label" class="WindowExamples2ModelingData">
           <parameter key="label_name_stem" value="sinus_dim_1"/>
           <parameter key="relative_transformation" value="false"/>
       </operator>
       <operator name="Delete the Id" class="AttributeFilter">
           <parameter key="condition_class" value="attribute_name_filter"/>
           <parameter key="parameter_string" value="sinus_index"/>
           <parameter key="invert_filter" value="true"/>
           <parameter key="apply_on_special" value="true"/>
       </operator>
       <operator name="Add an easier to read Id" class="IdTagging">
       </operator>
       <operator name="Change examples into seriesIO objects" class="Single2Series">
       </operator>
   </operator>
   <operator name="ValueSeriesPreprocessing" class="ValueSeriesPreprocessing" expanded="yes">
       <operator name="Branch" class="Branching" expanded="yes">
           <parameter key="keep_only_last" value="false"/>
           <operator name="Find the maximum frequency within each window" class="OperatorChain" expanded="yes">
               <operator name="Split each example into 5 windows" class="Windowing" expanded="yes">
                   <parameter key="step_size" value="200"/>
                   <parameter key="window_size" value="200"/>
                   <operator name="OperatorChain (4)" class="OperatorChain" expanded="yes">
                       <operator name="DiscreteFourierTransform (2)" class="DiscreteFourierTransform">
                       </operator>
                       <operator name="Max" class="Max">
                       </operator>
                   </operator>
               </operator>
               <operator name="NullGenerator (4)" class="NullGenerator">
               </operator>
           </operator>
           <operator name="Find the maximum frequency index within each window" class="OperatorChain" expanded="yes">
               <operator name="Split each example into 5 windows (2)" class="Windowing" expanded="yes">
                   <parameter key="step_size" value="200"/>
                   <parameter key="window_size" value="200"/>
                   <operator name="OperatorChain (3)" class="OperatorChain" expanded="yes">
                       <operator name="DiscreteFourierTransform" class="DiscreteFourierTransform">
                       </operator>
                       <operator name="MaxIndex" class="MaxIndex">
                       </operator>
                   </operator>
               </operator>
               <operator name="NullGenerator (3)" class="NullGenerator">
               </operator>
           </operator>
       </operator>
   </operator>
</operator>

If you run it, you will get the duplicate attribute error. If you disable either one of the branches, the process succeeds: one branch finds the maximum index within each window, and the other finds the maximum amplitude. Try as I might, I cannot get both to be output at the same time.
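
To make the goal concrete, here is a minimal Python sketch of the computation the two branches are meant to perform together: split a series into windows, take a discrete Fourier transform of each window, and report both the index of the largest magnitude and the magnitude itself. It is not RapidMiner code and makes no claim about how the Windowing, DiscreteFourierTransform, Max, or MaxIndex operators work internally. The test signal, the function names `dft_magnitudes` and `window_peaks`, and the choice to look only at the first half of the spectrum are all assumptions made for illustration.

```python
import cmath
import math


def dft_magnitudes(values):
    """Naive O(n^2) DFT; returns magnitudes of the first n // 2 frequency bins."""
    n = len(values)
    magnitudes = []
    for k in range(n // 2):
        acc = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                  for t, v in enumerate(values))
        magnitudes.append(abs(acc))
    return magnitudes


def window_peaks(series, window_size, step_size):
    """Return one (max_index, max_magnitude) pair per window."""
    peaks = []
    for start in range(0, len(series) - window_size + 1, step_size):
        mags = dft_magnitudes(series[start:start + window_size])
        idx = max(range(len(mags)), key=mags.__getitem__)
        peaks.append((idx, mags[idx]))
    return peaks


# Hypothetical signal (not the one in the process above): a sine wave that
# completes exactly 10 cycles in every 200-sample window.
series = [math.sin(2 * math.pi * 10 * t / 200) for t in range(1000)]

# Same window and step size as the Windowing operators in the process.
peaks = window_peaks(series, window_size=200, step_size=200)
for window_number, (idx, mag) in enumerate(peaks):
    print(f"window {window_number}: max index {idx}, max magnitude {mag:.1f}")
```

Each window yields two values here, so a flat output row needs two distinctly named columns per window, which is the shape of result the process is trying to reach.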

Before I embark on a workaround that runs the process twice and merges the example sets, is there a way to get both outputs from a single process?

Answers

  • awchisholm, RapidMiner Certified Expert, Member
    After a bit of fiddling around, I crafted the following workaround.

    <operator name="Root" class="Process" expanded="yes">
       <parameter key="random_seed" value="-1"/>
       <operator name="GenerateSeriesIOObjects" class="OperatorChain" expanded="no">
           <operator name="Generate a sine wave SeriesIOObject" class="SinusGenerator">
               <parameter key="number_of_values" value="2000"/>
               <list key="frequency">
                 <parameter key="101" value="1.0"/>
               </list>
           </operator>
           <operator name="Convert into an ExampleSet" class="SeriesObject2ExampleSet">
           </operator>
           <operator name="Window into examples" class="MultivariateSeries2WindowExamples">
               <parameter key="window_size" value="1002"/>
           </operator>
           <operator name="Add a label" class="WindowExamples2ModelingData">
               <parameter key="label_name_stem" value="sinus_dim_1"/>
               <parameter key="relative_transformation" value="false"/>
           </operator>
           <operator name="Delete the Id" class="AttributeFilter">
               <parameter key="condition_class" value="attribute_name_filter"/>
               <parameter key="parameter_string" value="sinus_index"/>
               <parameter key="invert_filter" value="true"/>
               <parameter key="apply_on_special" value="true"/>
           </operator>
           <operator name="Add an easier to read Id" class="IdTagging">
           </operator>
           <operator name="Change examples into seriesIO objects" class="Single2Series">
           </operator>
       </operator>
       <operator name="StoreSeries" class="IOStorer">
           <parameter key="name" value="MaxSeries"/>
           <parameter key="io_object" value="ExampleSet"/>
           <parameter key="remove_from_process" value="false"/>
       </operator>
       <operator name="FindMax" class="ValueSeriesPreprocessing" expanded="no">
           <operator name="Branch" class="Branching" expanded="yes">
               <parameter key="keep_only_last" value="false"/>
               <operator name="Find the maximum frequency within each window" class="OperatorChain" expanded="no">
                   <operator name="Split each example into 5 windows" class="Windowing" expanded="yes">
                       <parameter key="step_size" value="200"/>
                       <parameter key="window_size" value="200"/>
                       <operator name="OperatorChain (4)" class="OperatorChain" expanded="yes">
                           <operator name="DiscreteFourierTransform (2)" class="DiscreteFourierTransform">
                           </operator>
                           <operator name="Max" class="Max">
                           </operator>
                       </operator>
                   </operator>
                   <operator name="NullGenerator (4)" class="NullGenerator">
                   </operator>
               </operator>
           </operator>
       </operator>
       <operator name="StoreMax" class="IOStorer">
           <parameter key="name" value="Max"/>
           <parameter key="io_object" value="ExampleSet"/>
           <parameter key="remove_from_process" value="false"/>
       </operator>
       <operator name="RetrieveSeries" class="IORetriever">
           <parameter key="name" value="MaxSeries"/>
           <parameter key="io_object" value="ExampleSet"/>
           <parameter key="remove_from_store" value="false"/>
       </operator>
       <operator name="FindMaxIndex" class="ValueSeriesPreprocessing" expanded="no">
           <operator name="Branch (2)" class="Branching" expanded="yes">
               <parameter key="keep_only_last" value="false"/>
               <operator name="Find the maximum frequency index within each window (2)" class="OperatorChain" expanded="no">
                   <operator name="Split each example into 5 windows (4)" class="Windowing" expanded="yes">
                       <parameter key="step_size" value="200"/>
                       <parameter key="window_size" value="200"/>
                       <operator name="OperatorChain (5)" class="OperatorChain" expanded="yes">
                           <operator name="DiscreteFourierTransform (4)" class="DiscreteFourierTransform">
                           </operator>
                           <operator name="MaxIndex (2)" class="MaxIndex">
                           </operator>
                       </operator>
                   </operator>
                   <operator name="NullGenerator (5)" class="NullGenerator">
                   </operator>
               </operator>
           </operator>
       </operator>
       <operator name="ExampleSetJoin" class="ExampleSetJoin">
           <parameter key="remove_double_attributes" value="false"/>
       </operator>
    </operator>
    However, it would be nice if I didn't have to do this. Is there a better way?
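
For readers following the workaround, its last step (joining the two result sets on the Id while keeping both sets of attributes) can be pictured with the plain Python sketch below. This is only an illustration of an id-based join that renames clashing columns; it is not how ExampleSetJoin is implemented, and the row contents, the column name `window_0`, and the `_right` suffix are hypothetical.

```python
def join_by_id(left, right, suffix="_right"):
    """Inner-join two lists of dict rows on 'id', renaming clashing columns."""
    right_by_id = {row["id"]: row for row in right}
    joined = []
    for row in left:
        other = right_by_id.get(row["id"])
        if other is None:
            continue  # no matching id on the right-hand side
        merged = dict(row)
        for name, value in other.items():
            if name == "id":
                continue
            # Keep both values when the two sides use the same column name.
            key = name + suffix if name in merged else name
            merged[key] = value
        joined.append(merged)
    return joined


# Hypothetical results of the two passes: same ids, same column names.
max_rows = [{"id": 1, "window_0": 100.0}, {"id": 2, "window_0": 98.5}]
index_rows = [{"id": 1, "window_0": 10}, {"id": 2, "window_0": 10}]

joined = join_by_id(max_rows, index_rows)
print(joined)
```

Because both passes name their columns identically in this sketch, the join has to rename one side to keep both values.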