
Reducing example set using average

hung9022 Member Posts: 13 Contributor I
edited November 2018 in Help

Hello,

Attached is an image of my example set; it has 683 examples and 343 regular attributes. I would like to reduce the example set to 100 examples while keeping all attributes, with each new example being the average of a group of old examples. For example, 683/100 = 6.83, so each new example would be the average of 6 old examples. I know this is similar to a moving average, but the Moving Average operator can only select one column, and its result is a continuous (rolling) average rather than one value per group. I also plan to use this on other example sets, whose sizes vary from 500 to 2000 examples.

Regards,

Best Answer

  • tftemme Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, RMResearcher, Member Posts: 164 RM Research
    Solution Accepted

    Hi @hung9022,

     

    You can use the Process Windows operator (from the Time Series Extension, which has been bundled with RapidMiner Studio since 9.0.0) with a window size of 6 and no overlapping windows selected (deselect create horizon). Inside Process Windows, you can use Extract Aggregates (also from the Time Series Extension) to extract the average (and other aggregated values). Use Append to collect all results.

     

    Hope this helps,

    Best regards,
    Fabian

     

    PS: Since 9.0.0, Moving Average, Process Windows, and Extract Aggregates all work on several attributes at once.
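
For readers who want to check the idea outside RapidMiner, here is a minimal Python sketch of the same reduction: average consecutive, non-overlapping groups of rows, attribute by attribute. It is not the Process Windows implementation. The function name `average_windows`, the toy data, and the choice to drop a trailing incomplete group are all assumptions made for illustration.

```python
from statistics import fmean


def average_windows(rows, window_size):
    """Average consecutive, non-overlapping groups of rows, column by column.

    Assumption: a trailing group with fewer than `window_size` rows is dropped.
    """
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    reduced = []
    full_windows = len(rows) // window_size
    for w in range(full_windows):
        group = rows[w * window_size:(w + 1) * window_size]
        # zip(*group) transposes the group so each tuple holds one attribute's values
        reduced.append([fmean(column) for column in zip(*group)])
    return reduced


# Hypothetical toy data: 683 examples with 3 attributes (the thread's set has 343)
rows = [[float(i), float(2 * i), 1.0] for i in range(683)]
reduced = average_windows(rows, 6)
print(len(reduced), reduced[0])
```

Note that a fixed integer window only approximates the target count: with 683 rows and a window of 6, this sketch yields 113 averaged rows rather than exactly 100.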

Answers

  • hung9022 Member Posts: 13 Contributor I

    Hi @tftemme,

    Thanks, that is what I was looking for. Incidentally, do you know how to set up macros so that I can loop over example sets of different sizes and reduce each one to 100 examples using the above method? The example sets range in size from 500 to 2000 examples.

    Regards,

  • tftemme Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, RMResearcher, Member Posts: 164 RM Research

    You can use Extract Macro to extract the number of examples, and then Generate Macro to calculate the window size (floor(eval(%{number_of_examples})/100)).
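
The arithmetic behind that macro expression can be checked with a small Python sketch. The function name `window_size_for` is hypothetical, and the `max(1, ...)` guard for sets with fewer than 100 examples is an added assumption that the macro expression itself does not include.

```python
def window_size_for(number_of_examples, target=100):
    """Return floor(number_of_examples / target), but never less than 1."""
    # Integer floor division mirrors floor(number_of_examples / 100)
    return max(1, number_of_examples // target)


# Sizes taken from the thread: 683 examples, and the stated 500 to 2000 range
for n in (500, 683, 2000):
    size = window_size_for(n)
    # n // size counts the complete, non-overlapping windows of that size
    print(n, size, n // size)
```

For sizes that are exact multiples of 100, this gives exactly 100 complete windows; for other sizes the count is somewhat higher (683 examples give a window size of 6 and 113 complete windows).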
