How to increase the RowCount

RPattela Member Posts: 8 Contributor I
edited November 2018 in Help

Hello,

I need a "RowCount" column like the one in the table below.

I have a Min number and a Max number per year, and I have to find the missing values for each year (for example, in 2014 the values 2, 3, 4, 5 and 6 are missing). I tried Loop Values, but I could not produce the table below.

Please help me out.

 

year Min Max RowCount
2014 1 7 1
2014 1 7 2
2014 1 7 3
2014 1 7 4
2014 1 7 5
2014 1 7 6
2014 1 7 7
2013 5 5 1
2013 5 5 2
2013 5 5 3
2013 5 5 4
2013 5 5 5
2012 -2 3 1
2012 -2 3 2
2012 -2 3 3
2011 90 52364 1
2011 90 52364 2
...
2011 90 52364 52363
2011 90 52364 52364

  

Best Answer

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,760   Unicorn
    Solution Accepted

    Ok, I understand. 

     

    I had to add a Sample (Bootstrapping) operator and another Extract Macro operator, and got it to work. Check this out.

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.3.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.3.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="7.3.001" expanded="true" height="68" name="Retrieve Row Count2" width="90" x="45" y="34">
    <parameter key="repository_entry" value="../data/Row Count2"/>
    </operator>
    <operator activated="true" class="multiply" compatibility="7.3.001" expanded="true" height="103" name="Multiply" width="90" x="179" y="187"/>
    <operator activated="true" class="aggregate" compatibility="7.3.001" expanded="true" height="82" name="Aggregate" width="90" x="313" y="34">
    <list key="aggregation_attributes"/>
    <parameter key="group_by_attributes" value="Max|year"/>
    </operator>
    <operator activated="true" class="extract_macro" compatibility="7.3.001" expanded="true" height="68" name="Extract Macro (2)" width="90" x="447" y="34">
    <parameter key="macro" value="examples"/>
    <list key="additional_macros"/>
    </operator>
    <operator activated="true" class="loop" compatibility="7.3.001" expanded="true" height="103" name="Loop" width="90" x="648" y="136">
    <parameter key="set_iteration_macro" value="true"/>
    <parameter key="iterations" value="%{examples}"/>
    <process expanded="true">
    <operator activated="true" class="extract_macro" compatibility="7.3.001" expanded="true" height="68" name="Extract Macro" width="90" x="313" y="34">
    <parameter key="macro" value="year"/>
    <parameter key="macro_type" value="data_value"/>
    <parameter key="attribute_name" value="year"/>
    <parameter key="example_index" value="%{iteration}"/>
    <list key="additional_macros"/>
    </operator>
    <operator activated="true" class="extract_macro" compatibility="7.3.001" expanded="true" height="68" name="Extract Macro (3)" width="90" x="447" y="34">
    <parameter key="macro" value="max"/>
    <parameter key="macro_type" value="data_value"/>
    <parameter key="attribute_name" value="Max"/>
    <parameter key="example_index" value="%{iteration}"/>
    <list key="additional_macros"/>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="7.3.001" expanded="true" height="103" name="Filter Examples" width="90" x="313" y="136">
    <list key="filters_list">
    <parameter key="filters_entry_key" value="year.eq.%{year}"/>
    </list>
    </operator>
    <operator activated="true" class="sample_bootstrapping" compatibility="7.3.001" expanded="true" height="82" name="Sample (Bootstrapping)" width="90" x="447" y="136">
    <parameter key="sample" value="absolute"/>
    <parameter key="sample_size" value="%{max}"/>
    <parameter key="use_weights" value="false"/>
    </operator>
    <operator activated="true" class="generate_id" compatibility="7.3.001" expanded="true" height="82" name="Generate ID" width="90" x="581" y="136"/>
    <operator activated="true" class="set_role" compatibility="7.3.001" expanded="true" height="82" name="Set Role" width="90" x="715" y="136">
    <parameter key="attribute_name" value="id"/>
    <list key="set_additional_roles"/>
    </operator>
    <operator activated="true" class="rename" compatibility="7.3.001" expanded="true" height="82" name="Rename" width="90" x="849" y="136">
    <parameter key="old_name" value="id"/>
    <parameter key="new_name" value="Row Count"/>
    <list key="rename_additional_attributes"/>
    </operator>
    <connect from_port="input 1" to_op="Extract Macro" to_port="example set"/>
    <connect from_port="input 2" to_op="Filter Examples" to_port="example set input"/>
    <connect from_op="Extract Macro" from_port="example set" to_op="Extract Macro (3)" to_port="example set"/>
    <connect from_op="Filter Examples" from_port="example set output" to_op="Sample (Bootstrapping)" to_port="example set input"/>
    <connect from_op="Sample (Bootstrapping)" from_port="example set output" to_op="Generate ID" to_port="example set input"/>
    <connect from_op="Generate ID" from_port="example set output" to_op="Set Role" to_port="example set input"/>
    <connect from_op="Set Role" from_port="example set output" to_op="Rename" to_port="example set input"/>
    <connect from_op="Rename" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="source_input 3" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="append" compatibility="7.3.001" expanded="true" height="82" name="Append" width="90" x="782" y="136"/>
    <connect from_op="Retrieve Row Count2" from_port="output" to_op="Multiply" to_port="input"/>
    <connect from_op="Multiply" from_port="output 1" to_op="Aggregate" to_port="example set input"/>
    <connect from_op="Multiply" from_port="output 2" to_op="Loop" to_port="input 2"/>
    <connect from_op="Aggregate" from_port="example set output" to_op="Extract Macro (2)" to_port="example set"/>
    <connect from_op="Extract Macro (2)" from_port="example set" to_op="Loop" to_port="input 1"/>
    <connect from_op="Loop" from_port="output 1" to_op="Append" to_port="example set 1"/>
    <connect from_op="Append" from_port="merged set" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
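
For readers without RapidMiner at hand, the logic of the process above can be approximated in a few lines of Python. This is only a sketch of the idea, not the process itself: it assumes one input row per year with `year`, `Min` and `Max` fields, and the function name `expand_rows` is hypothetical. For each distinct year/Max pair it filters the matching rows, draws Max rows with replacement (mirroring Sample (Bootstrapping) with an absolute sample size), and numbers them from 1 (mirroring Generate ID followed by Rename).

```python
import random


def expand_rows(rows, seed=0):
    """Repeat each year's rows until there are Max of them, numbered from 1."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    # Distinct (year, Max) pairs in first-seen order (the Aggregate step).
    groups = list(dict.fromkeys((r["year"], r["Max"]) for r in rows))
    result = []
    for year, max_value in groups:  # the Loop step
        matching = [r for r in rows if r["year"] == year]  # Filter Examples
        # Sample with replacement up to an absolute size of Max.
        sample = rng.choices(matching, k=max_value)
        # Number the sampled rows 1..Max (Generate ID + Rename).
        for row_count, row in enumerate(sample, start=1):
            result.append({**row, "Row Count": row_count})
    return result


# Input table taken from the thread (Min/Max of 1/5 and 10/15).
rows = [
    {"year": 2014, "Min": 1, "Max": 5},
    {"year": 2011, "Min": 10, "Max": 15},
]
expanded = expand_rows(rows)
print(len(expanded))  # 20
print(expanded[4])
```

Because each year has a single source row here, sampling with replacement simply repeats that row Max times. If a year had several different rows, the sample would mix them at random, so this approach fits best when the rows within a year are identical.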

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,760   Unicorn

    Something like this? I loaded your example set into a repository.

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.3.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.3.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="7.3.001" expanded="true" height="68" name="Retrieve Row Count" width="90" x="45" y="34">
    <parameter key="repository_entry" value="//Personal/Community Answers/data/Row Count"/>
    </operator>
    <operator activated="true" class="multiply" compatibility="7.3.001" expanded="true" height="103" name="Multiply" width="90" x="179" y="187"/>
    <operator activated="true" class="aggregate" compatibility="7.3.001" expanded="true" height="82" name="Aggregate" width="90" x="313" y="34">
    <list key="aggregation_attributes"/>
    <parameter key="group_by_attributes" value="Year"/>
    </operator>
    <operator activated="true" class="extract_macro" compatibility="7.3.001" expanded="true" height="68" name="Extract Macro (2)" width="90" x="447" y="34">
    <parameter key="macro" value="examples"/>
    <list key="additional_macros"/>
    </operator>
    <operator activated="true" class="loop" compatibility="7.3.001" expanded="true" height="103" name="Loop" width="90" x="648" y="136">
    <parameter key="set_iteration_macro" value="true"/>
    <parameter key="iterations" value="%{examples}"/>
    <process expanded="true">
    <operator activated="true" class="extract_macro" compatibility="7.3.001" expanded="true" height="68" name="Extract Macro" width="90" x="313" y="34">
    <parameter key="macro" value="year"/>
    <parameter key="macro_type" value="data_value"/>
    <parameter key="attribute_name" value="Year"/>
    <parameter key="example_index" value="%{iteration}"/>
    <list key="additional_macros"/>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="7.3.001" expanded="true" height="103" name="Filter Examples" width="90" x="313" y="136">
    <list key="filters_list">
    <parameter key="filters_entry_key" value="Year.eq.%{year}"/>
    </list>
    </operator>
    <operator activated="true" class="generate_id" compatibility="7.3.001" expanded="true" height="82" name="Generate ID" width="90" x="447" y="136"/>
    <operator activated="true" class="set_role" compatibility="7.3.001" expanded="true" height="82" name="Set Role" width="90" x="581" y="136">
    <parameter key="attribute_name" value="id"/>
    <list key="set_additional_roles"/>
    </operator>
    <operator activated="true" class="rename" compatibility="7.3.001" expanded="true" height="82" name="Rename" width="90" x="715" y="136">
    <parameter key="old_name" value="id"/>
    <parameter key="new_name" value="Row Count"/>
    <list key="rename_additional_attributes"/>
    </operator>
    <connect from_port="input 1" to_op="Extract Macro" to_port="example set"/>
    <connect from_port="input 2" to_op="Filter Examples" to_port="example set input"/>
    <connect from_op="Filter Examples" from_port="example set output" to_op="Generate ID" to_port="example set input"/>
    <connect from_op="Generate ID" from_port="example set output" to_op="Set Role" to_port="example set input"/>
    <connect from_op="Set Role" from_port="example set output" to_op="Rename" to_port="example set input"/>
    <connect from_op="Rename" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="source_input 3" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="append" compatibility="7.3.001" expanded="true" height="82" name="Append" width="90" x="782" y="136"/>
    <connect from_op="Retrieve Row Count" from_port="output" to_op="Multiply" to_port="input"/>
    <connect from_op="Multiply" from_port="output 1" to_op="Aggregate" to_port="example set input"/>
    <connect from_op="Multiply" from_port="output 2" to_op="Loop" to_port="input 2"/>
    <connect from_op="Aggregate" from_port="example set output" to_op="Extract Macro (2)" to_port="example set"/>
    <connect from_op="Extract Macro (2)" from_port="example set" to_op="Loop" to_port="input 1"/>
    <connect from_op="Loop" from_port="output 1" to_op="Append" to_port="example set 1"/>
    <connect from_op="Append" from_port="merged set" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
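
As far as the XML shows, this first process numbers the rows that already exist within each year; it does not create new rows. A minimal Python sketch of that behaviour, assuming rows are dictionaries with a `Year` key (the helper name `number_rows_per_year` is hypothetical):

```python
def number_rows_per_year(rows):
    """Append a per-year running number ("Row Count") to every row."""
    # Distinct years in first-seen order (the Aggregate step).
    years = list(dict.fromkeys(r["Year"] for r in rows))
    numbered = []
    for year in years:  # the Loop step
        in_year = [r for r in rows if r["Year"] == year]  # Filter Examples
        # Number the rows of this year from 1 (Generate ID + Rename).
        for row_count, row in enumerate(in_year, start=1):
            numbered.append({**row, "Row Count": row_count})
    return numbered


rows = [
    {"Year": 2012, "Min": -2, "Max": 3},
    {"Year": 2012, "Min": -2, "Max": 3},
    {"Year": 2012, "Min": -2, "Max": 3},
    {"Year": 2013, "Min": 5, "Max": 5},
]
numbered = number_rows_per_year(rows)
print([r["Row Count"] for r in numbered])  # [1, 2, 3, 1]
```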
  • RPattela Member Posts: 8 Contributor I

     

    Thank you, Thomas. The process reproduces the table above exactly, but I need something slightly different.

    My actual data looks like the table below: for each year I only have the Min and Max values of a document.

     

    year doc Min Max
    2014 2014-A1143 1 7
    2013 2013-F446 5 5
    2012 2012-1154 -2 3
    2011 2011-A184 90 52364

     

    Now I have to find the missing values for each document. For example, in 2014 there are 5 missing values in total (2, 3, 4, 5, 6); likewise, in 2011 the missing values are 1, 2, ... 89, 91, 92, ... 52363.

    If I get a table like the one below, I can easily find the missing values.

     

     

    year doc Min Max Rowcount
    2014 2014-A1143 1 7 1
    2014 2014-A1143 1 7 2
    2014 2014-A1143 1 7 3
    2014 2014-A1143 1 7 4
    2014 2014-A1143 1 7 5
    2014 2014-A1143 1 7 6
    2014 2014-A1143 1 7 7
    2013 2013-F446 5 5 1
    2013 2013-F446 5 5 2
    2013 2013-F446 5 5 3
    2013 2013-F446 5 5 4
    2013 2013-F446 5 5 5
    2012 2012-1154 -2 3 1
    2012 2012-1154 -2 3 2
    2012 2012-1154 -2 3 3
    2011 2011-A184 90 52364 1
    2011 2011-A184 90 52364 2
    2011 2011-A184 90 52364 3
    2011 2011-A184 90 52364 4
    2011 2011-A184 90 52364 |
    2011 2011-A184 90 52364 |
    2011 2011-A184 90 52364 |
    2011 2011-A184 90 52364 52362
    2011 2011-A184 90 52364 52364
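
If "missing" is read as the integers strictly between Min and Max (which matches the 2014 example, where 2 to 6 are missing), the missing values can also be listed directly from Min and Max without building the expanded table first. A short Python sketch under that assumption follows; `missing_between` is a hypothetical helper name, and the 2011 example above also lists values below Min, so the exact definition would need to be confirmed.

```python
def missing_between(min_value, max_value):
    """Return the integers strictly between min_value and max_value."""
    return list(range(min_value + 1, max_value))


# Min/Max pairs taken from the table in this post.
documents = {
    "2014-A1143": (1, 7),
    "2013-F446": (5, 5),
    "2012-1154": (-2, 3),
}
for doc, (lo, hi) in documents.items():
    print(doc, missing_between(lo, hi))
# 2014-A1143 [2, 3, 4, 5, 6]
# 2013-F446 []
# 2012-1154 [-1, 0, 1, 2]
```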

     

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,760   Unicorn

    In what columns are your missing values? Or are you just trying to make a selection of rows which contain missing values and output them?

  • RPattela Member Posts: 8 Contributor I

    I am trying to generate the rows that contain the missing values (RowCount).

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,760   Unicorn

    If you just want to show all the rows that have missing data in any column, then use Filter Examples and set the parameter to "missing attributes". That will keep only the rows that have missing values.

     

    Or, if you specifically want to see only the missing RowCount values, then use Filter Examples, go into the custom filters parameter, and set RowCount to "is missing".
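
The difference between the two filters can be illustrated outside RapidMiner. This is a minimal Python sketch, assuming rows are dictionaries and a missing value is represented by `None`; both function names are hypothetical and the sample rows are made up for illustration.

```python
def rows_with_any_missing(rows):
    """Keep rows where at least one column is missing (None)."""
    return [r for r in rows if any(v is None for v in r.values())]


def rows_with_missing_column(rows, column):
    """Keep rows where the given column is missing (None or absent)."""
    return [r for r in rows if r.get(column) is None]


# Illustrative rows only; not data from the thread.
rows = [
    {"year": 2014, "Max": 7, "RowCount": 1},
    {"year": 2014, "Max": None, "RowCount": 2},
    {"year": 2013, "Max": 5, "RowCount": None},
]
print(len(rows_with_any_missing(rows)))  # 2
print(rows_with_missing_column(rows, "RowCount"))
```

The first function corresponds to filtering on any missing attribute, the second to a custom filter on a single column.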

  • RPattela Member Posts: 8 Contributor I

    Thank you for looking into this; however, my requirement is a little different from what was suggested.

     

    For example, if row 1 has a Min value of 1 and a Max value of 7, I want that row to be repeated up to the Max value: 6 more rows with the same values as the 1st row should be added after it. Once those rows are added, the original 2nd row moves to the 8th position. If its Min value is 90 and its Max value is 52364, then, irrespective of the Min value, rows should again be added up to the Max value. Please see the input and output tables below.

     

    Input

    year doc Min Max
    2014 2014-A1143 1 5
    2011 2011-A184 10 15

     

    Output

    year doc Min Max Number
    2014 2014-A1143 1 5 1
    2014 2014-A1143 1 5 2
    2014 2014-A1143 1 5 3
    2014 2014-A1143 1 5 4
    2014 2014-A1143 1 5 5
    2011 2011-A184 10 15 1
    2011 2011-A184 10 15 2
    2011 2011-A184 10 15 3
    2011 2011-A184 10 15 4
    2011 2011-A184 10 15 5
    2011 2011-A184 10 15 6
    2011 2011-A184 10 15 7
    2011 2011-A184 10 15 8
    2011 2011-A184 10 15 9
    2011 2011-A184 10 15 10
    2011 2011-A184 10 15 11
    2011 2011-A184 10 15 12
    2011 2011-A184 10 15 13
    2011 2011-A184 10 15 14
    2011 2011-A184 10 15 15


    Thank you

     

  • RPattela Member Posts: 8 Contributor I

    Thank you Thomas. It's working for my data. :-)
