'Aggregate' by group is BUGGING OUT!

781194025 Member Posts: 32 Contributor I
edited June 2019 in Help

How do we delete threads?

I gave up on Aggregate by group and moved on to Loop Values -> Aggregate instead, because that's all I could make work.

Who knows if it's actually working, though...

Answers

  • Edin_Klapic Moderator, Employee, RMResearcher, Member Posts: 299 RM Data Scientist

    Hi @781194025,

     

    The Operator "Group into Collection" in the extension 'Operator Toolbox' exists for exactly this reason.

    The result is a collection of ExampleSets grouped by one Attribute.

     

    Best,

    Edin

  • 781194025 Member Posts: 32 Contributor I
    Hey! Thanks for pointing that out!

    But, seriously, aggregate is BUGGED!! Even when I split the data (by groups) and then aggregate it, the aggregated examples will gather data from GOD KNOWS WHERE!!!

    I'll try Group Into Collection now, I suppose. But I don't want a collection, I want to eliminate redundant rows!!!
  • Pavithra_Rao Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 123 RM Data Scientist

    Hey,

     

    Would it be possible to share the process XML code here so that we can step through the process and see what the error is?

     

    Cheers,

    Pavithra

  • 781194025 Member Posts: 32 Contributor I
    THE SAME THING HAPPENS WITH 'GROUP INTO COLLECTION'!!!!

    I group by URL, use "Loop Collection", and run Aggregate inside the loop.
    'Aggregate' should ONLY work on the 3 examples grouped by url in that collection. But somehow it aggregates data from the original set!!!!

    AGGREGATE IS BUGGED!!!!
    IN FACT, when I 'aggregate' a SINGLE EXAMPLE in a completely new process, after saving it as its own independent single-example set, it STILL remembers data to 'aggregate'.

    I have been trying to do something VERY SIMPLE for literally a month now. Combine two example sets, grouped by url, where the missing fields 'fill in the blanks' of each other. I cannot make it happen even on 2 examples, let alone 2 example sets!!!

    It's my fault for gathering data so haphazardly I suppose, but it's tricky because I often exceed my API limits and end up with half-completed data sets that need to be joined with other half-completed ones!!

    I don't need to share my process code, just look at these screenshots!!
  • zprekopcsak RapidMiner Certified Expert, Member Posts: 47 Guru

    Hi,

    I am not sure if I understand what you are trying to do, but don't you just need the Remove Duplicates operator keeping one record of every URL?

    In your example, you are taking the mode of 100% missing values. The attribute has the metadata about all the possible values and it finds that all of those potential values appear zero times. There is no clear winner so it will just pick one of the values as the mode. You could argue that it should keep it missing instead. Did I understand correctly that this would be the expected behaviour from your perspective?

    Thanks, Zoltan

  • zprekopcsak RapidMiner Certified Expert, Member Posts: 47 Guru

    Also, the Aggregate operator has a parameter called "ignore missings" that is set to true by default. If you set it to false then do you get the result that you expect?

    Best, Zoltan

  • 781194025 Member Posts: 32 Contributor I
    I'm trying to combine examples by URL where the examples have missing fields, without losing any data from the fields.

    Look at attached photo "4 examples for aggregation": I want ALL that data combined in 1 row.
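
The merge described here (one row per URL, with each field taken from whichever example has it filled in) can be sketched in plain Python. This is only an illustration of the intended result, not a RapidMiner process; the function name `merge_by_key` and the attribute names `url`, `title` and `likes` are made up for the example, and missing values are represented as `None`.

```python
def merge_by_key(rows, key):
    """Collapse rows sharing the same key into one row,
    keeping the first non-missing value seen for each attribute."""
    merged = {}
    for row in rows:
        # One output row per distinct key value, created on first sight.
        target = merged.setdefault(row[key], {key: row[key]})
        for name, value in row.items():
            if value is not None and target.get(name) is None:
                # Fill a blank with the first real value we find.
                target[name] = value
            else:
                # Make sure the attribute exists, even if still missing.
                target.setdefault(name, None)
    return list(merged.values())


rows = [
    {"url": "x.com", "title": "A", "likes": None},
    {"url": "x.com", "title": None, "likes": 10},
    {"url": "y.com", "title": "B", "likes": None},
]
result = merge_by_key(rows, "url")
```

A field that is missing in every example of a group stays missing in the merged row, which is the behaviour asked for in this thread.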

    I seem to have 'partially' solved the problem by simply "removing useless attributes" before running aggregate.

    The pictures I previously attached clearly show 'aggregate' generating data out of thin air. Yes, I did try all the check-boxes.

    My guess is aggregate draws data from the Repository or from the Example Set it was split off from, even if it's saved in an entirely separate Example Set.

    Anyway I'm done spending time and effort trying to report this bug when I'm only met with skepticism and cries of user error. Especially since I've found a way around it.
  • 781194025 Member Posts: 32 Contributor I
    Attached is a simple process that should show the bug.

    Make sure the data you're using is from a larger example set, split off into a subgroup by ID.
  • zprekopcsak RapidMiner Certified Expert, Member Posts: 47 Guru

    Thanks for the explanation, I think now I get what you are trying to do. I was not sceptical, just did not understand fully.

    Believe me that it does not pull the data from thin air. :) Even if you filter and save a dataset, each nominal attribute remembers all the potential values it ever had. This is quite useful in many cases so we do not intend to change that.

    However, when you calculate mode on a group that only has missing values, then mode is counting the occurrences of all potential values. All of them have zero occurrences, so it is doing what it needs to do in case of a draw: picks one. This is a bug, and we need to make sure that if all values have zero occurrences then it picks missing ("?") as a result. I have filed this in our internal bug tracker and it will be fixed in one of the upcoming releases.
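
The tie-break described above can be sketched in plain Python. This is a conceptual model only, not RapidMiner's actual implementation: both function names are hypothetical, the declared value list stands in for the nominal attribute's remembered values, and `None` stands in for a missing value ("?").

```python
from collections import Counter


def mode_over_declared_values(values, declared_values):
    """Count every declared value, including ones that never occur.
    With only missing input, all counts are zero and one value still wins."""
    counts = {v: 0 for v in declared_values}
    for v in values:
        if v is not None:
            counts[v] = counts.get(v, 0) + 1
    # max() returns the first key with the highest count, so a draw
    # between all-zero counts still returns a declared value.
    return max(counts, key=counts.get)


def mode_or_missing(values):
    """Count only the values actually present; return None (missing)
    when the group has no non-missing values at all."""
    counts = Counter(v for v in values if v is not None)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


declared = ["red", "green", "blue"]   # values the attribute "remembers"
group = [None, None, None]            # a group with only missing values

surprising = mode_over_declared_values(group, declared)
expected = mode_or_missing(group)
```

Here `surprising` is a value that never appears in the group, while `expected` stays missing, which is the result the fix is meant to produce.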

    Thanks for bringing this up!

    Best, Zoltan

  • Edin_Klapic Moderator, Employee, RMResearcher, Member Posts: 299 RM Data Scientist

    Hi @781194025,

     

    Until the bug is fixed perhaps the Operator "Materialize Data" can help.

    If you have filtered a dataset and are sure that you do not want to keep the potential values you can use this Operator right after your filtering steps / before your aggregations. It basically recreates the Metadata on the available data.
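
As a rough illustration of why this helps, the sketch below models a nominal attribute as a list of rows plus a separate list of declared values. It is an assumption-laden toy in plain Python, not how the operator is implemented; `filter_rows` and `rebuild_declared_values` are hypothetical names, and `None` stands in for a missing value.

```python
def filter_rows(rows, declared_values, keep):
    """Filtering drops rows but carries the full declared value list along,
    mirroring how a filtered set still remembers all potential values."""
    return [r for r in rows if keep(r)], list(declared_values)


def rebuild_declared_values(rows, attribute):
    """Recompute the declared values from the rows that are actually left."""
    seen = []
    for row in rows:
        value = row.get(attribute)
        if value is not None and value not in seen:
            seen.append(value)
    return seen


rows = [
    {"url": "a", "color": "red"},
    {"url": "b", "color": None},
    {"url": "c", "color": "blue"},
]
declared = ["red", "green", "blue"]

subset, declared_after_filter = filter_rows(rows, declared, lambda r: r["url"] == "b")
declared_after_rebuild = rebuild_declared_values(subset, "color")
```

After the rebuild there are no potential values left for the all-missing subset, so a mode calculation has nothing to pick from by accident.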

     

    Best regards,

    Edin

     

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Hi @zprekopcsak @Edin_Klapic - if this is a recognized bug, can I move this thread to "Product Feedback" so that Balazs H. can manage?


    Scott

  • sdima Member Posts: 2 Contributor I
    Has this issue been resolved?

    I am aggregating by grouping on multiple factors, and one of them is an integer. After the aggregation, the integer disappears.
    Help?
  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager
    Hi @sdima, can you please post your XML and your data so we can see what you're doing?

    Scott
  • sdima Member Posts: 2 Contributor I
    Hi @sgenzer

    Sure. Here it is

    <?xml version="1.0" encoding="UTF-8"?><process version="9.0.002">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.0.002" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="false" class="cloud_connectivity:read_google_storage" compatibility="9.0.000" expanded="true" height="68" name="WARRANTY GOOGLE" width="90" x="45" y="85">
            <parameter key="connection" value="VISTA"/>
            <parameter key="file" value="jlr-dl-dms-ccr/Vista/WARRANTY_CLAIMS_DATA_SD.csv"/>
            <description align="center" color="transparent" colored="false" width="126">STAGE I - Read Google Storage</description>
          </operator>
          <operator activated="false" class="read_csv" compatibility="9.0.002" expanded="true" height="68" name="Read CSV" width="90" x="179" y="85">
            <parameter key="column_separators" value=","/>
            <parameter key="grouped_digits" value="true"/>
            <list key="annotations"/>
            <parameter key="locale" value="English (United Kingdom)"/>
            <list key="data_set_meta_data_information"/>
          </operator>
          <operator activated="false" class="write_csv" compatibility="9.0.002" expanded="true" height="82" name="Write CSV" width="90" x="313" y="85">
            <parameter key="csv_file" value="H:\MS&amp;S\Commercial &amp; Customer Retention\Business Development\Volume Opitimisation\Analytics\Data sets\PARTS\Warranty\9. Warranty_Google.csv"/>
          </operator>
          <operator activated="false" class="read_csv" compatibility="9.0.002" expanded="true" height="68" name="WTY READ" width="90" x="514" y="85">
            <parameter key="csv_file" value="C:\Users\sdima\Desktop\Analytics\Weekly Daily Sales\WTY\9. Warranty_Google.csv"/>
            <parameter key="skip_comments" value="true"/>
            <parameter key="date_format" value="yyyy-MM-dd"/>
            <list key="annotations"/>
            <parameter key="encoding" value="windows-1252"/>
            <list key="data_set_meta_data_information">
              <parameter key="0" value="Brand_Name_Formatted.true.polynominal.attribute"/>
              <parameter key="1" value="Claim_Region_Description.true.polynominal.attribute"/>
              <parameter key="2" value="Claim_Sub_Region_Description.true.polynominal.attribute"/>
              <parameter key="3" value="Claim_Country_Description.true.polynominal.attribute"/>
              <parameter key="4" value="Claim_Retailer_CI_Code.true.polynominal.attribute"/>
              <parameter key="5" value="Claim_Retailer_CI_Code_Description.true.polynominal.attribute"/>
              <parameter key="6" value="Engineering_Part_Number.true.polynominal.attribute"/>
              <parameter key="7" value="Item_Material.true.polynominal.attribute"/>
              <parameter key="8" value="Item_Material_Description.true.polynominal.attribute"/>
              <parameter key="9" value="Accepted_Date_Fix.true.date.attribute"/>
              <parameter key="10" value="Parts_Cost_Local_Currency.true.real.attribute"/>
              <parameter key="11" value="Parts_Cost_GBP_Budget_Rate.true.real.attribute"/>
              <parameter key="12" value="Parts_Quantity.true.real.attribute"/>
            </list>
            <parameter key="read_not_matching_values_as_missings" value="false"/>
            <description align="center" color="transparent" colored="false" width="126">STAGE II - Filter UK&lt;br&gt;only&lt;br/&gt;LOCAL</description>
          </operator>
          <operator activated="false" class="filter_examples" compatibility="9.0.002" expanded="true" height="103" name="UK MARKET" width="90" x="648" y="85">
            <list key="filters_list">
              <parameter key="filters_entry_key" value="Claim_Country_Description.is_in.United Kingdom;"/>
              <parameter key="filters_entry_key" value="Accepted_Date_Fix.ge.01/01/2019"/>
              <parameter key="filters_entry_key" value="Accepted_Date_Fix.le.01/31/2019"/>
              <parameter key="filters_entry_key" value="Brand_Name_Formatted.is_in.Jaguar;"/>
              <parameter key="filters_entry_key" value="Parts_Cost_Local_Currency.is_not_missing."/>
            </list>
            <description align="center" color="transparent" colored="false" width="126">UK ONLY&lt;br&gt;JAN ONLY&lt;br&gt;JAGUAR&lt;br/&gt;No blank cost</description>
          </operator>
          <operator activated="false" class="write_csv" compatibility="9.0.002" expanded="true" height="82" name="Write CSV (2)" width="90" x="782" y="85">
            <parameter key="csv_file" value="C:\Users\sdima\Desktop\Analytics\Weekly Daily Sales\WTY\8. UK JAN JAG.csv"/>
          </operator>
          <operator activated="true" class="read_excel" compatibility="9.0.002" expanded="true" height="68" name="REV READ" width="90" x="45" y="595">
            <parameter key="excel_file" value="C:\Users\sdima\Desktop\Analytics\Weekly Daily Sales\BO DATA\UK - Jaguar - Jan 19.xls"/>
            <parameter key="imported_cell_range" value="A2:L10485776"/>
            <list key="annotations"/>
            <parameter key="date_format" value="MM.yyyy"/>
            <list key="data_set_meta_data_information">
              <parameter key="0" value="Region.true.polynominal.attribute"/>
              <parameter key="1" value="Market.true.polynominal.attribute"/>
              <parameter key="2" value="Brand.true.polynominal.attribute"/>
              <parameter key="3" value="MLI Class.true.polynominal.attribute"/>
              <parameter key="4" value="MLI Code.true.polynominal.attribute"/>
              <parameter key="5" value="Customer - Key.true.polynominal.attribute"/>
              <parameter key="6" value="Customer-Text.true.polynominal.attribute"/>
              <parameter key="7" value="Month.true.date.attribute"/>
              <parameter key="8" value="Week.true.nominal.attribute"/>
              <parameter key="9" value="Currency.true.polynominal.attribute"/>
              <parameter key="10" value="QTY.true.real.attribute"/>
              <parameter key="11" value="NIV.true.real.attribute"/>
            </list>
            <parameter key="read_not_matching_values_as_missings" value="false"/>
          </operator>
          <operator activated="true" class="read_excel" compatibility="9.0.002" expanded="true" height="68" name="RETAIL" width="90" x="45" y="442">
            <parameter key="excel_file" value="H:\MS&amp;S\Commercial &amp; Customer Retention\Business Development\Volume Opitimisation\Analytics\Data sets\Accessories\MASTER FILE - RETAIL DATA ACCOUNTS.xlsx"/>
            <parameter key="imported_cell_range" value="B1:I10485776"/>
            <list key="annotations"/>
            <parameter key="date_format" value="MMM d, yyyy h:mm:ss a z"/>
            <list key="data_set_meta_data_information">
              <parameter key="0" value="Region.true.polynominal.attribute"/>
              <parameter key="1" value="Market.true.polynominal.attribute"/>
              <parameter key="2" value="Brand.true.polynominal.attribute"/>
              <parameter key="3" value="Retailer Code.false.polynominal.attribute"/>
              <parameter key="4" value="Retailer Code 2.true.polynominal.attribute"/>
              <parameter key="5" value="Retailer Name.false.polynominal.attribute"/>
              <parameter key="6" value="Country.false.polynominal.attribute"/>
              <parameter key="7" value="Retailer Code (N).true.polynominal.attribute"/>
            </list>
            <parameter key="read_not_matching_values_as_missings" value="false"/>
          </operator>
          <operator activated="true" class="multiply" compatibility="9.0.002" expanded="true" height="103" name="Multiply (2)" width="90" x="179" y="442"/>
          <operator activated="true" class="map" compatibility="9.0.002" expanded="true" height="82" name="BRAND_MAP" width="90" x="313" y="595">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="Brand"/>
            <list key="value_mappings"/>
            <parameter key="replace_what" value="JAGUAR BASE MODEL"/>
            <parameter key="replace_by" value="Jaguar"/>
          </operator>
          <operator activated="true" class="map" compatibility="9.0.002" expanded="true" height="82" name="MARKET_MAP" width="90" x="447" y="595">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="Market"/>
            <list key="value_mappings"/>
            <parameter key="replace_what" value="UK DEALER ACCOUNTS"/>
            <parameter key="replace_by" value="UK Retailers"/>
          </operator>
          <operator activated="true" class="concurrency:join" compatibility="9.0.002" expanded="true" height="82" name="Join" width="90" x="782" y="595">
            <parameter key="join_type" value="left"/>
            <list key="key_attributes">
              <parameter key="Brand" value="Brand"/>
              <parameter key="Market" value="Market"/>
              <parameter key="Customer - Key" value="Retailer Code 2"/>
            </list>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="9.0.002" expanded="true" height="82" name="GEN CATEG (2)" width="90" x="916" y="595">
            <list key="function_descriptions">
              <parameter key="Category" value="if(missing(Brand),&quot;Revenue&quot;,&quot;Revenue&quot;)"/>
            </list>
          </operator>
          <operator activated="true" class="read_csv" compatibility="9.0.002" expanded="true" height="68" name="PART TO MLI" width="90" x="782" y="442">
            <parameter key="csv_file" value="C:\Users\sdima\Desktop\Analytics\Weekly Daily Sales\MAP\PART to MLI to COMPETITIVE - MAPPED.csv"/>
            <parameter key="skip_comments" value="true"/>
            <parameter key="decimal_character" value=","/>
            <parameter key="date_format" value="MMM d, yyyy h:mm:ss a z"/>
            <list key="annotations"/>
            <parameter key="encoding" value="windows-1252"/>
            <list key="data_set_meta_data_information">
              <parameter key="0" value="Part Number.true.polynominal.attribute"/>
              <parameter key="1" value="Part Desc\..true.polynominal.attribute"/>
              <parameter key="2" value="MLI Code.true.polynominal.attribute"/>
              <parameter key="3" value="MLI Class.true.polynominal.attribute"/>
              <parameter key="4" value="MLI Description.true.polynominal.attribute"/>
              <parameter key="5" value="Class Description.true.polynominal.attribute"/>
              <parameter key="6" value="Vehicle Range.true.polynominal.attribute"/>
              <parameter key="7" value="Vehicle Applicability.true.polynominal.attribute"/>
              <parameter key="8" value="Brand.true.polynominal.attribute"/>
              <parameter key="9" value="HIERARCHY BASE.false.polynominal.attribute"/>
              <parameter key="10" value="CLASS.false.polynominal.attribute"/>
              <parameter key="11" value="CLASS DESC.false.polynominal.attribute"/>
              <parameter key="12" value="PRODUCT GROUP DESC.false.polynominal.attribute"/>
              <parameter key="13" value="COMP / Captive.true.polynominal.attribute"/>
            </list>
            <parameter key="read_not_matching_values_as_missings" value="false"/>
          </operator>
          <operator activated="true" class="multiply" compatibility="9.0.002" expanded="true" height="103" name="Multiply" width="90" x="916" y="442"/>
          <operator activated="true" class="select_attributes" compatibility="9.0.002" expanded="true" height="82" name="Select Attributes (4)" width="90" x="1117" y="493">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attributes" value="MLI Description|MLI Code|MLI Class|Class Description|COMP / Captive|Brand"/>
          </operator>
          <operator activated="true" class="remove_duplicates" compatibility="9.0.002" expanded="true" height="103" name="Remove Duplicates (5)" width="90" x="1251" y="493"/>
          <operator activated="true" class="concurrency:join" compatibility="9.0.002" expanded="true" height="82" name="Join (2)" width="90" x="1318" y="595">
            <parameter key="join_type" value="left"/>
            <list key="key_attributes">
              <parameter key="MLI Code" value="MLI Code"/>
              <parameter key="Brand" value="Brand"/>
            </list>
          </operator>
          <operator activated="true" class="multiply" compatibility="9.0.002" expanded="true" height="82" name="SPLIT (2)" width="90" x="1452" y="595">
            <description align="center" color="transparent" colored="false" width="126">SHOW MLIs NOT MAPED</description>
          </operator>
          <operator activated="false" class="concurrency:join" compatibility="9.0.002" expanded="true" height="82" name="Join (4)" width="90" x="2859" y="391">
            <parameter key="join_type" value="outer"/>
            <list key="key_attributes">
              <parameter key="Brand_Name_Formatted" value="Brand"/>
              <parameter key="Claim_Country_Description" value="Market"/>
              <parameter key="Retailer Code 2" value="Customer - Key"/>
              <parameter key="MLI Code" value="MLI Code"/>
              <parameter key="Accepted_Date_Fix_nominal" value="Week"/>
            </list>
          </operator>
          <operator activated="false" class="select_attributes" compatibility="9.0.002" expanded="true" height="82" name="Select Attributes (2)" width="90" x="1586" y="799">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attributes" value="Brand|MLI Class|MLI Code|MLI Description|sum(NIV)"/>
          </operator>
          <operator activated="false" class="filter_examples" compatibility="9.0.002" expanded="true" height="103" name="Filter Examples" width="90" x="1720" y="799">
            <list key="filters_list">
              <parameter key="filters_entry_key" value="MLI Description.is_missing."/>
            </list>
          </operator>
          <operator activated="false" class="aggregate" compatibility="9.0.002" expanded="true" height="82" name="Aggregate (2)" width="90" x="1854" y="799">
            <list key="aggregation_attributes">
              <parameter key="sum(NIV)" value="sum"/>
            </list>
            <parameter key="group_by_attributes" value="MLI Description|MLI Code|MLI Class|Brand"/>
          </operator>
          <operator activated="false" class="remove_duplicates" compatibility="9.0.002" expanded="true" height="103" name="Remove Duplicates" width="90" x="1988" y="799">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attributes" value="MLI Description|MLI Code|MLI Class|Brand"/>
          </operator>
          <operator activated="false" class="write_excel" compatibility="9.0.002" expanded="true" height="82" name="QUALITY_RESULT" width="90" x="2122" y="799">
            <parameter key="excel_file" value="H:\MS&amp;S\Commercial &amp; Customer Retention\Business Development\Volume Opitimisation\Analytics\Data sets\PARTS\Warranty\Quality\2 Revenue MLIs not mapped.xlsx"/>
            <parameter key="date_format" value="dd/MM/yyyy"/>
          </operator>
          <operator activated="false" class="select_attributes" compatibility="9.0.002" expanded="true" height="82" name="Select Attributes (3)" width="90" x="1586" y="85">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attributes" value="Brand_Name_Formatted|MLI Description|sum(Parts_Cost_GBP_Budget_Rate)|PART_NUMBER"/>
          </operator>
          <operator activated="false" class="filter_examples" compatibility="9.0.002" expanded="true" height="103" name="Filter Examples (2)" width="90" x="1720" y="85">
            <list key="filters_list">
              <parameter key="filters_entry_key" value="MLI Description.is_missing."/>
            </list>
          </operator>
          <operator activated="false" class="aggregate" compatibility="9.0.002" expanded="true" height="82" name="Aggregate (3)" width="90" x="1854" y="85">
            <list key="aggregation_attributes">
              <parameter key="sum(Parts_Cost_GBP_Budget_Rate)" value="sum"/>
            </list>
            <parameter key="group_by_attributes" value="Brand_Name_Formatted|MLI Description|PART_NUMBER"/>
          </operator>
          <operator activated="false" class="remove_duplicates" compatibility="9.0.002" expanded="true" height="103" name="Remove Duplicates (2)" width="90" x="1988" y="85">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attributes" value="Part Number|MLI Description|Brand_Name_Formatted"/>
          </operator>
          <operator activated="false" class="write_excel" compatibility="9.0.002" expanded="true" height="82" name="QUALITY_RESULT (2)" width="90" x="2122" y="85">
            <parameter key="excel_file" value="H:\MS&amp;S\Commercial &amp; Customer Retention\Business Development\Volume Opitimisation\Analytics\Data sets\PARTS\Warranty\Quality\1.0 Warranty parts not mapped to MLI.xlsx"/>
            <parameter key="date_format" value="dd/MM/yyyy"/>
          </operator>
          <operator activated="true" class="read_csv" compatibility="9.0.002" expanded="true" height="68" name="Read CSV (2)" width="90" x="45" y="238">
            <parameter key="csv_file" value="C:\Users\sdima\Desktop\Analytics\Weekly Daily Sales\WTY\8. UK JAN JAG.csv"/>
            <parameter key="skip_comments" value="true"/>
            <parameter key="date_format" value="MM/dd/yy"/>
            <list key="annotations"/>
            <parameter key="encoding" value="windows-1252"/>
            <list key="data_set_meta_data_information">
              <parameter key="0" value="Brand_Name_Formatted.true.polynominal.attribute"/>
              <parameter key="1" value="Claim_Region_Description.true.polynominal.attribute"/>
              <parameter key="2" value="Claim_Sub_Region_Description.true.polynominal.attribute"/>
              <parameter key="3" value="Claim_Country_Description.true.polynominal.attribute"/>
              <parameter key="4" value="Claim_Retailer_CI_Code.true.polynominal.attribute"/>
              <parameter key="5" value="Claim_Retailer_CI_Code_Description.true.polynominal.attribute"/>
              <parameter key="6" value="Engineering_Part_Number.true.polynominal.attribute"/>
              <parameter key="7" value="Item_Material.true.polynominal.attribute"/>
              <parameter key="8" value="Item_Material_Description.true.polynominal.attribute"/>
              <parameter key="9" value="Accepted_Date_Fix.true.date.attribute"/>
              <parameter key="10" value="Parts_Cost_Local_Currency.true.real.attribute"/>
              <parameter key="11" value="Parts_Cost_GBP_Budget_Rate.true.real.attribute"/>
              <parameter key="12" value="Parts_Quantity.true.real.attribute"/>
            </list>
            <parameter key="read_not_matching_values_as_missings" value="false"/>
            <description align="center" color="transparent" colored="false" width="126">LOCLA WTY&lt;br/&gt;</description>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="9.0.002" expanded="true" height="82" name="PART_NUMBER" width="90" x="179" y="238">
            <list key="function_descriptions">
              <parameter key="PART_NUMBER" value="if(Brand_Name_Formatted == &quot;Jaguar&quot;,concat(&quot;02&quot;,Item_Material),concat(&quot;28&quot;,Item_Material))"/>
            </list>
          </operator>
          <operator activated="true" class="map" compatibility="9.0.002" expanded="true" height="82" name="BRAND MAP" width="90" x="313" y="238">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="Brand_Name_Formatted"/>
            <list key="value_mappings"/>
            <parameter key="replace_what" value="'Jaguar"/>
            <parameter key="replace_by" value="Jaguar"/>
          </operator>
          <operator activated="true" class="map" compatibility="9.0.002" expanded="true" height="82" name="MARKET MAP" width="90" x="447" y="238">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="Claim_Country_Description"/>
            <list key="value_mappings"/>
            <parameter key="replace_what" value="United Kingdom"/>
            <parameter key="replace_by" value="UK Retailers"/>
          </operator>
          <operator activated="true" class="concurrency:join" compatibility="9.0.002" expanded="true" height="82" name="Join (3)" width="90" x="782" y="238">
            <parameter key="join_type" value="left"/>
            <list key="key_attributes">
              <parameter key="Brand_Name_Formatted" value="Brand"/>
              <parameter key="Claim_Country_Description" value="Market"/>
              <parameter key="Claim_Retailer_CI_Code" value="Retailer Code (N)"/>
            </list>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="9.0.002" expanded="true" height="82" name="GEN CATEG" width="90" x="916" y="238">
            <list key="function_descriptions">
              <parameter key="Category" value="if(missing(Brand_Name_Formatted),&quot;Warranty&quot;,&quot;Warranty&quot;)"/>
            </list>
          </operator>
          <operator activated="true" class="select_attributes" compatibility="9.0.002" expanded="true" height="82" name="Select Attributes" width="90" x="1117" y="238">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attributes" value="Accepted_Date_Fix|Brand|Brand_Name_Formatted|Category|Claim_Country_Description|Claim_Region_Description|Claim_Retailer_CI_Code|Claim_Retailer_CI_Code_Description|Claim_Sub_Region_Description|Item_Material|Market|PART_NUMBER|Parts_Cost_GBP_Budget_Rate|Parts_Cost_Local_Currency|Parts_Quantity|Retailer Code (N)|Retailer Code 2|Region"/>
          </operator>
          <operator activated="true" class="concurrency:join" compatibility="9.0.002" expanded="true" height="82" name="JOIN MLI" width="90" x="1318" y="238">
            <parameter key="join_type" value="left"/>
            <list key="key_attributes">
              <parameter key="Brand_Name_Formatted" value="Brand"/>
              <parameter key="PART_NUMBER" value="Part Number"/>
            </list>
            <description align="center" color="transparent" colored="false" width="126">BRING MLI INFO</description>
          </operator>
          <operator activated="true" class="multiply" compatibility="9.0.002" expanded="true" height="82" name="SPLIT" width="90" x="1452" y="238">
            <description align="center" color="transparent" colored="false" width="126">SHOW PARTS NOT MAPED</description>
          </operator>
          <operator activated="true" class="date_to_nominal" compatibility="9.0.002" expanded="true" height="82" name="Date to Nominal" width="90" x="1586" y="238">
            <parameter key="attribute_name" value="Accepted_Date_Fix"/>
            <parameter key="date_format" value="ww.yyyy"/>
            <parameter key="locale" value="English (United Kingdom)"/>
            <parameter key="keep_old_attribute" value="true"/>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="9.0.002" expanded="true" height="82" name="MTH STR (2)" width="90" x="1720" y="238">
            <list key="function_descriptions">
              <parameter key="MONTH_DATE" value="date_str_custom(Accepted_Date_Fix,&quot;MMM, yyyy&quot;)"/>
            </list>
          </operator>
          <operator activated="true" class="aggregate" compatibility="9.0.002" expanded="true" height="82" name="Aggregate" width="90" x="1921" y="238">
            <list key="aggregation_attributes">
              <parameter key="Parts_Cost_GBP_Budget_Rate" value="sum"/>
            </list>
            <parameter key="group_by_attributes" value="Accepted_Date_Fix_nominal|Brand_Name_Formatted|COMP / Captive|Category|Claim_Country_Description|Claim_Retailer_CI_Code|Claim_Retailer_CI_Code_Description|Class Description|MLI Code|MLI Description|Region"/>
          </operator>
          <operator activated="true" class="remove_duplicates" compatibility="9.0.002" expanded="true" height="103" name="Remove Duplicates (3)" width="90" x="2055" y="238">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attributes" value="Brand_Name_Formatted|COMP / Captive|Claim_Country_Description|Claim_Retailer_CI_Code|Class Description|MLI Class|MLI Code|Retailer Code 2|MONTH_DATE|Accepted_Date_Fix_nominal"/>
          </operator>
          <operator activated="true" class="rename" compatibility="9.0.002" expanded="true" height="82" name="Rename" width="90" x="2189" y="238">
            <parameter key="old_name" value="Claim_Country_Description"/>
            <parameter key="new_name" value="Market"/>
            <list key="rename_additional_attributes">
              <parameter key="Claim_Retailer_CI_Code" value="Retailer Code (N)"/>
              <parameter key="sum(Parts_Cost_GBP_Budget_Rate)" value="Total"/>
              <parameter key="Accepted_Date_Fix_nominal" value="Week"/>
              <parameter key="Brand_Name_Formatted" value="Brand"/>
              <parameter key="Claim_Retailer_CI_Code_Description" value="Retailer Name"/>
            </list>
          </operator>
          <operator activated="true" class="order_attributes" compatibility="9.0.002" expanded="true" height="82" name="Reorder Attributes" width="90" x="2323" y="238">
            <parameter key="attribute_ordering" value="Region|Market|Brand|Retailer Code (N)|Retailer Name|Category|COMP / Captive|Class Description|MLI Code|MLI Description|Week|Total"/>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="9.0.002" expanded="true" height="82" name="MTH STR" width="90" x="1720" y="595">
            <list key="function_descriptions">
              <parameter key="MONTH_DATE" value="date_str_custom(Month,&quot;MMM, yyyy&quot;)"/>
            </list>
          </operator>
          <operator activated="true" class="aggregate" compatibility="9.0.002" expanded="true" height="82" name="AGG_MLI" width="90" x="1921" y="595">
            <list key="aggregation_attributes">
              <parameter key="NIV" value="sum"/>
            </list>
            <parameter key="group_by_attributes" value="Brand|COMP / Captive|Category|Class Description|Customer-Text|MLI Code|MLI Description|Market|Region|Retailer Code (N)|Week"/>
            <description align="center" color="transparent" colored="false" width="126">AGGREGATE&lt;br/&gt;MLI LEVEL</description>
          </operator>
          <operator activated="true" class="remove_duplicates" compatibility="9.0.002" expanded="true" height="103" name="Remove Duplicates (4)" width="90" x="2055" y="595">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attributes" value="Brand|COMP / Captive|Class Description|MLI Code|Market|Week|MLI Class|MONTH_DATE|Customer - Key"/>
          </operator>
          <operator activated="true" class="rename" compatibility="9.0.002" expanded="true" height="82" name="Rename (2)" width="90" x="2189" y="595">
            <parameter key="old_name" value="sum(NIV)"/>
            <parameter key="new_name" value="Total"/>
            <list key="rename_additional_attributes">
              <parameter key="Customer-Text" value="Retailer Name"/>
            </list>
          </operator>
          <operator activated="true" class="order_attributes" compatibility="9.0.002" expanded="true" height="82" name="Reorder Attributes (2)" width="90" x="2323" y="595">
            <parameter key="attribute_ordering" value="Region|Market|Brand|Retailer Name|Category|COMP / Captive|Class Description|MLI Code|MLI Description|Week|Total"/>
          </operator>
          <operator activated="true" class="append" compatibility="9.0.002" expanded="true" height="103" name="Append" width="90" x="2591" y="391"/>
          <connect from_op="WARRANTY GOOGLE" from_port="file" to_op="Read CSV" to_port="file"/>
          <connect from_op="Read CSV" from_port="output" to_op="Write CSV" to_port="input"/>
          <connect from_op="WTY READ" from_port="output" to_op="UK MARKET" to_port="example set input"/>
          <connect from_op="UK MARKET" from_port="example set output" to_op="Write CSV (2)" to_port="input"/>
          <connect from_op="REV READ" from_port="output" to_op="BRAND_MAP" to_port="example set input"/>
          <connect from_op="RETAIL" from_port="output" to_op="Multiply (2)" to_port="input"/>
          <connect from_op="Multiply (2)" from_port="output 1" to_op="Join (3)" to_port="right"/>
          <connect from_op="Multiply (2)" from_port="output 2" to_op="Join" to_port="right"/>
          <connect from_op="BRAND_MAP" from_port="example set output" to_op="MARKET_MAP" to_port="example set input"/>
          <connect from_op="MARKET_MAP" from_port="example set output" to_op="Join" to_port="left"/>
          <connect from_op="Join" from_port="join" to_op="GEN CATEG (2)" to_port="example set input"/>
          <connect from_op="GEN CATEG (2)" from_port="example set output" to_op="Join (2)" to_port="left"/>
          <connect from_op="PART TO MLI" from_port="output" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_op="JOIN MLI" to_port="right"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Select Attributes (4)" to_port="example set input"/>
          <connect from_op="Select Attributes (4)" from_port="example set output" to_op="Remove Duplicates (5)" to_port="example set input"/>
          <connect from_op="Remove Duplicates (5)" from_port="example set output" to_op="Join (2)" to_port="right"/>
          <connect from_op="Join (2)" from_port="join" to_op="SPLIT (2)" to_port="input"/>
          <connect from_op="SPLIT (2)" from_port="output 1" to_op="MTH STR" to_port="example set input"/>
          <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Filter Examples" to_port="example set input"/>
          <connect from_op="Filter Examples" from_port="example set output" to_op="Aggregate (2)" to_port="example set input"/>
          <connect from_op="Aggregate (2)" from_port="example set output" to_op="Remove Duplicates" to_port="example set input"/>
          <connect from_op="Remove Duplicates" from_port="example set output" to_op="QUALITY_RESULT" to_port="input"/>
          <connect from_op="Select Attributes (3)" from_port="example set output" to_op="Filter Examples (2)" to_port="example set input"/>
          <connect from_op="Filter Examples (2)" from_port="example set output" to_op="Aggregate (3)" to_port="example set input"/>
          <connect from_op="Aggregate (3)" from_port="example set output" to_op="Remove Duplicates (2)" to_port="example set input"/>
          <connect from_op="Remove Duplicates (2)" from_port="example set output" to_op="QUALITY_RESULT (2)" to_port="input"/>
          <connect from_op="Read CSV (2)" from_port="output" to_op="PART_NUMBER" to_port="example set input"/>
          <connect from_op="PART_NUMBER" from_port="example set output" to_op="BRAND MAP" to_port="example set input"/>
          <connect from_op="BRAND MAP" from_port="example set output" to_op="MARKET MAP" to_port="example set input"/>
          <connect from_op="MARKET MAP" from_port="example set output" to_op="Join (3)" to_port="left"/>
          <connect from_op="Join (3)" from_port="join" to_op="GEN CATEG" to_port="example set input"/>
          <connect from_op="GEN CATEG" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="JOIN MLI" to_port="left"/>
          <connect from_op="JOIN MLI" from_port="join" to_op="SPLIT" to_port="input"/>
          <connect from_op="SPLIT" from_port="output 1" to_op="Date to Nominal" to_port="example set input"/>
          <connect from_op="Date to Nominal" from_port="example set output" to_op="MTH STR (2)" to_port="example set input"/>
          <connect from_op="MTH STR (2)" from_port="example set output" to_op="Aggregate" to_port="example set input"/>
          <connect from_op="Aggregate" from_port="example set output" to_op="Remove Duplicates (3)" to_port="example set input"/>
          <connect from_op="Remove Duplicates (3)" from_port="example set output" to_op="Rename" to_port="example set input"/>
          <connect from_op="Rename" from_port="example set output" to_op="Reorder Attributes" to_port="example set input"/>
          <connect from_op="Reorder Attributes" from_port="example set output" to_op="Append" to_port="example set 1"/>
          <connect from_op="MTH STR" from_port="example set output" to_op="AGG_MLI" to_port="example set input"/>
          <connect from_op="AGG_MLI" from_port="example set output" to_op="Remove Duplicates (4)" to_port="example set input"/>
          <connect from_op="Remove Duplicates (4)" from_port="example set output" to_op="Rename (2)" to_port="example set input"/>
          <connect from_op="Rename (2)" from_port="example set output" to_op="Reorder Attributes (2)" to_port="example set input"/>
          <connect from_op="Reorder Attributes (2)" from_port="example set output" to_op="Append" to_port="example set 2"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <description align="center" color="yellow" colored="false" height="104" resized="false" width="180" x="1902" y="484">Here is where I am losing &amp;quot;Retailer Code N&amp;quot; field</description>
        </process>
      </operator>
    </process>

  • sgenzersgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager
    Hi @sdima, OK, thanks for the XML. The CSVs would also help a lot, as then I could run your process. Nevertheless, I just want to make sure I understand your question. Is this the problem?



    because when I look at the parameters, "Retailer Code N" is shown as polynominal, not integer (that's what all the cubes mean):



    Scott
  • NaGorhamNaGorham Member Posts: 1 Contributor I
    edited March 2019
    The Aggregate operator has a parameter called "ignore missings" that is set to true by default. If you set it to false, do you get the result you expect?
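
    For anyone who wants to try this, here is a minimal sketch of where the Aggregate operator's missing-value setting would sit in the process XML. It reuses the "Aggregate" operator from the process posted above. The key name `ignore_missings` is an assumption about how the "ignore missings" checkbox is written to XML, so compare it with an export from your own Studio version before relying on it.

    ```xml
    <operator activated="true" class="aggregate" compatibility="9.0.002" expanded="true" height="82" name="Aggregate" width="90" x="1921" y="238">
      <list key="aggregation_attributes">
        <parameter key="Parts_Cost_GBP_Budget_Rate" value="sum"/>
      </list>
      <parameter key="group_by_attributes" value="Accepted_Date_Fix_nominal|Brand_Name_Formatted|COMP / Captive|Category|Claim_Country_Description|Claim_Retailer_CI_Code|Claim_Retailer_CI_Code_Description|Class Description|MLI Code|MLI Description|Region"/>
      <!-- Assumed key: switches off the default behaviour of skipping missing values -->
      <parameter key="ignore_missings" value="false"/>
    </operator>
    ```

    With the value set to false, a group that contains a missing Parts_Cost_GBP_Budget_Rate should come back with a missing sum instead of a sum over the remaining values, which makes it easier to spot where the gaps are.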
