Looping through a filter and save every iteration
Hello,
I already searched the forum but the given processes seem to be outdated and not compatible with the most recent version of RM.
I have a big dataset with sales data for many products, many supermarkets, and many weeks.
I would like to loop through a filter in such a way that, for every value of the column "UPC" (= Universal Product Code), a new ExampleSet is stored.
Is that possible?
My process looks like this:
<div><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="retrieve" compatibility="9.5.000" expanded="true" height="68" name="Retrieve FINALJOIN" width="90" x="112" y="34"><br> <parameter key="repository_entry" value="../../data/Join/FINALJOIN"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="select_attributes" compatibility="9.5.000" expanded="true" height="82" name="Select Attributes" width="90" x="246" y="34"><br> <parameter key="attribute_filter_type" value="subset"/><br> <parameter key="attribute" value=""/><br> <parameter key="attributes" value="IRI_KEY|L1|L2|BRAND|UPC|WEEK|PRICE_PER_UNIT"/><br> <parameter key="use_except_expression" value="false"/><br> <parameter key="value_type" value="attribute_value"/><br> <parameter key="use_value_type_exception" value="false"/><br> <parameter key="except_value_type" value="time"/><br> <parameter key="block_type" value="attribute_block"/><br> <parameter key="use_block_type_exception" value="false"/><br> <parameter key="except_block_type" value="value_matrix_row_start"/><br> <parameter key="invert_selection" value="false"/><br> <parameter key="include_special_attributes" value="false"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="generate_attributes" compatibility="9.5.000" expanded="true" height="82" name="Generate Attributes" width="90" x="380" y="34"><br> <list key="function_descriptions"><br> <parameter key="IRI2" value=""Markt"+IRI_KEY"/><br> </list><br> <parameter key="keep_all" value="true"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="filter_examples" compatibility="9.5.000" expanded="true" height="103" name="Ketchup" width="90" x="514" y="34"><br> <parameter key="parameter_expression" 
value=""/><br> <parameter key="condition_class" value="custom_filters"/><br> <parameter key="invert_filter" value="false"/><br> <list key="filters_list"><br> <parameter key="filters_entry_key" value="L2.equals.KETCHUP"/><br> </list><br> <parameter key="filters_logic_and" value="true"/><br> <parameter key="filters_check_metadata" value="true"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="filter_examples" compatibility="9.5.000" expanded="true" height="103" name="UPC" width="90" x="648" y="34"><br> <parameter key="parameter_expression" value=""/><br> <parameter key="condition_class" value="custom_filters"/><br> <parameter key="invert_filter" value="false"/><br> <list key="filters_list"><br> <parameter key="filters_entry_key" value="UPC.equals.00-01-13000-00121"/><br> </list><br> <parameter key="filters_logic_and" value="true"/><br> <parameter key="filters_check_metadata" value="true"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="retrieve" compatibility="9.5.000" expanded="true" height="68" name="Retrieve A1" width="90" x="112" y="238"><br> <parameter key="repository_entry" value="../../data/prei_vorwoche/A1"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="format_numbers" compatibility="9.5.000" expanded="true" height="82" name="Format Numbers" width="90" x="246" y="238"><br> <parameter key="attribute_filter_type" value="single"/><br> <parameter key="attribute" value="PRICE_PER_UNIT"/><br> <parameter key="attributes" value=""/><br> <parameter key="use_except_expression" value="false"/><br> <parameter key="value_type" value="numeric"/><br> <parameter key="use_value_type_exception" value="false"/><br> <parameter key="except_value_type" value="real"/><br> <parameter key="block_type" 
value="value_series"/><br> <parameter key="use_block_type_exception" value="false"/><br> <parameter key="except_block_type" value="value_series_end"/><br> <parameter key="invert_selection" value="false"/><br> <parameter key="include_special_attributes" value="false"/><br> <parameter key="format_type" value="currency"/><br> <parameter key="locale" value="English (United States)"/><br> <parameter key="use_grouping" value="false"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="write_csv" compatibility="9.5.000" expanded="true" height="82" name="Write CSV" width="90" x="380" y="238"><br> <parameter key="csv_file" value="D:\Master_Arbeit\A1.csv"/><br> <parameter key="column_separator" value=";"/><br> <parameter key="write_attribute_names" value="true"/><br> <parameter key="quote_nominal_values" value="true"/><br> <parameter key="format_date_attributes" value="true"/><br> <parameter key="append_to_file" value="false"/><br> <parameter key="encoding" value="SYSTEM"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="read_csv" compatibility="9.5.000" expanded="true" height="68" name="Read CSV" width="90" x="45" y="340"><br> <parameter key="csv_file" value="D:\Master_Arbeit\A1.csv"/><br> <parameter key="column_separators" value=";"/><br> <parameter key="trim_lines" value="false"/><br> <parameter key="use_quotes" value="true"/><br> <parameter key="quotes_character" value="""/><br> <parameter key="escape_character" value="\"/><br> <parameter key="skip_comments" value="false"/><br> <parameter key="comment_characters" value="#"/><br> <parameter key="starting_row" value="1"/><br> <parameter key="parse_numbers" value="true"/><br> <parameter key="decimal_character" value="."/><br> <parameter key="grouped_digits" value="false"/><br> <parameter key="grouping_character" value=","/><br> <parameter 
key="infinity_representation" value=""/><br> <parameter key="date_format" value=""/><br> <parameter key="first_row_as_names" value="true"/><br> <list key="annotations"/><br> <parameter key="time_zone" value="SYSTEM"/><br> <parameter key="locale" value="English (United States)"/><br> <parameter key="encoding" value="SYSTEM"/><br> <parameter key="read_all_values_as_polynominal" value="false"/><br> <list key="data_set_meta_data_information"/><br> <parameter key="read_not_matching_values_as_missings" value="true"/><br> <parameter key="datamanagement" value="double_array"/><br> <parameter key="data_management" value="auto"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="store" compatibility="9.5.000" expanded="true" height="68" name="Store (2)" width="90" x="179" y="340"><br> <parameter key="repository_entry" value="../../data/prei_vorwoche/A2"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="retrieve" compatibility="9.5.000" expanded="true" height="68" name="Retrieve" width="90" x="313" y="340"><br> <parameter key="repository_entry" value="../../data/prei_vorwoche/A2"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="filter_examples" compatibility="9.5.000" expanded="true" height="103" name="MARKTFILTER" width="90" x="447" y="340"><br> <parameter key="parameter_expression" value=""/><br> <parameter key="condition_class" value="custom_filters"/><br> <parameter key="invert_filter" value="false"/><br> <list key="filters_list"><br> <parameter key="filters_entry_key" value="IRI2.equals.Markt1078045"/><br> </list><br> <parameter key="filters_logic_and" value="true"/><br> <parameter key="filters_check_metadata" value="true"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process 
version="9.5.000"><br> <operator activated="true" class="store" compatibility="9.5.000" expanded="true" height="68" name="Store (3)" width="90" x="581" y="340"><br> <parameter key="repository_entry" value="../../data/prei_vorwoche/Dollar"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="time_series:lag_series" compatibility="9.5.000" expanded="true" height="82" name="Lag" width="90" x="112" y="442"><br> <list key="attributes"><br> <parameter key="PRICE_PER_UNIT" value="1"/><br> </list><br> <parameter key="overwrite_attributes" value="false"/><br> <parameter key="extend_exampleset" value="false"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="generate_attributes" compatibility="9.5.000" expanded="true" height="82" name="Generate Attributes (4)" width="90" x="246" y="493"><br> <list key="function_descriptions"><br> <parameter key="UP" value="cut(PRICE_PER_UNIT,1,4)"/><br> <parameter key="UP-1" value="cut([PRICE_PER_UNIT-1],1,4)"/><br> </list><br> <parameter key="keep_all" value="true"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="store" compatibility="9.5.000" expanded="true" height="68" name="Store (5)" width="90" x="380" y="493"><br> <parameter key="repository_entry" value="A3"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="write_excel" compatibility="9.5.000" expanded="true" height="103" name="Write Excel" width="90" x="514" y="493"><br> <parameter key="excel_file" value="D:\Master_Arbeit\A3.xlsx"/><br> <parameter key="file_format" value="xlsx"/><br> <enumeration key="sheet_names"/><br> <parameter key="sheet_name" value="RapidMiner Data"/><br> <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/><br> 
<parameter key="number_format" value="#.0"/><br> <parameter key="encoding" value="SYSTEM"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="read_excel" compatibility="9.5.000" expanded="true" height="68" name="Read Excel" width="90" x="112" y="595"><br> <parameter key="excel_file" value="D:\Master_Arbeit\A3.xlsx"/><br> <parameter key="sheet_selection" value="sheet number"/><br> <parameter key="sheet_number" value="1"/><br> <parameter key="imported_cell_range" value="A1"/><br> <parameter key="encoding" value="SYSTEM"/><br> <parameter key="first_row_as_names" value="true"/><br> <list key="annotations"/><br> <parameter key="date_format" value=""/><br> <parameter key="time_zone" value="SYSTEM"/><br> <parameter key="locale" value="English (United States)"/><br> <parameter key="read_all_values_as_polynominal" value="false"/><br> <list key="data_set_meta_data_information"/><br> <parameter key="read_not_matching_values_as_missings" value="true"/><br> <parameter key="datamanagement" value="double_array"/><br> <parameter key="data_management" value="auto"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="generate_attributes" compatibility="9.5.000" expanded="true" height="82" name="Generate Attributes (3)" width="90" x="313" y="595"><br> <list key="function_descriptions"><br> <parameter key="DeltaPrice" value="[UP-1]-UP"/><br> </list><br> <parameter key="keep_all" value="true"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="store" compatibility="9.5.000" expanded="true" height="68" name="Store (4)" width="90" x="447" y="595"><br> <parameter key="repository_entry" value="../../data/prei_vorwoche/dollar_lag"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> 
<operator activated="true" class="sort" compatibility="9.5.000" expanded="true" height="82" name="Sort by Week" width="90" x="581" y="187"><br> <parameter key="attribute_name" value="WEEK"/><br> <parameter key="sorting_direction" value="increasing"/><br> </operator><br></process><br><?xml version="1.0" encoding="UTF-8"?><process version="9.5.000"><br> <operator activated="true" class="store" compatibility="9.5.000" expanded="true" height="68" name="Store" width="90" x="715" y="187"><br> <parameter key="repository_entry" value="../../data/prei_vorwoche/A1"/><br> </operator><br></process><br><br></div>
Best Answers
lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
Hi @MPB_,
"I would like to loop thorugh a filter in a way that for every attribute of the column "UPC" (=Universal Product Code) a new exampleset is stored."
Your request is not clear:
Do you mean every value of the column "UPC"?
Moreover, your XML process is broken; I cannot import it into RapidMiner. Can you please export your process via File -> Export Process
and share the file here in the community?
Can you also share your data?
Regards,
Lionel
tftemme Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, RMResearcher, Member Posts: 164 RM Research
Hi @MPB_
Just to add to this: if Lionel is right and you want to loop over every value of the column "UPC", you can use the operator Group Into Collection from the Operator Toolbox Extension (install it from the Marketplace). This creates a collection with one ExampleSet for each value of the column "UPC". You can then use Loop Collection to iterate over the collection and store each ExampleSet individually.
Best regards
Fabian
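
For readers who want to see the logic spelled out, here is a minimal sketch in plain Python of the same idea: first split the rows into one group per "UPC" value (the part Group Into Collection plays), then loop over the groups and save each one separately (the part Loop Collection plus a store step plays). This is not RapidMiner code and not the process from this thread. The function names, the `UPC_` file prefix, the second UPC value, and the WEEK and PRICE_PER_UNIT values in the sample rows are all made up for illustration.

```python
import csv
from collections import defaultdict
from pathlib import Path


def group_rows_by_column(rows, column):
    """Split a list of row dicts into one list per distinct value of `column`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)
    return dict(groups)


def store_groups(groups, out_dir, prefix="UPC_"):
    """Write each group to its own CSV file and return the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for value, rows in groups.items():
        # Hypothetical naming scheme: one file per distinct value.
        path = out_dir / f"{prefix}{value}.csv"
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        written.append(path)
    return written


# Illustrative sample only: apart from the first UPC, these values are invented.
SAMPLE_ROWS = [
    {"UPC": "00-01-13000-00121", "WEEK": "1", "PRICE_PER_UNIT": "1.99"},
    {"UPC": "00-01-13000-00999", "WEEK": "1", "PRICE_PER_UNIT": "2.49"},
    {"UPC": "00-01-13000-00121", "WEEK": "2", "PRICE_PER_UNIT": "1.89"},
]
```

Calling `store_groups(group_rows_by_column(SAMPLE_ROWS, "UPC"), "some_folder")` would write one CSV per UPC. Grouping once and then looping avoids re-filtering the full dataset for every UPC value, which matters when the dataset is large.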
Answers
Thank you so much for helping me. In the meantime I changed my process a little, and with your suggestions everything now works fine.
Thank you again and have a very nice week,
Best regards