"Creating association rules: problem with the FP-Growth operator"

mzharov Member Posts: 5 Contributor I
edited June 2019 in Help

Hi! I have a problem processing an Excel file when creating association rules. Could you tell me what I am doing wrong? The error message I get is "The exampleset contains non-nominal attribute "prod.date" which is not allowed to fp-growth". I am working with data from an Excel file, but I converted it to .txt in order to attach it.

<?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
<operator activated="true" class="retrieve" compatibility="8.0.001" expanded="true" height="68" name="Retrieve рабочий файл для проверки (2)" width="90" x="45" y="34">
<parameter key="repository_entry" value="//рабочий файл/первая попытка/рабочий файл для проверки"/>
</operator>
</process>
<?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
<operator activated="true" class="numerical_to_binominal" compatibility="8.0.001" expanded="true" height="82" name="Numerical to Binominal" width="90" x="246" y="34">
<parameter key="attribute_filter_type" value="all"/>
<parameter key="attribute" value=""/>
<parameter key="attributes" value=""/>
<parameter key="use_except_expression" value="false"/>
<parameter key="value_type" value="numeric"/>
<parameter key="use_value_type_exception" value="false"/>
<parameter key="except_value_type" value="real"/>
<parameter key="block_type" value="value_series"/>
<parameter key="use_block_type_exception" value="false"/>
<parameter key="except_block_type" value="value_series_end"/>
<parameter key="invert_selection" value="false"/>
<parameter key="include_special_attributes" value="false"/>
<parameter key="min" value="0.0"/>
<parameter key="max" value="0.0"/>
</operator>
</process>
<?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
<operator activated="true" class="fp_growth" compatibility="8.0.001" expanded="true" height="82" name="FP-Growth" width="90" x="380" y="34">
<parameter key="find_min_number_of_itemsets" value="true"/>
<parameter key="min_number_of_itemsets" value="100"/>
<parameter key="max_number_of_retries" value="15"/>
<parameter key="min_support" value="0.95"/>
<parameter key="max_items" value="-1"/>
<parameter key="keep_example_set" value="false"/>
</operator>
</process>
<?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
<operator activated="true" class="create_association_rules" compatibility="8.0.001" expanded="true" height="82" name="Create Association Rules" width="90" x="514" y="34">
<parameter key="criterion" value="confidence"/>
<parameter key="min_confidence" value="0.8"/>
<parameter key="min_criterion_value" value="0.8"/>
<parameter key="gain_theta" value="2.0"/>
<parameter key="laplace_k" value="1.0"/>
</operator>
</process>

 

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @mzharov can you just use a Select Attributes operator to remove 'prod.date'? Then see if it works.
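
    A minimal sketch of that suggestion, assuming the date attribute is named exactly prod.date (as in the error message) and that the operator is placed between Retrieve and the conversion operator. The operator name and the x/y coordinates are illustrative only. With the "single" filter type and invert_selection set to true, Select Attributes keeps every attribute except the named one:

    ```xml
    <!-- Sketch only: drops the assumed attribute "prod.date" and passes all other attributes through -->
    <operator activated="true" class="select_attributes" compatibility="8.0.001" expanded="true" height="82" name="Select Attributes" width="90" x="179" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="prod.date"/>
    <parameter key="invert_selection" value="true"/>
    </operator>
    ```

    The output port "example set output" of this operator would then feed the next operator in the chain.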

  • mzharov Member Posts: 5 Contributor I

    Hi Thomas

    I have removed all dates from my file. I ran the same process (Retrieve, Nominal to Binominal, FP-Growth, Create Association Rules) and got a similar error: "The exampleset contains non-nominal attribute "ID" which is not allowed to fp-growth". ID is just a unique row number in my file. Could you tell me where my mistake is? Thanks in advance.

    A screenshot of the error and a sample of the input file are attached.

     

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    I don't open DOCX files as a rule, so you'd have to post a screenshot. Also, your XML is corrupt. You have to post the XML correctly, as described in this KB article: https://community.rapidminer.com/t5/RapidMiner-Studio-Knowledge-Base/How-can-I-share-processes-without-RapidMiner-Server/ta-p/37047

  • mzharov Member Posts: 5 Contributor I

    Hi Thomas,

    OK, once again. I created new XML and attached the correct files. Please check. Thanks in advance.

    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <operator activated="true" class="retrieve" compatibility="8.1.001" expanded="true" height="68" name="Retrieve" width="90" x="45" y="85">
    <parameter key="repository_entry" value="//рабочий файл/первая попытка/рабочий файл"/>
    </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <operator activated="true" class="nominal_to_binominal" compatibility="8.1.001" expanded="true" height="103" name="Nominal to Binominal" width="90" x="179" y="85">
    <parameter key="return_preprocessing_model" value="false"/>
    <parameter key="create_view" value="false"/>
    <parameter key="attribute_filter_type" value="all"/>
    <parameter key="attribute" value=""/>
    <parameter key="attributes" value=""/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="nominal"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="file_path"/>
    <parameter key="block_type" value="single_value"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="single_value"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    <parameter key="transform_binominal" value="false"/>
    <parameter key="use_underscore_in_name" value="false"/>
    </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <operator activated="true" class="fp_growth" compatibility="8.1.001" expanded="true" height="82" name="FP-Growth" width="90" x="313" y="85">
    <parameter key="find_min_number_of_itemsets" value="true"/>
    <parameter key="min_number_of_itemsets" value="100"/>
    <parameter key="max_number_of_retries" value="15"/>
    <parameter key="min_support" value="0.95"/>
    <parameter key="max_items" value="-1"/>
    <parameter key="keep_example_set" value="false"/>
    </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <operator activated="true" class="create_association_rules" compatibility="8.1.001" expanded="true" height="82" name="Create Association Rules" width="90" x="447" y="85">
    <parameter key="criterion" value="confidence"/>
    <parameter key="min_confidence" value="0.8"/>
    <parameter key="min_criterion_value" value="0.8"/>
    <parameter key="gain_theta" value="2.0"/>
    <parameter key="laplace_k" value="1.0"/>
    </operator>
    </process>
  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @mzharov the XML is still corrupted. You have to open the XML view and copy it from there. 

     

    Also as a rule, I don't open PPTX files either. Pretty much nothing from MSFT.

  • mzharov Member Posts: 5 Contributor I

    Hi Thomas,

    Actually, I do not have an XML option under View > Show Panel. I turned on the XML panel at the bottom of the screen, so the XML from that panel is attached. Please check it. Thanks. Konstantin

    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.1.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="8.1.001" expanded="true" height="68" name="Retrieve" width="90" x="45" y="85">
    <parameter key="repository_entry" value="//рабочий файл/первая попытка/рабочий файл"/>
    </operator>
    <operator activated="true" class="nominal_to_binominal" compatibility="8.1.001" expanded="true" height="103" name="Nominal to Binominal" width="90" x="179" y="85"/>
    <operator activated="true" class="fp_growth" compatibility="8.1.001" expanded="true" height="82" name="FP-Growth" width="90" x="313" y="85"/>
    <operator activated="true" class="create_association_rules" compatibility="8.1.001" expanded="true" height="82" name="Create Association Rules" width="90" x="447" y="85"/>
    <connect from_op="Retrieve" from_port="output" to_op="Nominal to Binominal" to_port="example set input"/>
    <connect from_op="Nominal to Binominal" from_port="example set output" to_op="FP-Growth" to_port="example set"/>
    <connect from_op="FP-Growth" from_port="frequent sets" to_op="Create Association Rules" to_port="item sets"/>
    <connect from_op="Create Association Rules" from_port="rules" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @mzharov OK, the XML works now, but your data file looks rather strange. Is it supposed to look this way, meaning are different data types mixed into the same attribute column? It could also be my encoding.

     

    2018-03-21_6-29-26.png

     

     

     

  • mzharov Member Posts: 5 Contributor I

    Hi Thomas,

    I have fixed my file, but I still get a similar error. Is there something wrong with the data format? More information about my file is at the bottom of this post.

    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.1.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="8.1.001" expanded="true" height="68" name="Retrieve" width="90" x="112" y="34">
    <parameter key="repository_entry" value="//рабочий файл/рабочий файл для проверки2103"/>
    </operator>
    <operator activated="true" class="nominal_to_binominal" compatibility="8.1.001" expanded="true" height="103" name="Nominal to Binominal" width="90" x="313" y="34"/>
    <operator activated="true" class="fp_growth" compatibility="8.1.001" expanded="true" height="82" name="FP-Growth" width="90" x="447" y="34"/>
    <operator activated="true" class="create_association_rules" compatibility="8.1.001" expanded="true" height="82" name="Create Association Rules" width="90" x="648" y="34"/>
    <connect from_op="Retrieve" from_port="output" to_op="Nominal to Binominal" to_port="example set input"/>
    <connect from_op="Nominal to Binominal" from_port="example set output" to_op="FP-Growth" to_port="example set"/>
    <connect from_op="FP-Growth" from_port="frequent sets" to_op="Create Association Rules" to_port="item sets"/>
    <connect from_op="Create Association Rules" from_port="rules" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>

     

    A sample of the data I am loading is below.

    Row  ID   code          material name  stock    am.per serie  stock date
    1    1.0  1.01010204E8  Амлодипин      78.789   8.789         Wed May 01 00:00:00 MSK 2013
    2    2.0  1.01010204E8  Амлодипин      70.0     25.0          Wed May 01 00:00:00 MSK 2013
    3    3.0  1.01010204E8  Амлодипин      70.0     45.0          Wed May 01 00:00:00 MSK 2013
    4    4.0  1.01010286E8  Акрилжелтый    83.563   83.563        Wed May 01 00:00:00 MSK 2013
    5    5.0  1.00000001E8  АкрилРINK      13.542   13.542        Wed May 01 00:00:00 MSK 2013
    6    6.0  1.01010441E8  Акрил          9.888    9.888         Wed May 01 00:00:00 MSK 2013
    7    7.0  1.01010273E8  SАденозилм     577.251  77.251        Wed May 01 00:00:00 MSK 2013
    8    8.0  1.01010273E8  SАденозилм     500.0    250.0         Wed May 01 00:00:00 MSK 2013
    9    9.0  1.01010273E8  SАденозилм     500.0    250.0         Wed May 01 00:00:00 MSK 2013
  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @mzharov OK, I see what's going on here. You have two attribute columns with real numbers in them. In order to use the FP-Growth operator, you must convert all the data into binominals (true/false). The conversion is not happening for the numerical values because the Nominal to Binominal operator only handles nominal attributes; it can't figure out how to transform 70 to true or false.

     

    You would have to figure out how to manipulate the numericals into true/false OR remove them from the data set. 
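
    One way this could look, as a rough sketch rather than a tested process: it assumes the attribute names from the sample above (ID, code, stock date, stock, am.per serie, material name) and assumes that ID, code and stock date carry no item information and can be dropped. Select Attributes removes those three, Numerical to Binominal turns the remaining numeric columns into true/false, and Nominal to Binominal then handles material name. The added operator names and all coordinates are illustrative.

    ```xml
    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <operator activated="true" class="process" compatibility="8.1.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="8.1.001" expanded="true" height="68" name="Retrieve" width="90" x="45" y="34">
    <parameter key="repository_entry" value="//рабочий файл/рабочий файл для проверки2103"/>
    </operator>
    <!-- Assumption: ID, code and stock date are not items, so they are removed before mining -->
    <operator activated="true" class="select_attributes" compatibility="8.1.001" expanded="true" height="82" name="Select Attributes" width="90" x="179" y="34">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attributes" value="ID|code|stock date"/>
    <parameter key="invert_selection" value="true"/>
    </operator>
    <!-- Values inside [min, max] become false, all others true; 0.0/0.0 is a placeholder threshold -->
    <operator activated="true" class="numerical_to_binominal" compatibility="8.1.001" expanded="true" height="82" name="Numerical to Binominal" width="90" x="313" y="34">
    <parameter key="min" value="0.0"/>
    <parameter key="max" value="0.0"/>
    </operator>
    <operator activated="true" class="nominal_to_binominal" compatibility="8.1.001" expanded="true" height="103" name="Nominal to Binominal" width="90" x="447" y="34"/>
    <operator activated="true" class="fp_growth" compatibility="8.1.001" expanded="true" height="82" name="FP-Growth" width="90" x="581" y="34"/>
    <operator activated="true" class="create_association_rules" compatibility="8.1.001" expanded="true" height="82" name="Create Association Rules" width="90" x="715" y="34"/>
    <connect from_op="Retrieve" from_port="output" to_op="Select Attributes" to_port="example set input"/>
    <connect from_op="Select Attributes" from_port="example set output" to_op="Numerical to Binominal" to_port="example set input"/>
    <connect from_op="Numerical to Binominal" from_port="example set output" to_op="Nominal to Binominal" to_port="example set input"/>
    <connect from_op="Nominal to Binominal" from_port="example set output" to_op="FP-Growth" to_port="example set"/>
    <connect from_op="FP-Growth" from_port="frequent sets" to_op="Create Association Rules" to_port="item sets"/>
    <connect from_op="Create Association Rules" from_port="rules" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
    ```

    With min and max both at 0.0, every non-zero stock value maps to true, which is unlikely to yield interesting rules on this sample; a meaningful threshold has to come from the data itself. If the numeric columns are not needed as items at all, adding them to the Select Attributes list and dropping Numerical to Binominal would be the simpler variant.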
