Delete Examples with 2 missing attributes

t_liebe Member Posts: 14 Contributor II
edited December 2018 in Help



I know several ways to delete missing values from a column. However, I only want to remove the Examples in which both attributes are missing:


Size    Item 1    Item 2
1       ?         milk
2       cookie    milk
2       ?         ?
2       cookie    chocolate
2       cookie    crackers
2       cookie    ?
2       cookie    raspberries


After that, I would like to combine the two tables to find out how often cookie and milk occur together, both as a percentage and as an absolute frequency.

How can I use FP-Growth for this?
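For orientation: FP-Growth reports the support of each frequent item set, that is, the share of examples that contain all items of the set. The two numbers asked about here can be sketched by hand in plain Python, outside RapidMiner. The rows are copied from the table above. Treating "?" as "item absent" and excluding the row with both values missing are assumptions made for this illustration.

```python
# Transactions taken from the table above; missing values ("?") are left out.
transactions = [
    {"milk"},
    {"cookie", "milk"},
    set(),                      # both attributes missing
    {"cookie", "chocolate"},
    {"cookie", "crackers"},
    {"cookie"},
    {"cookie", "raspberries"},
]

itemset = {"cookie", "milk"}

# Assumption: rows with no items at all are dropped before counting.
non_empty = [t for t in transactions if t]

# Absolute frequency: number of transactions containing every item of the set.
absolute_frequency = sum(1 for t in non_empty if itemset <= t)

# Support: absolute frequency divided by the number of remaining transactions.
support = absolute_frequency / len(non_empty)

print(absolute_frequency)   # 1
print(f"{support:.1%}")     # 16.7%
```

Whether the empty row is counted changes the denominator (6 versus 7), which is why removing it first matters for the percentage.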


Thank you in advance!







<?xml version="1.0" encoding="UTF-8"?><process version="9.0.002">
<operator activated="true" class="process" compatibility="9.0.002" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="read_excel" compatibility="9.0.002" expanded="true" height="68" name="Read Excel" width="90" x="112" y="136">
<parameter key="excel_file" value="\\ADS.DLH.DE\LHuser$\LHT\HAM98\U717465\Documents\02_Data_Mining\01_rapidminer\closed_events_q-star.xlsx"/>
<list key="annotations"/>
<parameter key="date_format" value="MMM d, yyyy h:mm:ss a z"/>
<list key="data_set_meta_data_information">
<parameter key="0" value="Event ID.true.polynominal.attribute"/>
<parameter key="1" value="Event Title.true.polynominal.attribute"/>
<parameter key="2" value="Event Description.true.polynominal.attribute"/>
<parameter key="3" value="Event resp\. dept\..true.polynominal.attribute"/>
<parameter key="4" value="Risk Level.true.polynominal.attribute"/>
<parameter key="5" value="Severity Level.true.polynominal.attribute"/>
<parameter key="6" value="Severity Driver.true.polynominal.attribute"/>
<parameter key="7" value="Closed date event.true.date_time.attribute"/>
<parameter key="8" value="Total Event Time.true.integer.attribute"/>
<parameter key="9" value="Total Investigation Time.true.integer.attribute"/>
<parameter key="10" value="Total Implement\. Time.true.integer.attribute"/>
<parameter key="11" value="Resp\. for coordination.true.polynominal.attribute"/>
<parameter key="12" value="Resp\. for investigation.true.polynominal.attribute"/>
<parameter key="13" value="Source.true.polynominal.attribute"/>
<parameter key="14" value="Event type.true.polynominal.attribute"/>
<parameter key="15" value="Investigation type.true.polynominal.attribute"/>
<parameter key="16" value="Related requirements.true.polynominal.attribute"/>
<parameter key="17" value="CNQ.true.integer.attribute"/>
<parameter key="18" value="A/C Reg.true.polynominal.attribute"/>
<parameter key="19" value="Engine type.true.polynominal.attribute"/>
<parameter key="20" value="PNR.true.polynominal.attribute"/>
<parameter key="21" value="Customer/ Operator.true.polynominal.attribute"/>
<parameter key="22" value="MOR relevant.true.polynominal.attribute"/>
<parameter key="23" value="Repetitive Event.true.polynominal.attribute"/>
<parameter key="24" value="Reason for no or discont\. Investigation.true.polynominal.attribute"/>
<parameter key="25" value="Implemented CA/PA.true.polynominal.attribute"/>
<parameter key="26" value="Implemented Correction.true.polynominal.attribute"/>
<parameter key="27" value="Date of report.true.date_time.attribute"/>
<parameter key="28" value="Coordination closed date.true.date_time.attribute"/>
<parameter key="29" value="Investigation closed date.true.date_time.attribute"/>
</list>
<parameter key="read_not_matching_values_as_missings" value="false"/>
</operator>
<operator activated="true" class="select_attributes" compatibility="9.0.002" expanded="true" height="82" name="Select Attributes (2)" width="90" x="313" y="136">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="Risk Level|Severity Level"/>
</operator>
<connect from_op="Read Excel" from_port="output" to_op="Select Attributes (2)" to_port="example set input"/>
<connect from_op="Select Attributes (2)" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>

Best Answer

    rfuentealba Moderator, RapidMiner Certified Analyst, Member, University Professor Posts: 568 Unicorn
    Solution Accepted

    Hello, @t_liebe,


    Use the Filter Examples operator with the following configuration:


    [Screenshot: Filter Examples configuration (Screen Shot 2018-10-08 at 06.32.06.png)]

    Notice the Match all option at the bottom left. You must select it, because it combines the conditions with a logical AND. Otherwise, the filter will also match records where only one of the two attributes meets the condition.
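The difference between the two matching modes can be sketched in plain Python. This is not RapidMiner code, only an illustration of the logic. The rows mirror the table in the question. The two conditions ("Item 1 is missing", "Item 2 is missing") and the step of dropping the matched rows are assumptions, since the screenshot is not reproduced here.

```python
# Toy rows mirroring the table in the question; None stands for "?".
rows = [
    {"Size": 1, "Item 1": None,     "Item 2": "milk"},
    {"Size": 2, "Item 1": "cookie", "Item 2": "milk"},
    {"Size": 2, "Item 1": None,     "Item 2": None},
    {"Size": 2, "Item 1": "cookie", "Item 2": "chocolate"},
    {"Size": 2, "Item 1": "cookie", "Item 2": "crackers"},
    {"Size": 2, "Item 1": "cookie", "Item 2": None},
    {"Size": 2, "Item 1": "cookie", "Item 2": "raspberries"},
]

def both_missing(row):
    # "Match all": every condition must hold (logical AND).
    return row["Item 1"] is None and row["Item 2"] is None

def any_missing(row):
    # "Match any": a single condition is enough (logical OR).
    return row["Item 1"] is None or row["Item 2"] is None

# Drop the rows that match the filter and keep the rest.
kept_with_and = [r for r in rows if not both_missing(r)]
kept_with_or = [r for r in rows if not any_missing(r)]

print(len(kept_with_and))  # 6: only the row with two missing values is removed
print(len(kept_with_or))   # 4: rows with a single missing value are removed too
```

With AND, the rows that have only one missing value survive, which is the behaviour asked for in the question.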


    t_liebe Member Posts: 14 Contributor II

    Thank you for your quick answer! :)
