Delete Examples with 2 missing attributes

t_liebe Member Posts: 14 Contributor I
edited December 1 in Help

Hello,

I know several ways to delete missing values in a single column. However, I only want to remove the Examples in which both attributes are missing:

Size    Item 1    Item 2
1       ?         milk
2       cookie    milk
2       ?         ?
2       cookie    chocolate
2       cookie    crackers
2       cookie    ?
2       cookie    raspberries

After that, I would like to combine the two tables to find out how often cookie and milk occur together, both as a percentage and as an absolute frequency.

How can I use FP-Growth for this?

Thank you in advance!

<?xml version="1.0" encoding="UTF-8"?><process version="9.0.002">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="9.0.002" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="read_excel" compatibility="9.0.002" expanded="true" height="68" name="Read Excel" width="90" x="112" y="136">
<parameter key="excel_file" value="\\ADS.DLH.DE\LHuser$\LHT\HAM98\U717465\Documents\02_Data_Mining\01_rapidminer\closed_events_q-star.xlsx"/>
<list key="annotations"/>
<parameter key="date_format" value="MMM d, yyyy h:mm:ss a z"/>
<list key="data_set_meta_data_information">
<parameter key="0" value="Event ID.true.polynominal.attribute"/>
<parameter key="1" value="Event Title.true.polynominal.attribute"/>
<parameter key="2" value="Event Description.true.polynominal.attribute"/>
<parameter key="3" value="Event resp\. dept\..true.polynominal.attribute"/>
<parameter key="4" value="Risk Level.true.polynominal.attribute"/>
<parameter key="5" value="Severity Level.true.polynominal.attribute"/>
<parameter key="6" value="Severity Driver.true.polynominal.attribute"/>
<parameter key="7" value="Closed date event.true.date_time.attribute"/>
<parameter key="8" value="Total Event Time.true.integer.attribute"/>
<parameter key="9" value="Total Investigation Time.true.integer.attribute"/>
<parameter key="10" value="Total Implement\. Time.true.integer.attribute"/>
<parameter key="11" value="Resp\. for coordination.true.polynominal.attribute"/>
<parameter key="12" value="Resp\. for investigation.true.polynominal.attribute"/>
<parameter key="13" value="Source.true.polynominal.attribute"/>
<parameter key="14" value="Event type.true.polynominal.attribute"/>
<parameter key="15" value="Investigation type.true.polynominal.attribute"/>
<parameter key="16" value="Related requirements.true.polynominal.attribute"/>
<parameter key="17" value="CNQ.true.integer.attribute"/>
<parameter key="18" value="A/C Reg.true.polynominal.attribute"/>
<parameter key="19" value="Engine type.true.polynominal.attribute"/>
<parameter key="20" value="PNR.true.polynominal.attribute"/>
<parameter key="21" value="Customer/ Operator.true.polynominal.attribute"/>
<parameter key="22" value="MOR relevant.true.polynominal.attribute"/>
<parameter key="23" value="Repetitive Event.true.polynominal.attribute"/>
<parameter key="24" value="Reason for no or discont\. Investigation.true.polynominal.attribute"/>
<parameter key="25" value="Implemented CA/PA.true.polynominal.attribute"/>
<parameter key="26" value="Implemented Correction.true.polynominal.attribute"/>
<parameter key="27" value="Date of report.true.date_time.attribute"/>
<parameter key="28" value="Coordination closed date.true.date_time.attribute"/>
<parameter key="29" value="Investigation closed date.true.date_time.attribute"/>
</list>
<parameter key="read_not_matching_values_as_missings" value="false"/>
</operator>
<operator activated="true" class="select_attributes" compatibility="9.0.002" expanded="true" height="82" name="Select Attributes (2)" width="90" x="313" y="136">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="Risk Level|Severity Level"/>
</operator>
<connect from_op="Read Excel" from_port="output" to_op="Select Attributes (2)" to_port="example set input"/>
<connect from_op="Select Attributes (2)" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>

Best Answer

  • rfuentealba Posts: 215   Unicorn
    Solution Accepted

    Hello, @t_liebe,

     

    Use the Filter Examples operator with the following configuration:

    [Screenshot: Screen Shot 2018-10-08 at 06.32.06.png]

    Note the Match all option at the bottom left. You must select it, because it combines the conditions with AND. Otherwise the conditions are combined with OR, and the filter also matches Examples in which only one of the two attributes is missing.
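
    As a rough illustration of why the AND/OR choice matters, here is a minimal Python sketch (plain Python, not RapidMiner code) run on the sample table from the question. It assumes the goal is to drop Examples whose listed attributes are missing. The helper `drop_missing` and its `match_all` flag are hypothetical names that only mimic the Match all / Match any choice, and `None` stands in for a missing value ("?").

```python
# Rows mirror the sample table in the question; None stands for "?".
rows = [
    {"Size": 1, "Item 1": None,     "Item 2": "milk"},
    {"Size": 2, "Item 1": "cookie", "Item 2": "milk"},
    {"Size": 2, "Item 1": None,     "Item 2": None},
    {"Size": 2, "Item 1": "cookie", "Item 2": "chocolate"},
    {"Size": 2, "Item 1": "cookie", "Item 2": "crackers"},
    {"Size": 2, "Item 1": "cookie", "Item 2": None},
    {"Size": 2, "Item 1": "cookie", "Item 2": "raspberries"},
]


def drop_missing(rows, attributes, match_all=True):
    """Drop rows whose listed attributes are missing.

    match_all=True  -> drop only if ALL listed attributes are missing (AND).
    match_all=False -> drop if ANY listed attribute is missing (OR).
    Hypothetical helper for illustration only.
    """
    combine = all if match_all else any
    return [
        row for row in rows
        if not combine(row[name] is None for name in attributes)
    ]


kept_and = drop_missing(rows, ["Item 1", "Item 2"], match_all=True)
kept_or = drop_missing(rows, ["Item 1", "Item 2"], match_all=False)

print(len(kept_and))  # 6: only the row with both items missing is dropped
print(len(kept_or))   # 4: every row with at least one missing item is dropped
```

    With AND, only the "? / ?" row is removed, which is what the question asks for. With OR, the rows "? / milk" and "cookie / ?" would be lost as well.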

Answers

  • t_liebe Member Posts: 14 Contributor I

    Thank you for your quick answer! :)
