Evaluate fields

jhonmahechajm Member Posts: 1 Contributor I
edited November 2018 in Help

Hi.

I am creating a process to calculate the number of empty fields in a collection in a MongoDB database. I am currently using the Generate Attributes operator with the missing(Attribute_value) function in this way:

if(missing(project_code),1,0)+if(missing(id),1,0)+if(missing(ISSN),1,0)+...

I would like to calculate the number of empty fields without specifying the columns. How can I do this?

Best regards.

John.

Answers

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,107  RM Data Scientist

    Hi John,

     

    I think Generate Aggregation does the trick. See the attached process.

     

    Best,

    Martin

     

    <?xml version="1.0" encoding="UTF-8"?>
    <process version="7.4.000">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.4.000" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="generate_data_user_specification" compatibility="7.4.000" expanded="true" height="68" name="Generate Data by User Specification" width="90" x="45" y="34">
    <list key="attribute_values">
    <parameter key="missing_a" value="1"/>
    <parameter key="missing_b" value="2"/>
    <parameter key="missing_c" value="3"/>
    </list>
    <list key="set_additional_roles"/>
    </operator>
    <operator activated="true" class="generate_aggregation" compatibility="7.4.000" expanded="true" height="82" name="Generate Aggregation" width="90" x="246" y="34">
    <parameter key="attribute_name" value="sum_missing"/>
    <parameter key="attribute_filter_type" value="regular_expression"/>
    <parameter key="regular_expression" value="missing.*"/>
    </operator>
    <connect from_op="Generate Data by User Specification" from_port="output" to_op="Generate Aggregation" to_port="example set input"/>
    <connect from_op="Generate Aggregation" from_port="example set output" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
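
For readers who want to see the underlying logic outside RapidMiner: the process above selects attributes by a regular expression (missing.*) and aggregates them into one new attribute (sum_missing), so no column has to be listed by hand. A minimal Python sketch of the same idea follows. It is not RapidMiner code; the helper names is_missing and count_missing, the sample records, and the rule for what counts as "empty" (None, NaN, blank strings) are all assumptions made for illustration.

```python
import math
import re


def is_missing(value):
    """Assumed definition of 'empty': None, NaN, or a blank string."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    if isinstance(value, str) and value.strip() == "":
        return True
    return False


def count_missing(record, pattern=".*"):
    """Count empty fields in one record.

    Only field names that fully match `pattern` are inspected, so the
    default ".*" covers every field without naming any of them.
    """
    name_filter = re.compile(pattern)
    return sum(
        1
        for name, value in record.items()
        if name_filter.fullmatch(name) and is_missing(value)
    )


# Hypothetical documents using the field names from the question.
records = [
    {"project_code": "P-1", "id": 7, "ISSN": None},
    {"project_code": "", "id": None, "ISSN": None},
]

missing_per_record = [count_missing(r) for r in records]
```

Calling count_missing(record, "id|ISSN") would restrict the count to those two fields. Note that a field that is absent from a document altogether is not counted here, because it never appears as a key; whether that should count as empty depends on the data.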