Data Appearing as Rows Instead of Attributes (Columns)

blake_galbreath Member Posts: 1 Learner I
edited November 10 in Help

Hello,

I am trying to extract two values from a website using XPath:

//h:h2[(@class='uvIdeaTitle')]/h:a/text()

//h:div[(@class='uvIdeaVoteCount')]/h:strong/text()

I get all of the correct data, but the values appear as sequential rows instead of as separate columns in the Results tab.

I am using the following process:

Read Excel > Get Pages > Data to Documents > Process Documents (Cut Document).

How can I retrieve the data in the following structure:

URL -- Idea Title -- Vote Count

instead of 

URL -- Idea Title

URL -- Vote Count

Thanks,

Blake


Answers

  • yyhuang Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 122  RM Data Scientist

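    Hi @blake_galbreath

    You will need the Pivot operator. Append the two result sets, then pivot with URL as the group attribute and the attribute-name column as the index attribute, as in this example process: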

    <?xml version="1.0" encoding="UTF-8"?>
    <process version="8.1.001">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="8.1.001" expanded="true" name="Process">
        <parameter key="process_duration_for_mail" value="1"/>
        <parameter key="encoding" value="UTF-8"/>
        <process expanded="true">
          <!-- Sample row 1: the idea title for a URL -->
          <operator activated="true" class="generate_data_user_specification" compatibility="8.1.001" expanded="true" height="68" name="Generate Data by User Specification" width="90" x="313" y="34">
            <list key="attribute_values">
              <parameter key="URL" value="&quot;rapidminer.com&quot;"/>
              <parameter key="value" value="&quot;marketing&quot;"/>
              <parameter key="att_name" value="&quot;idea_title&quot;"/>
            </list>
            <list key="set_additional_roles"/>
          </operator>
          <!-- Sample row 2: the vote count for the same URL -->
          <operator activated="true" class="generate_data_user_specification" compatibility="8.1.001" expanded="true" height="68" name="Generate Data by User Specification (2)" width="90" x="313" y="136">
            <list key="attribute_values">
              <parameter key="URL" value="&quot;rapidminer.com&quot;"/>
              <parameter key="value" value="&quot;4&quot;"/>
              <parameter key="att_name" value="&quot;vote_count&quot;"/>
            </list>
            <list key="set_additional_roles"/>
          </operator>
          <!-- Stack both rows into one example set -->
          <operator activated="true" class="append" compatibility="8.1.001" expanded="true" height="103" name="Append" width="90" x="447" y="34"/>
          <!-- Pivot: one row per URL, one column per att_name value -->
          <operator activated="true" breakpoints="before" class="pivot" compatibility="8.1.001" expanded="true" height="82" name="Pivot" width="90" x="581" y="34">
            <parameter key="group_attribute" value="URL"/>
            <parameter key="index_attribute" value="att_name"/>
            <parameter key="consider_weights" value="false"/>
          </operator>
          <connect from_op="Generate Data by User Specification" from_port="output" to_op="Append" to_port="example set 1"/>
          <connect from_op="Generate Data by User Specification (2)" from_port="output" to_op="Append" to_port="example set 2"/>
          <connect from_op="Append" from_port="merged set" to_op="Pivot" to_port="example set input"/>
          <connect from_op="Pivot" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>

     Cheers,

    YY
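
For readers who want to see what the pivot step does to the data, here is a minimal plain-Python sketch of the same long-to-wide reshaping. It is not RapidMiner code. The `pivot` helper is hypothetical, the sample rows are modeled on the values in the example process, and the `value_<index value>` column naming is an assumption about how the pivoted columns are labeled.

```python
# Long format: one row per (URL, att_name) pair, as produced by appending
# the two sample example sets.
long_rows = [
    {"URL": "rapidminer.com", "att_name": "idea_title", "value": "marketing"},
    {"URL": "rapidminer.com", "att_name": "vote_count", "value": "4"},
]


def pivot(rows, group_key, index_key, value_key):
    """Collapse long-format rows into one wide row per distinct group value."""
    wide = {}
    for row in rows:
        # Reuse the output record for this group (here: this URL) if it exists.
        record = wide.setdefault(row[group_key], {group_key: row[group_key]})
        # Assumed naming scheme: "<value attribute>_<index value>".
        record[f"{value_key}_{row[index_key]}"] = row[value_key]
    return list(wide.values())


wide_rows = pivot(long_rows, "URL", "att_name", "value")
print(wide_rows)
# [{'URL': 'rapidminer.com', 'value_idea_title': 'marketing', 'value_vote_count': '4'}]
```

Each URL ends up as a single row with the idea title and vote count side by side, which is the URL -- Idea Title -- Vote Count layout asked for in the question.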
