RapidMiner

HELP obtaining Twitter user details (I'm new to RapidMiner)

Hello helpful people! I am trying to use the "Get Twitter user details" operator in order to get the following for each ID in my Twitter search:

- location

- number of followers

- number of friends

- number of favorites

- number of tweets, etc.

 

I see that the "Get Twitter user details" operator returns results for one Twitter ID at a time. However, I have 5,000 IDs that I need the above information for. Is there a way to obtain this for all of them at once, or perhaps with another operator? THANK YOU!

1 REPLY
RMStaff

Re: HELP obtaining Twitter user details (I'm new to RapidMiner)

Hi Elibart,

 

Your question got me interested. I think you need to use the Loop Values operator in combination with the Get Twitter User Details operator. Here is a simple process showing what I mean:

 

<?xml version="1.0" encoding="UTF-8"?><process version="7.5.001">
  <operator activated="true" class="read_csv" compatibility="7.5.001" expanded="true" height="68" name="Read CSV" width="90" x="179" y="34">
    <parameter key="csv_file" value="C:\Users\SebastianGolbert\Documents\Twitter\names.txt"/>
    <parameter key="column_separators" value=";"/>
    <parameter key="trim_lines" value="false"/>
    <parameter key="use_quotes" value="true"/>
    <parameter key="quotes_character" value="&quot;"/>
    <parameter key="escape_character" value="\"/>
    <parameter key="skip_comments" value="false"/>
    <parameter key="comment_characters" value="#"/>
    <parameter key="parse_numbers" value="true"/>
    <parameter key="decimal_character" value="."/>
    <parameter key="grouped_digits" value="false"/>
    <parameter key="grouping_character" value=","/>
    <parameter key="date_format" value=""/>
    <parameter key="first_row_as_names" value="false"/>
    <list key="annotations"/>
    <parameter key="time_zone" value="SYSTEM"/>
    <parameter key="locale" value="English (United States)"/>
    <parameter key="encoding" value="windows-1252"/>
    <list key="data_set_meta_data_information"/>
    <parameter key="read_not_matching_values_as_missings" value="true"/>
    <parameter key="datamanagement" value="double_array"/>
    <parameter key="data_management" value="auto"/>
  </operator>
</process>
<?xml version="1.0" encoding="UTF-8"?><process version="7.5.001">
  <operator activated="true" class="concurrency:loop_values" compatibility="7.5.001" expanded="true" height="82" name="Loop Values" width="90" x="380" y="34">
    <parameter key="attribute" value="att1"/>
    <parameter key="iteration_macro" value="loop_value"/>
    <parameter key="reuse_results" value="false"/>
    <parameter key="enable_parallel_execution" value="true"/>
    <process expanded="true">
      <operator activated="true" class="social_media:get_twitter_user_details" compatibility="7.3.000" expanded="true" height="68" name="Get Twitter User Details" width="90" x="380" y="34">
        <parameter key="connection" value="Twitter"/>
        <parameter key="query_type" value="name"/>
        <parameter key="user" value="%{loop_value}"/>
      </operator>
      <connect from_op="Get Twitter User Details" from_port="output" to_port="output 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="source_input 2" spacing="0"/>
      <portSpacing port="sink_output 1" spacing="0"/>
      <portSpacing port="sink_output 2" spacing="0"/>
    </process>
  </operator>
</process>
<?xml version="1.0" encoding="UTF-8"?><process version="7.5.001">
  <operator activated="true" class="append" compatibility="7.5.001" expanded="true" height="82" name="Append" width="90" x="648" y="34">
    <parameter key="datamanagement" value="double_array"/>
    <parameter key="data_management" value="auto"/>
    <parameter key="merge_type" value="all"/>
  </operator>
</process>
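
The three fragments above are listed separately and do not show how the operators connect. As a rough sketch only, they might be wired into one process as follows. The root `Process` wrapper, the port names `input 1`, `example set 1`, `merged set` and `result 1`, and the placeholder file name `names.txt` are assumptions not taken from the fragments, so check them against your own RapidMiner version before relying on them.

```xml
<?xml version="1.0" encoding="UTF-8"?><process version="7.5.001">
  <operator activated="true" class="process" compatibility="7.5.001" expanded="true" name="Process">
    <process expanded="true">
      <!-- Reads one Twitter screen name per row into the attribute att1 -->
      <operator activated="true" class="read_csv" compatibility="7.5.001" expanded="true" name="Read CSV">
        <!-- Placeholder path (assumption): point this at your own file -->
        <parameter key="csv_file" value="names.txt"/>
        <parameter key="column_separators" value=";"/>
        <parameter key="first_row_as_names" value="false"/>
      </operator>
      <!-- Runs the inner process once per distinct value of att1 -->
      <operator activated="true" class="concurrency:loop_values" compatibility="7.5.001" expanded="true" name="Loop Values">
        <parameter key="attribute" value="att1"/>
        <parameter key="iteration_macro" value="loop_value"/>
        <process expanded="true">
          <!-- The macro loop_value holds the screen name for this iteration -->
          <operator activated="true" class="social_media:get_twitter_user_details" compatibility="7.3.000" expanded="true" name="Get Twitter User Details">
            <parameter key="connection" value="Twitter"/>
            <parameter key="query_type" value="name"/>
            <parameter key="user" value="%{loop_value}"/>
          </operator>
          <connect from_op="Get Twitter User Details" from_port="output" to_port="output 1"/>
        </process>
      </operator>
      <!-- Merges the collection of per-user results into one example set -->
      <operator activated="true" class="append" compatibility="7.5.001" expanded="true" name="Append"/>
      <!-- Assumed port names; verify them in the process designer -->
      <connect from_op="Read CSV" from_port="output" to_op="Loop Values" to_port="input 1"/>
      <connect from_op="Loop Values" from_port="output 1" to_op="Append" to_port="example set 1"/>
      <connect from_op="Append" from_port="merged set" to_port="result 1"/>
    </process>
  </operator>
</process>
```

Parameters left out here fall back to the operator defaults. The "Twitter" connection must already exist in your repository for the inner operator to run.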

Please try it out and give us feedback about the running time. The step that appends all the collections could be quite inefficient.

 

Best regards,

SebaG