
"Concatenate all text from Twitter feed"

robin Member Posts: 100 Guru
edited June 2019 in Help

I am collecting all of the text from a Twitter feed using the Twitter operator and storing it in a MySQL database. I then extract this data from the MySQL table to process it through a sentiment analysis engine. As part of the process I need to put all of the examples into an API format before submitting. I use the Generate Attributes operator to create the required API secret and key. However, when I join and then concatenate the data into a single file (using Select Attributes to drop the attributes I do not need), only the first tweet pulled from the DB is included in the API submission file; the rest of the tweets are missing.

 

What am I doing incorrectly? I have tried turning the data into documents and combining them, and I have tried creating collections and flattening them. I have really tried everything, but I am unable to insert the roughly 1.5k tweets that I have pulled down into the file format.

 


curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/hal+json' --header 'X-API-SECRET-KEY:XXXXXXXxXXXxXXXXXX' --header 'X-API-XXXXXXXXXXXXX' -d '{
"name": "account_name",
"gender": 0,
"content": {
"content_handle": "account_handle",
"content_source": 1,2
"content_date": "10/04/2018 22:00:46 PM SAST",
"language_content": "need to insert the free text examples from twitter in here."
},
"person_handle": "key_account_manager"

 

In the code above I need to insert the Twitter Text into the "language_content" field, but can only ever insert a single line of Twitter data. 

 

@RitualGym Hey guys, the gym here in Illovo was supposed to be opening in April, is it still happening? I’m keen to get stared. 💪🏼 @RitualGym Hey guys, the gym here in Illovo was supposed to be opening in April, is it still happening? I’m keen to get stared. 💪🏼
@OneDayOnlycoza House of Chards 🥦 https://t.co/ljZQD93t3i @OneDayOnlycoza House of Chards 🥦 https://t.co/ljZQD93t3i
@ThatDarnKitteh @Nick_Frost If this isn’t your handle by the end of the day I’m unfollowing 😂 @ThatDarnKitteh @Nick_Frost If this isn’t your handle by the end of the day I’m unfollowing 😂
@Nick_Frost Have you seen what happens when people try combine their names for their kids? 😣😖🤢🤮 @Nick_Frost Have you seen what happens when people try combine their names for their kids? 😣😖🤢🤮
This is the best thing on the internet right now. 😂 https://t.co/TchhxFUGqT This is the best thing on the internet right now. 😂 https://t.co/TchhxFUGqT
@OneDayOnlycoza UK size 7. 😉 https://t.co/DAIaK0V5oF @OneDayOnlycoza UK size 7. 😉 https://t.co/DAIaK0V5oF

Above is the text that I need to insert into the file. This text comes from a MySQL DB, so it has \r\n characters separating the fields (which I believe is causing the issue).
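For comparison outside RapidMiner, here is a minimal Python sketch of the concatenation step, assuming the tweets have already been read from MySQL into a list of strings. The function names `normalize_tweet` and `build_payload` are hypothetical, the field values are the placeholders from the curl example, and `content_source` is set to 1 only as a guess because the original shows `1,2`. Embedded line breaks are collapsed to spaces so that every tweet ends up in one string, and `json.dumps` takes care of quoting and escaping.

```python
import json
import re
from typing import List


def normalize_tweet(text: str) -> str:
    """Collapse embedded line breaks (CR, LF, CRLF) and repeated spaces into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def build_payload(tweets: List[str]) -> str:
    """Join all non-empty tweets into one string and place it in "language_content"."""
    combined = " ".join(normalize_tweet(t) for t in tweets if t.strip())
    payload = {
        "name": "account_name",
        "gender": 0,
        "content": {
            "content_handle": "account_handle",
            "content_source": 1,  # assumption: the original example shows "1,2"
            "content_date": "10/04/2018 22:00:46 PM SAST",
            "language_content": combined,
        },
        "person_handle": "key_account_manager",
    }
    # ensure_ascii=False keeps emoji readable instead of \u escapes.
    return json.dumps(payload, ensure_ascii=False)


if __name__ == "__main__":
    rows = [
        "@OneDayOnlycoza UK size 7.\r\nhttps://t.co/DAIaK0V5oF",
        "This is the best thing on the internet right now.",
    ]
    print(build_payload(rows))
```

Building the payload as a dictionary and serializing it once avoids hand-assembling JSON text, which is where stray line breaks and unescaped quotes tend to cause trouble.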

 

I'm at my wits' end; please help me see the wood for the trees.

Best Answer

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
    Solution Accepted

    @robin I'm running out the door ATM, but this almost gets you there. You'll need to loop over the combined document to add in the header stuff.

     

    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.1.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="8.1.001" expanded="true" height="68" name="Retrieve RobinJSON" width="90" x="112" y="34">
    <parameter key="repository_entry" value="//Community Answers/data/RobinJSON"/>
    </operator>
    <operator activated="true" class="extract_macro" compatibility="8.1.001" expanded="true" height="68" name="Extract Macro" width="90" x="246" y="34">
    <parameter key="macro" value="Num"/>
    <list key="additional_macros"/>
    </operator>
    <operator activated="true" class="concurrency:loop" compatibility="8.1.001" expanded="true" height="82" name="Loop" width="90" x="380" y="34">
    <parameter key="number_of_iterations" value="%{Num}"/>
    <parameter key="enable_parallel_execution" value="false"/>
    <process expanded="true">
    <operator activated="true" class="extract_macro" compatibility="8.1.001" expanded="true" height="68" name="Extract Macro (2)" width="90" x="112" y="34">
    <parameter key="macro" value="insert"/>
    <parameter key="macro_type" value="data_value"/>
    <parameter key="attribute_name" value="Test"/>
    <parameter key="example_index" value="%{iteration}"/>
    <list key="additional_macros"/>
    </operator>
    <operator activated="true" class="text:create_document" compatibility="8.1.000" expanded="true" height="68" name="Create Document" width="90" x="246" y="34">
    <parameter key="text" value="&quot;language_content&quot;: &quot;%{insert}&quot; &#10; }, &#10;"/>
    </operator>
    <operator activated="false" class="text:documents_to_data" compatibility="8.1.000" expanded="true" height="68" name="Documents to Data" width="90" x="313" y="187">
    <parameter key="text_attribute" value="yumyum"/>
    </operator>
    <connect from_port="input 1" to_op="Extract Macro (2)" to_port="example set"/>
    <connect from_op="Create Document" from_port="output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="text:combine_documents" compatibility="8.1.000" expanded="true" height="82" name="Combine Documents" width="90" x="514" y="34"/>
    <connect from_op="Retrieve RobinJSON" from_port="output" to_op="Extract Macro" to_port="example set"/>
    <connect from_op="Extract Macro" from_port="example set" to_op="Loop" to_port="input 1"/>
    <connect from_op="Loop" from_port="output 1" to_op="Combine Documents" to_port="documents 1"/>
    <connect from_op="Combine Documents" from_port="document" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
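
Read as plain code, the process above does roughly the following. This is only a sketch in Python, not RapidMiner's implementation: `make_fragment` and `combine_fragments` are hypothetical names, the iteration counter is assumed to start at 1 (as the `example_index` of `%{iteration}` implies), and joining the fragments with a newline is an assumption about how Combine Documents separates them.

```python
from typing import List


def make_fragment(tweet: str) -> str:
    # Same template as the Create Document "text" parameter,
    # with the %{insert} macro replaced by the tweet text.
    return '"language_content": "' + tweet + '" \n }, \n'


def combine_fragments(tweets: List[str]) -> str:
    fragments = []
    # One iteration per example, like Loop with number_of_iterations = %{Num}.
    for iteration in range(1, len(tweets) + 1):
        insert = tweets[iteration - 1]  # Extract Macro (2): value at example_index
        fragments.append(make_fragment(insert))
    # Combine Documents: merge all fragments into a single document.
    return "\n".join(fragments)


if __name__ == "__main__":
    print(combine_fragments(["first tweet", "second tweet"]))
```

Because the tweet text is pasted into the template as-is, a tweet containing a double quote or a raw line break would still produce invalid JSON, so the text likely needs escaping before it is inserted.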

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @robin do you have a Loop operator iterating over this?

  • robin Member Posts: 100 Guru

    Yes, I did. I completely ignored the macro I set up at the beginning of the process to perform the loop and started doing some crazy things in the end.

     

    Note to self: a macro can be used inside Create Document.

     

    Thank you Thomas
