RapidMiner

Extracting Emoji from tweets in Twitter

Contributor

Hello everyone,

I need help finding out whether it is possible to extract just the emoji from tweets on Twitter, which I selected from popular hashtags. If it is possible, I would appreciate some tips, please.

Thanks

4 REPLIES
Community Manager

Re: Extracting Emoji from tweets in Twitter

Cross-posting everywhere will not get you an answer sooner.

 

I will delete the other topics. 

Regards,
Thomas
LinkedIn: Thomas Ott
Blog: Neural Market Trends
Community Manager

Re: Extracting Emoji from tweets in Twitter

You would need to set your encoding to the appropriate type under Preferences. For example, UTF-8 will extract a lot of emoticon short codes, e.g. ": )" for a happy smiley.

Regards,
Thomas
LinkedIn: Thomas Ott
Blog: Neural Market Trends
Contributor

Re: Extracting Emoji from tweets in Twitter

But I don't need one specific code. I am trying to examine how emoji are used in tweets, so I expect all kinds of emoji. Does that mean I have to add the Unicode for every emoji?

Thanks

Community Manager

Re: Extracting Emoji from tweets in Twitter

If you want to do text processing and extract the emoji and hashtags, you'll have to transform them into something that won't be destroyed during tokenization. For example, the smiley emoji is typically represented as ": )" (space and quotes added for clarity). If you use the default tokenization settings, it will be wiped out and you won't be able to extract information from it.

 

What I typically do is use a few Replace operators to replace the ": )" with "smiley_face" and "#myawesomehashtag" with "hashtag_myawesomehashtag". Then, when you tokenize the text, the placeholder will still remain in the text processing.
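To illustrate the replace-before-tokenize idea outside RapidMiner, here is a minimal Python sketch. The function names `protect_tokens` and `tokenize` are hypothetical, and the `\w+` tokenizer is only a stand-in assumption, not RapidMiner's Tokenize operator, so its behaviour may differ from what you see in Studio.

```python
import re


def protect_tokens(text):
    """Rewrite emoticons and hashtags as word-like placeholders."""
    # Replace the ":)" emoticon with a placeholder made of word characters.
    text = re.sub(r":\)", " smiley_face ", text)
    # Keep each hashtag by prefixing it instead of losing the "#" marker.
    text = re.sub(r"#(\w+)", r"hashtag_\1", text)
    return text


def tokenize(text):
    """Stand-in tokenizer (assumption): lowercase, keep runs of word characters."""
    return re.findall(r"\w+", text.lower())


if __name__ == "__main__":
    tweet = "Great day :) #MyAwesomeHashtag"
    print(tokenize(tweet))                  # emoticon and "#" marker are lost
    print(tokenize(protect_tokens(tweet)))  # placeholders survive
```

Without the replacement step the tokens are `great`, `day`, `myawesomehashtag`; with it they are `great`, `day`, `smiley_face`, `hashtag_myawesomehashtag`.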

 

<?xml version="1.0" encoding="UTF-8"?><process version="7.3.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.3.001" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="social_media:search_twitter" compatibility="7.3.000" expanded="true" height="68" name="Search Twitter" width="90" x="112" y="34">
        <parameter key="connection" value="ThomasOtt"/>
        <parameter key="query" value="love"/>
        <parameter key="language" value="en"/>
      </operator>
      <operator activated="true" class="replace" compatibility="7.3.001" expanded="true" height="82" name="Replace" width="90" x="246" y="34">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="Text"/>
        <parameter key="replace_what" value="\:\)"/>
        <parameter key="replace_by" value="smiley_face"/>
      </operator>
      <operator activated="true" class="replace" compatibility="7.3.001" expanded="true" height="82" name="Replace (2)" width="90" x="380" y="34">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="Text"/>
        <parameter key="replace_what" value="\#(\w+)"/>
        <parameter key="replace_by" value="hashtag_$1"/>
      </operator>
      <connect from_op="Search Twitter" from_port="output" to_op="Replace" to_port="example set input"/>
      <connect from_op="Replace" from_port="example set output" to_op="Replace (2)" to_port="example set input"/>
      <connect from_op="Replace (2)" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
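On the question of whether every emoji has to be listed one by one: a character class over Unicode ranges can stand in for an explicit list. The sketch below is plain Python, not a RapidMiner operator, and the ranges in `EMOJI_PATTERN` are an assumed, non-exhaustive selection (for example, they do not handle variation selectors or multi-code-point sequences such as skin tones). The names `extract_emoji` and `count_emoji` are hypothetical.

```python
import re
from collections import Counter

# Assumed, non-exhaustive set of emoji code point ranges.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, supplemental symbols
    "\U0001F1E6-\U0001F1FF"  # regional indicator letters (used for flags)
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "]"
)


def extract_emoji(text):
    """Return every matching emoji character in the text, in order."""
    return EMOJI_PATTERN.findall(text)


def count_emoji(tweets):
    """Count emoji occurrences across an iterable of tweet texts."""
    counts = Counter()
    for tweet in tweets:
        counts.update(extract_emoji(tweet))
    return counts


if __name__ == "__main__":
    tweets = ["I love this \U0001F60D\U0001F60D #love", "Sunny \u2600 day"]
    for emoji, n in count_emoji(tweets).most_common():
        print(emoji, n)
```

A range-based pattern like this could in principle be adapted to the `replace_what` parameter of a Replace operator, but the escape syntax for code points above U+FFFF differs between regex engines, so treat that as something to test rather than a given.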
Regards,
Thomas
LinkedIn: Thomas Ott
Blog: Neural Market Trends