RapidMiner

Learner I c1579481

Extracting Emoji from tweets in twitter

Hello everyone,

Is it possible to extract just the emoji from tweets on Twitter that I collected from popular hashtags? If it is, I would appreciate any tips.

Thanks

4 REPLIES
RM Certified Expert

Re: Extracting Emoji from tweets in twitter

Cross-posting everywhere will not get you an answer any sooner.

I will delete the other topics.

RM Certified Expert

Re: Extracting Emoji from tweets in twitter

You would need to set your encoding to the appropriate type under Preferences. For example, with UTF-8 you will get a lot of emoticon short codes in the extracted text, e.g. ": )" for a happy smiley.

Learner I c1579481

Re: Extracting Emoji from tweets in twitter

But I don't need one specific code. I am trying to study how emoji are used in tweets, so I expect all kinds of emoji. Does that mean I would have to add the Unicode code point of every emoji?

Thanks

RM Certified Expert

Re: Extracting Emoji from tweets in twitter

If you want to do text processing and extract the emoji and hashtags, you'll have to transform them into something that won't be destroyed during tokenization. For example, the smiley emoji is typically represented as ": )" (space and quotes added for clarity). If you use the default tokenization settings, it will be wiped out and you won't be able to extract information from it.

What I typically do is use a few Replace operators to replace the ": )" with "smiley_face" and "#myawesomehashtag" with "hashtag_myawesomehashtag". Then, when you tokenize the text, those tokens will survive the text processing.

 

<?xml version="1.0" encoding="UTF-8"?>
<process version="7.3.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.3.001" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="social_media:search_twitter" compatibility="7.3.000" expanded="true" height="68" name="Search Twitter" width="90" x="112" y="34">
        <parameter key="connection" value="ThomasOtt"/>
        <parameter key="query" value="love"/>
        <parameter key="language" value="en"/>
      </operator>
      <operator activated="true" class="replace" compatibility="7.3.001" expanded="true" height="82" name="Replace" width="90" x="246" y="34">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="Text"/>
        <parameter key="replace_what" value="\:\)"/>
        <parameter key="replace_by" value="smiley_face"/>
      </operator>
      <operator activated="true" class="replace" compatibility="7.3.001" expanded="true" height="82" name="Replace (2)" width="90" x="380" y="34">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="Text"/>
        <parameter key="replace_what" value="#(\w+)"/>
        <parameter key="replace_by" value="hashtag_$1"/>
      </operator>
      <connect from_op="Search Twitter" from_port="output" to_op="Replace" to_port="example set input"/>
      <connect from_op="Replace" from_port="example set output" to_op="Replace (2)" to_port="example set input"/>
      <connect from_op="Replace (2)" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
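
On the follow-up question about catching every emoji rather than one short code: the same idea can be sketched outside RapidMiner in Python. This is only a minimal sketch and not part of the process above. The code point ranges, the function names `extract_emoji` and `protect_tokens`, and the sample tweets are assumptions made for illustration. The ranges are not exhaustive, and emoji built from several code points (skin tones, joined sequences) come back as separate characters.

```python
import re
from collections import Counter

# Assumed, non-exhaustive code point ranges covering most common emoji.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, supplemental symbols
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "\U0001F1E6-\U0001F1FF"  # regional indicator letters used for flags
    "]"
)


def extract_emoji(text):
    """Return every character of `text` that falls in the assumed emoji ranges."""
    return EMOJI_PATTERN.findall(text)


def protect_tokens(text):
    """Rewrite emoticons and hashtags as plain word tokens so a tokenizer keeps them."""
    text = re.sub(r":\)", "smiley_face", text)
    # \w+ stops at the end of the hashtag instead of swallowing the rest of the tweet.
    text = re.sub(r"#(\w+)", r"hashtag_\1", text)
    return text


# Hypothetical sample tweets, written with escapes so the code points are explicit.
tweets = [
    "I love this \U0001F60D\U0001F60D :) #MondayMotivation",
    "So funny \U0001F602 #lol",
]

# Count how often each emoji occurs across all tweets.
counts = Counter(e for tweet in tweets for e in extract_emoji(tweet))

for tweet in tweets:
    print(protect_tokens(tweet))
print(counts.most_common())
```

Matching on ranges avoids listing each emoji by hand, at the cost of occasionally picking up a non-emoji symbol from the dingbats block.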