Custom polarity options in Aylien sentiment analysis

m_oke Member Posts: 11 Contributor I
edited December 2018 in Help

Is it possible to have custom polarity options in Aylien/RapidMiner's sentiment analysis, beyond the usual "positive, neutral, negative"?

 

Examples of custom options would be happy, sad, mild feeling, etc.

Best Answer

  • sgenzer Posts: 2,926  Community Manager
    Solution Accepted

    Outside of RapidMiner?  Why would I do that?  :)

    Here you go:

    <?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.6.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" breakpoints="after" class="text:read_document" compatibility="7.5.000" expanded="true" height="68" name="Read Document" width="90" x="45" y="34">
    <parameter key="file" value="/Users/genzerconsulting/Desktop/Scott's comments from 2006.txt"/>
    <description align="center" color="transparent" colored="false" width="126">get Scott's Comments from 2006 (Extraction)</description>
    </operator>
    <operator activated="true" breakpoints="after" class="subprocess" compatibility="7.6.001" expanded="true" height="82" name="Subprocess" width="90" x="179" y="34">
    <process expanded="true">
    <operator activated="true" class="text:documents_to_data" compatibility="7.5.000" expanded="true" height="82" name="Documents to Data" width="90" x="45" y="34">
    <parameter key="text_attribute" value="comment"/>
    <parameter key="add_meta_information" value="false"/>
    </operator>
    <operator activated="true" class="split" compatibility="7.6.001" expanded="true" height="82" name="Split" width="90" x="179" y="34">
    <parameter key="split_pattern" value="\n"/>
    </operator>
    <operator activated="true" class="transpose" compatibility="7.6.001" expanded="true" height="82" name="Transpose" width="90" x="313" y="34"/>
    <operator activated="true" class="rename" compatibility="7.6.001" expanded="true" height="82" name="Rename" width="90" x="447" y="34">
    <parameter key="old_name" value="att_1"/>
    <parameter key="new_name" value="comment"/>
    <list key="rename_additional_attributes"/>
    </operator>
    <operator activated="true" class="trim" compatibility="7.6.001" expanded="true" height="82" name="Trim" width="90" x="581" y="34"/>
    <operator activated="true" class="filter_examples" compatibility="7.6.001" expanded="true" height="103" name="Filter Examples" width="90" x="715" y="34">
    <parameter key="parameter_expression" value="matches(prefix(comment,1),&quot;[A-Z]&quot;)"/>
    <list key="filters_list">
    <parameter key="filters_entry_key" value="comment.is_not_missing."/>
    </list>
    </operator>
    <operator activated="true" class="subprocess" compatibility="7.6.001" expanded="true" height="82" name="Subprocess (2)" width="90" x="849" y="34">
    <process expanded="true">
    <operator activated="true" class="generate_id" compatibility="7.6.001" expanded="true" height="82" name="Generate ID" width="90" x="45" y="34">
    <parameter key="create_nominal_ids" value="true"/>
    </operator>
    <operator activated="true" class="replace" compatibility="7.6.001" expanded="true" height="82" name="Replace" width="90" x="179" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="id"/>
    <parameter key="include_special_attributes" value="true"/>
    <parameter key="replace_what" value="id_"/>
    </operator>
    <connect from_port="in 1" to_op="Generate ID" to_port="example set input"/>
    <connect from_op="Generate ID" from_port="example set output" to_op="Replace" to_port="example set input"/>
    <connect from_op="Replace" from_port="example set output" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="source_in 2" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">create IDs</description>
    </operator>
    <connect from_port="in 1" to_op="Documents to Data" to_port="documents 1"/>
    <connect from_op="Documents to Data" from_port="example set" to_op="Split" to_port="example set input"/>
    <connect from_op="Split" from_port="example set output" to_op="Transpose" to_port="example set input"/>
    <connect from_op="Transpose" from_port="example set output" to_op="Rename" to_port="example set input"/>
    <connect from_op="Rename" from_port="example set output" to_op="Trim" to_port="example set input"/>
    <connect from_op="Trim" from_port="example set output" to_op="Filter Examples" to_port="example set input"/>
    <connect from_op="Filter Examples" from_port="example set output" to_op="Subprocess (2)" to_port="in 1"/>
    <connect from_op="Subprocess (2)" from_port="out 1" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="source_in 2" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">clean them up (Transfer and Load)</description>
    </operator>
    <operator activated="true" class="filter_example_range" compatibility="7.6.001" expanded="true" height="82" name="Filter Example Range" width="90" x="313" y="34">
    <parameter key="first_example" value="1"/>
    <parameter key="last_example" value="3"/>
    </operator>
    <operator activated="true" breakpoints="after" class="subprocess" compatibility="7.6.001" expanded="true" height="82" name="Subprocess (8)" width="90" x="447" y="34">
    <process expanded="true">
    <operator activated="true" class="web:enrich_data_by_webservice" compatibility="7.3.000" expanded="true" height="68" name="Enrich Data by Webservice (9)" width="90" x="179" y="34">
    <parameter key="query_type" value="Indexed"/>
    <list key="string_machting_queries"/>
    <list key="regular_expression_queries">
    <parameter key="result" value=".*"/>
    </list>
    <list key="regular_region_queries">
    <parameter key="foo" value="document_tone.[-}][-}]"/>
    </list>
    <list key="xpath_queries"/>
    <list key="namespaces"/>
    <parameter key="assume_html" value="false"/>
    <list key="index_queries">
    <parameter key="result" value="0.999999"/>
    </list>
    <list key="jsonpath_queries">
    <parameter key="AngerScore" value="$.tones[?(@.tone_id=='anger')]"/>
    </list>
    <parameter key="request_method" value="POST"/>
    <parameter key="service_method" value="foo"/>
    <parameter key="body" value="body=&lt;%comment%&gt;"/>
    <parameter key="url" value="https://gateway.watsonplatform.net/tone-analyzer/api/v3/tone?version=2016-05-19"/>
    <list key="request_properties">
    <parameter key="username" value="username"/>
    <parameter key="password" value="password"/>
    <parameter key="Content-Type" value="text/plain"/>
    <parameter key="sentences" value="false"/>
    </list>
    <description align="center" color="transparent" colored="false" width="126">tone with POST</description>
    </operator>
    <operator activated="true" class="generate_attributes" compatibility="7.6.001" expanded="true" height="82" name="Generate Attributes (10)" width="90" x="313" y="34">
    <list key="function_descriptions">
    <parameter key="result" value="concat(result,&quot;}&quot;)"/>
    </list>
    <description align="center" color="transparent" colored="false" width="126">add stupid } to the end of the extract</description>
    </operator>
    <operator activated="true" class="subprocess" compatibility="7.6.001" expanded="true" height="82" name="Subprocess (12)" width="90" x="447" y="34">
    <process expanded="true">
    <operator activated="true" class="multiply" compatibility="7.6.001" expanded="true" height="103" name="Multiply (2)" width="90" x="45" y="34"/>
    <operator activated="true" class="loop_examples" compatibility="7.6.001" expanded="true" height="103" name="Loop Examples (4)" width="90" x="313" y="34">
    <process expanded="true">
    <operator activated="true" class="filter_example_range" compatibility="7.6.001" expanded="true" height="82" name="Filter Example Range (2)" width="90" x="45" y="34">
    <parameter key="first_example" value="%{example}"/>
    <parameter key="last_example" value="%{example}"/>
    </operator>
    <operator activated="true" class="text:data_to_documents" compatibility="7.5.000" expanded="true" height="68" name="Data to Documents (3)" width="90" x="380" y="34">
    <parameter key="select_attributes_and_weights" value="true"/>
    <list key="specify_weights">
    <parameter key="result" value="1.0"/>
    </list>
    </operator>
    <operator activated="true" class="text:combine_documents" compatibility="7.5.000" expanded="true" height="82" name="Combine Documents (3)" width="90" x="514" y="34"/>
    <operator activated="true" class="text:extract_information" compatibility="7.5.000" expanded="true" height="68" name="Extract Information (4)" width="90" x="648" y="34">
    <parameter key="query_type" value="JsonPath"/>
    <list key="string_machting_queries"/>
    <list key="regular_expression_queries"/>
    <list key="regular_region_queries"/>
    <list key="xpath_queries"/>
    <list key="namespaces"/>
    <list key="index_queries"/>
    <list key="jsonpath_queries">
    <parameter key="AngerScore" value="$.document_tone.tone_categories[0].tones[0].*"/>
    <parameter key="DisgustScore" value="$.document_tone.tone_categories[0].tones[1].*"/>
    <parameter key="FearScore" value="$.document_tone.tone_categories[0].tones[2].*"/>
    <parameter key="JoyScore" value="$.document_tone.tone_categories[0].tones[3].*"/>
    <parameter key="SadnessScore" value="$.document_tone.tone_categories[0].tones[4].*"/>
    <parameter key="AnalyticScore" value="$.document_tone.tone_categories[1].tones[0].*"/>
    <parameter key="ConfidenceScore" value="$.document_tone.tone_categories[1].tones[1].*"/>
    <parameter key="TentativeScore" value="$.document_tone.tone_categories[1].tones[2].*"/>
    <parameter key="OpennessScore" value="$.document_tone.tone_categories[2].tones[0].*"/>
    <parameter key="ConscientiousnessScore" value="$.document_tone.tone_categories[2].tones[1].*"/>
    <parameter key="ExtraversionScore" value="$.document_tone.tone_categories[2].tones[2].*"/>
    <parameter key="AgreeablenessScore" value="$.document_tone.tone_categories[2].tones[3].*"/>
    <parameter key="EmotionalRangeScore" value="$.document_tone.tone_categories[2].tones[4].*"/>
    </list>
    </operator>
    <operator activated="true" class="text:documents_to_data" compatibility="7.5.000" expanded="true" height="82" name="Documents to Data (4)" width="90" x="782" y="34">
    <parameter key="text_attribute" value="foo2"/>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes (4)" width="90" x="916" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="foo2"/>
    <parameter key="invert_selection" value="true"/>
    </operator>
    <connect from_port="example set" to_op="Filter Example Range (2)" to_port="example set input"/>
    <connect from_op="Filter Example Range (2)" from_port="example set output" to_op="Data to Documents (3)" to_port="example set"/>
    <connect from_op="Data to Documents (3)" from_port="documents" to_op="Combine Documents (3)" to_port="documents 1"/>
    <connect from_op="Combine Documents (3)" from_port="document" to_op="Extract Information (4)" to_port="document"/>
    <connect from_op="Extract Information (4)" from_port="document" to_op="Documents to Data (4)" to_port="documents 1"/>
    <connect from_op="Documents to Data (4)" from_port="example set" to_op="Select Attributes (4)" to_port="example set input"/>
    <connect from_op="Select Attributes (4)" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_example set" spacing="0"/>
    <portSpacing port="sink_example set" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="append" compatibility="7.6.001" expanded="true" height="82" name="Append (4)" width="90" x="447" y="34"/>
    <operator activated="true" class="generate_id" compatibility="7.6.001" expanded="true" height="82" name="Generate ID (4)" width="90" x="313" y="238"/>
    <operator activated="true" class="generate_id" compatibility="7.6.001" expanded="true" height="82" name="Generate ID (5)" width="90" x="581" y="34"/>
    <operator activated="true" class="join" compatibility="7.6.001" expanded="true" height="82" name="Join (3)" width="90" x="782" y="187">
    <list key="key_attributes"/>
    </operator>
    <connect from_port="in 1" to_op="Multiply (2)" to_port="input"/>
    <connect from_op="Multiply (2)" from_port="output 1" to_op="Loop Examples (4)" to_port="example set"/>
    <connect from_op="Multiply (2)" from_port="output 2" to_op="Generate ID (4)" to_port="example set input"/>
    <connect from_op="Loop Examples (4)" from_port="output 1" to_op="Append (4)" to_port="example set 1"/>
    <connect from_op="Append (4)" from_port="merged set" to_op="Generate ID (5)" to_port="example set input"/>
    <connect from_op="Generate ID (4)" from_port="example set output" to_op="Join (3)" to_port="left"/>
    <connect from_op="Generate ID (5)" from_port="example set output" to_op="Join (3)" to_port="right"/>
    <connect from_op="Join (3)" from_port="join" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="source_in 2" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">process Json data</description>
    </operator>
    <operator activated="true" class="subprocess" compatibility="7.6.001" expanded="true" height="82" name="Subprocess (13)" width="90" x="581" y="34">
    <process expanded="true">
    <operator activated="true" class="parse_numbers" compatibility="7.6.001" expanded="true" height="82" name="Parse Numbers (4)" width="90" x="45" y="34">
    <parameter key="attribute_filter_type" value="regular_expression"/>
    <parameter key="regular_expression" value=".*Score"/>
    <description align="center" color="transparent" colored="false" width="126">convert Scores to numerical</description>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes (7)" width="90" x="246" y="34">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attribute" value="id"/>
    <parameter key="attributes" value="result|id"/>
    <parameter key="invert_selection" value="true"/>
    <parameter key="include_special_attributes" value="true"/>
    </operator>
    <connect from_port="in 1" to_op="Parse Numbers (4)" to_port="example set input"/>
    <connect from_op="Parse Numbers (4)" from_port="example set output" to_op="Select Attributes (7)" to_port="example set input"/>
    <connect from_op="Select Attributes (7)" from_port="example set output" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="source_in 2" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">clean up</description>
    </operator>
    <connect from_port="in 1" to_op="Enrich Data by Webservice (9)" to_port="Example Set"/>
    <connect from_op="Enrich Data by Webservice (9)" from_port="ExampleSet" to_op="Generate Attributes (10)" to_port="example set input"/>
    <connect from_op="Generate Attributes (10)" from_port="example set output" to_op="Subprocess (12)" to_port="in 1"/>
    <connect from_op="Subprocess (12)" from_port="out 1" to_op="Subprocess (13)" to_port="in 1"/>
    <connect from_op="Subprocess (13)" from_port="out 1" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="source_in 2" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">IBM Watson - get comment &quot;tone&quot; scores</description>
    </operator>
    <operator activated="true" class="order_attributes" compatibility="7.6.001" expanded="true" height="82" name="Reorder Attributes" width="90" x="581" y="34">
    <parameter key="attribute_ordering" value="comment"/>
    </operator>
    <connect from_op="Read Document" from_port="output" to_op="Subprocess" to_port="in 1"/>
    <connect from_op="Subprocess" from_port="out 1" to_op="Filter Example Range" to_port="example set input"/>
    <connect from_op="Filter Example Range" from_port="example set output" to_op="Subprocess (8)" to_port="in 1"/>
    <connect from_op="Subprocess (8)" from_port="out 1" to_op="Reorder Attributes" to_port="example set input"/>
    <connect from_op="Reorder Attributes" from_port="example set output" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
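
For readers without RapidMiner at hand, here is a rough Python sketch of what the "Extract Information" step in the process above appears to do: walk `document_tone.tone_categories[*].tones[*]` and turn each tone into a named score column. The sample payload is hypothetical and only shaped after the JSONPath queries in the process; the `score` field name and the values are assumptions, and the lookup goes by `tone_id` instead of by list position.

```python
import json

# Hypothetical response, shaped after the JSONPath queries in the process
# (document_tone -> tone_categories -> tones). The scores are made-up values.
SAMPLE = json.loads("""
{"document_tone": {"tone_categories": [
  {"category_id": "emotion_tone", "tones": [
    {"tone_id": "anger", "score": 0.12},
    {"tone_id": "joy", "score": 0.81}]},
  {"category_id": "language_tone", "tones": [
    {"tone_id": "tentative", "score": 0.05}]}
]}}
""")


def flatten_tone_scores(response):
    """Return a dict such as {"AngerScore": 0.12, ...}, one entry per tone."""
    scores = {}
    for category in response["document_tone"]["tone_categories"]:
        for tone in category["tones"]:
            # Build the column name from the tone id, e.g. "anger" -> "AngerScore".
            words = tone["tone_id"].replace("_", " ").title().replace(" ", "")
            scores[words + "Score"] = float(tone["score"])
    return scores


scores = flatten_tone_scores(SAMPLE)
# The highest-scoring tone can serve as a custom label for the comment.
dominant = max(scores, key=scores.get)
print(scores, dominant)
```

Looking tones up by id keeps the result independent of the order in which the service lists them, which the index-based queries in the process rely on.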

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,408   Unicorn

    I doubt it, as this is a third-party tool designed for lots of different people with different use cases.

    You could either remap the existing polarities returned, or contact Aylien about developing a custom version for you.  Either that, or try building your own sentiment models in RapidMiner.

     

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
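
The remapping option mentioned above can be sketched in a few lines of Python. The label names in `CUSTOM_LABELS` are hypothetical, taken from the examples in the question, and this is only an illustration of the idea, not how Aylien or RapidMiner implement anything.

```python
# Hypothetical mapping from the standard polarity labels to custom ones.
CUSTOM_LABELS = {
    "positive": "happy",
    "neutral": "mild feeling",
    "negative": "sad",
}


def remap_polarity(polarity, mapping=CUSTOM_LABELS):
    """Translate a standard polarity label; unknown labels pass through unchanged."""
    key = polarity.strip().lower()
    return mapping.get(key, polarity)


results = [remap_polarity(p) for p in ["positive", "Negative", "neutral", "mixed"]]
print(results)
```

Note that a one-to-one remap only renames the three existing classes. Getting genuinely new categories would still need a different service or a custom model.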
  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,926  Community Manager

    Hi @m_oke - I would look into IBM Watson's Tone Analyzer API.  I use it all the time, and it will give you what you are looking for.


    Scott
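
As a rough idea of what a call to the Tone Analyzer looks like outside of RapidMiner, here is a Python sketch that only builds the HTTP request and does not send it. The URL, the POST method, and the `text/plain` content type are copied from the process posted in the accepted answer. The use of HTTP Basic authentication is an assumption, the credentials are placeholders, and the endpoint may no longer be served at that address.

```python
import base64
import urllib.request

# URL as it appears in the posted process; it may have changed since then.
TONE_URL = "https://gateway.watsonplatform.net/tone-analyzer/api/v3/tone?version=2016-05-19"


def build_tone_request(text, username, password, url=TONE_URL):
    """Build (but do not send) a POST request carrying one comment as plain text."""
    # Assumption: the service accepts HTTP Basic auth with these credentials.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain", "Authorization": f"Basic {token}"},
        method="POST",
    )


req = build_tone_request("Great effort this term.", "username", "password")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would actually send it, at which point real credentials are needed.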

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,408   Unicorn

    Nice, @sgenzer. Do you have a sample RM process for that?  Or are you using it outside of RM?

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,408   Unicorn

    Nice, I will check it out!

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • m_oke Member Posts: 11 Contributor I

    I will definitely check this out and give feedback. Thank you, @sgenzer.

  • m_oke Member Posts: 11 Contributor I

    @sgenzer I see that the document you used for the project is on your desktop.

    If there are no privacy issues, would you mind making the document publicly available?

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,926  Community Manager

    @m_oke haha.  There are no privacy issues as long as you don't kill yourself laughing too much.  The doc is an anonymized version of report card comments I gave my students back in 2006.  :)  They are attached.

     

    Scott

     
