Opinion Mining

sudheendra Member Posts: 22 Maven
Hi,

My objective is to extract opinions about a particular subject (for example, the scope of data mining) from social media such as Twitter and Facebook. How can I do this with RapidMiner? Can anyone suggest operators that are suitable for this?

Thanks,
Sudheendra

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    this is possible with the operators of the Text Plugin of RapidMiner 4.x, or with a combination of the Text Processing Extension and the still unpublished Web Extension of RapidMiner 5. The latter will give you easy access to all these pages, but we are still working on it.

    Greetings,
      Sebastian
  • sudheendra Member Posts: 22 Maven
    Hi Sebastian,

    I have tried the Crawler operator in RM 4.6. I would like to get opinions about global warming from Facebook, but whenever I run the process I get the error message "ArrayIndexOutOfBoundsException". The process is attached below. Can you suggest how to proceed?

    <operator name="Root" class="Process" expanded="yes">
        <operator name="Crawler" class="Crawler">
            <parameter key="url" value="http://www.facebook.com"/>
            <list key="crawling_rules">
              <parameter key="visit_url" value="http://www.facebook.com/search/?q=global+warming&amp;init=quick#/topic.php?uid=20009925629&amp;topic=13945"/>
              <parameter key="visit_content" value="global warming"/>
            </list>
            <parameter key="output_dir" value="C:\Documents and Settings\ADMIN\Desktop\qq"/>
            <parameter key="extension" value="html"/>
        </operator>
    </operator>


    Thanks,
    Sudheendra
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    the old crawler had many problems: these error messages were one, and the difficulty of tracing them back to a cause was another. So I can't really say why this happens. My guess is that not a single page was crawled because your rules don't apply correctly. Setting this up is a little bit tricky.
    Your visit_url value is not a regular expression and applies only to the starting page. It should be put into the url parameter instead.
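
    A minimal, untested sketch of that change, reusing only the operator and parameter names from the process posted above. It assumes that the full page address is accepted as the url value and that the visit_content rule can stay as it is; whether the old crawler can reach this Facebook page at all is not confirmed here.

    <operator name="Root" class="Process" expanded="yes">
        <operator name="Crawler" class="Crawler">
            <!-- Assumption: the address formerly given as visit_url is used as the start URL -->
            <parameter key="url" value="http://www.facebook.com/search/?q=global+warming&amp;init=quick#/topic.php?uid=20009925629&amp;topic=13945"/>
            <list key="crawling_rules">
              <!-- Unchanged from the original process: keep pages whose content matches this text -->
              <parameter key="visit_content" value="global warming"/>
            </list>
            <parameter key="output_dir" value="C:\Documents and Settings\ADMIN\Desktop\qq"/>
            <parameter key="extension" value="html"/>
        </operator>
    </operator>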

    Greetings,
      Sebastian
  • Jepse Member Posts: 11 Contributor II
    Hi!!

    @Sebastian: When will the Web Extension be released? Is there a beta available?

    Brgds Jepse
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    unfortunately not, but it will be published soon.

    Greetings,
      Sebastian