Confused about the Query Expression blocks on the Extract Information Operator

sgtcrom7 Member Posts: 3 Contributor I
edited November 10 in Help

I'm trying to extract word counts from a block of text. I have the Create Document operator (where I pasted my text) linked to the Extract Information operator. I have entered the words I want to extract (terrorist and civilian) in the attribute name blocks. What should I be putting in the query expression blocks? Thanks.

Answers

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 575 Unicorn

    Hi @sgtcrom7,

     

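    Does this process meet your needs? It counts term occurrences with Process Documents and then keeps only the civilian and terrorist attributes with Select Attributes.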

     

    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.1.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="text:create_document" compatibility="8.1.000" expanded="true" height="68" name="Create Document" width="90" x="112" y="85">
    <parameter key="text" value="My taylor is rich. He's a terrorist, but before he was a civilian."/>
    </operator>
    <operator activated="true" class="text:process_documents" compatibility="8.1.000" expanded="true" height="103" name="Process Documents" width="90" x="313" y="85">
    <parameter key="vector_creation" value="Term Occurrences"/>
    <process expanded="true">
    <operator activated="true" class="text:tokenize" compatibility="8.1.000" expanded="true" height="68" name="Tokenize" width="90" x="313" y="34"/>
    <connect from_port="document" to_op="Tokenize" to_port="document"/>
    <connect from_op="Tokenize" from_port="document" to_port="document 1"/>
    <portSpacing port="source_document" spacing="0"/>
    <portSpacing port="sink_document 1" spacing="0"/>
    <portSpacing port="sink_document 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="8.1.001" expanded="true" height="82" name="Select Attributes" width="90" x="514" y="85">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attributes" value="civilian|terrorist"/>
    <parameter key="regular_expression" value="civilian"/>
    </operator>
    <connect from_op="Create Document" from_port="output" to_op="Process Documents" to_port="documents 1"/>
    <connect from_op="Process Documents" from_port="example set" to_op="Select Attributes" to_port="example set input"/>
    <connect from_op="Process Documents" from_port="word list" to_port="result 2"/>
    <connect from_op="Select Attributes" from_port="example set output" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    </process>
    </operator>
    </process>
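
For readers who want to see what the process above computes without opening RapidMiner, here is a minimal Python sketch of the same idea: split the text into tokens, count term occurrences, and keep only the two words of interest. It is an illustration, not RapidMiner code. The function name `term_occurrences` is hypothetical, and splitting on non-letter characters is an assumption about how the Tokenize operator behaves with its default settings.

```python
import re
from collections import Counter


def term_occurrences(text, wanted):
    """Count how often each word in `wanted` occurs in `text` (case-sensitive)."""
    # Assumption: tokens are runs of letters, so "He's" becomes "He" and "s".
    tokens = re.findall(r"[A-Za-z]+", text)
    counts = Counter(tokens)
    # Keep only the requested words, like the Select Attributes step.
    return {word: counts[word] for word in wanted}


text = "My taylor is rich. He's a terrorist, but before he was a civilian."
print(term_occurrences(text, ["civilian", "terrorist"]))
# prints {'civilian': 1, 'terrorist': 1}
```

Counting is case-sensitive here because the process above contains no case-transformation step.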

    Regards,

     

    Lionel

     

     

  • sgtcrom7 Member Posts: 3 Contributor I

    Lionel, 

    Thanks for the reply. I'm not sure whether this solves my problem. I just downloaded the free version of RapidMiner today, and I am only using the drag-and-drop functions, so I'm not sure where I would enter code like that. Sorry if I wasted your time. I really do appreciate you trying to help.

    I did eventually figure out how to get word counts out of my documents, so maybe as I keep playing with the software I can learn enough to ask better questions. What I need to do is actually very simple; I'm just trying to figure out how to make it less labor-intensive. Again, thanks.

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 575 Unicorn

    @sgtcrom7,

     

    A/ To use the code I provided, follow these steps (this is the usual way to share a process between RapidMiner users):

    1. Activate the XML panel (in RapidMiner Studio: View > Show Panel > XML):

    [Screenshot: Date_A_B.png]

    2. Copy the XML code I provided and paste it into the XML panel:

    [Screenshot: Date_A_B_2.png]

    3. Click the green check mark button:

    [Screenshot: Date_A_B_3.png]

    4. The process should then appear in the process window.

     

    B/ To learn the basics of RapidMiner, I encourage you to start with:

     - the tutorials (Help menu)

     - the training videos (Help menu)

    [Screenshot: Tutorials.png]

     

    I hope it helps,

     

    Regards,

     

    Lionel

  • sgtcrom7 Member Posts: 3 Contributor I

    The code didn't run, but thanks for explaining that to me. I'm making progress! 

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 575 Unicorn

    Hi @sgtcrom7,

     

    Glad you're making progress, but...


    "The code didn't run...."

     

    Here are some hypotheses:

     - did you copy all of the code I provided, and/or

     - did you clear the existing code in the XML panel before pasting mine? Here are the instructions:

    [Screenshot: Tutorial_1.png]
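
Both hypotheses can be checked outside RapidMiner before pasting. The following is a minimal Python sketch, assuming the copied text is available as a string. The function name `check_process_xml` is hypothetical, and the check only confirms that the text is one well-formed XML document whose root element is `<process>`; it does not validate the process itself.

```python
import xml.etree.ElementTree as ET


def check_process_xml(text):
    """Return (ok, message) for a copied RapidMiner process XML string."""
    try:
        # strip() removes indentation copied from the forum page, which
        # would otherwise precede the <?xml ...?> declaration.
        root = ET.fromstring(text.strip())
    except ET.ParseError as err:
        # Raised for a truncated copy, and also for two processes pasted
        # one after the other (old content not cleared first).
        return False, f"not well-formed XML: {err}"
    if root.tag != "process":
        return False, f"unexpected root element <{root.tag}>"
    return True, "looks like a complete process"


complete = ('<?xml version="1.0" encoding="UTF-8"?>'
            '<process version="8.1.001"><operator/></process>')
print(check_process_xml(complete))
print(check_process_xml(complete[:-10]))  # simulates an incomplete copy
```

If the first element of the returned tuple is `False`, copy the whole block again, from `<?xml` through the final `</process>`, into an empty XML panel.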

     

    "Never give up...."

    Don't hesitate to reply if it doesn't work.

     

    Regards,

     

    Lionel

     

     
