RapidMiner

Confused about the Query Expression blocks on the Extract Information Operator

Newbie sgtcrom7


I'm trying to extract word counts from a block of text. I have the Create Document Operator (where I have pasted my text) linked to the Extract Information Operator. I have entered the words I want to extract (terrorist and civilian) into the attribute name blocks. What should I be putting in the query expression blocks? Thanks.

5 REPLIES

Re: Confused about the Query Expression blocks on the Extract Information Operator

Hi @sgtcrom7,

 

Does this process meet your needs?

 

<?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="8.1.001" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="text:create_document" compatibility="8.1.000" expanded="true" height="68" name="Create Document" width="90" x="112" y="85">
        <parameter key="text" value="My taylor is rich. He's a terrorist, but before he was a civilian."/>
      </operator>
      <operator activated="true" class="text:process_documents" compatibility="8.1.000" expanded="true" height="103" name="Process Documents" width="90" x="313" y="85">
        <parameter key="vector_creation" value="Term Occurrences"/>
        <process expanded="true">
          <operator activated="true" class="text:tokenize" compatibility="8.1.000" expanded="true" height="68" name="Tokenize" width="90" x="313" y="34"/>
          <connect from_port="document" to_op="Tokenize" to_port="document"/>
          <connect from_op="Tokenize" from_port="document" to_port="document 1"/>
          <portSpacing port="source_document" spacing="0"/>
          <portSpacing port="sink_document 1" spacing="0"/>
          <portSpacing port="sink_document 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="select_attributes" compatibility="8.1.001" expanded="true" height="82" name="Select Attributes" width="90" x="514" y="85">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attributes" value="civilian|terrorist"/>
        <parameter key="regular_expression" value="civilian"/>
      </operator>
      <connect from_op="Create Document" from_port="output" to_op="Process Documents" to_port="documents 1"/>
      <connect from_op="Process Documents" from_port="example set" to_op="Select Attributes" to_port="example set input"/>
      <connect from_op="Process Documents" from_port="word list" to_port="result 2"/>
      <connect from_op="Select Attributes" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
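
For readers who want to see what the process above computes without opening RapidMiner, here is a rough Python sketch of the same idea: split the text into tokens, count each token, and keep only the two words of interest. It is not RapidMiner code. The function name `term_occurrences` is made up for this illustration, and splitting on non-letter characters is an assumption about how the Tokenize operator behaves in its non-letters mode.

```python
import re
from collections import Counter


def term_occurrences(text, wanted):
    """Count how often each word in `wanted` occurs in `text`.

    Hypothetical helper for illustration only. Matching is case-sensitive,
    so "Terrorist" and "terrorist" would be counted separately.
    """
    # Split on any run of non-letter characters (assumed to approximate
    # the Tokenize operator's non-letters mode) and drop empty pieces.
    tokens = [t for t in re.split(r"[^A-Za-z]+", text) if t]
    counts = Counter(tokens)
    # Keep only the requested words, like Select Attributes with a subset.
    return {word: counts[word] for word in wanted}


text = "My taylor is rich. He's a terrorist, but before he was a civilian."
print(term_occurrences(text, ["civilian", "terrorist"]))
# prints {'civilian': 1, 'terrorist': 1}
```

A word that never appears in the text gets a count of 0 rather than raising an error, because `Counter` returns 0 for missing keys.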

Regards,

 

Lionel

 

 

Newbie sgtcrom7

Re: Confused about the Query Expression blocks on the Extract Information Operator

Lionel, 

Thanks for the reply. I'm not sure if this solves my problem or not. I just downloaded the free version of RapidMiner today, and I am only using the drag and drop functions. So I'm actually not sure where I would enter code like that. Sorry if I wasted your time. I really do appreciate you trying to help.

I did eventually figure out how to get word counts out of my documents, however, so maybe as I keep playing with the software I can learn enough to ask better questions. What I need to do is actually very simple; I'm just trying to figure out how to make it less labor intensive. Again, thanks.

Re: Confused about the Query Expression blocks on the Extract Information Operator

@sgtcrom7,

 

A/ To use the code I provided, follow these steps (this is the usual way to share a process between RapidMiner users):

1. Activate the XML panel (View > Show Panel > XML):

[Screenshot: Date_A_B.png]

2. Copy the XML code I provided and paste it into the XML panel:

[Screenshot: Date_A_B_2.png]

3. Click the check mark button to apply the XML:

[Screenshot: Date_A_B_3.png]

4. The process should then appear in the Process panel.

 

B/ To learn the basics of RapidMiner, I encourage you to start with:

 - the tutorials (Help menu)

 - the training videos (Help menu)

[Screenshot: Tutorials.png]

 

I hope it helps,

 

Regards,

 

Lionel

Newbie sgtcrom7

Re: Confused about the Query Expression blocks on the Extract Information Operator

The code didn't run, but thanks for explaining that to me. I'm making progress! 

Re: Confused about the Query Expression blocks on the Extract Information Operator

Hi @sgtcrom7,

 

Glad you're making progress, but...


"The code didn't run...."

 

Here are some hypotheses:

 - did you copy all of the code I provided, and/or

 - did you clear the existing code before pasting my code into the XML panel? This screenshot shows how:

[Screenshot: Tutorial_1.png]
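
Both hypotheses above can also be checked outside RapidMiner. The following is a minimal Python sketch, not a RapidMiner feature: it parses the pasted text with the standard library and reports whether it is a single, complete `<process>` document. The function name `check_process_xml` is invented for this example. A truncated copy, or a new process pasted after an old one that was not cleared, should both fail to parse.

```python
import xml.etree.ElementTree as ET


def check_process_xml(xml_text):
    """Return (ok, message) for text copied from a shared process.

    Hypothetical helper for illustration; it checks only that the text is
    well-formed XML with a <process> root, not that RapidMiner accepts it.
    """
    try:
        # strip() matters: whitespace before the <?xml ...?> declaration
        # is itself a parse error.
        root = ET.fromstring(xml_text.strip())
    except ET.ParseError as err:
        # Raised for a truncated copy and for two documents pasted together.
        return False, f"not well-formed XML: {err}"
    if root.tag != "process":
        return False, f"unexpected root element <{root.tag}>"
    return True, "looks like one complete process"


pasted = '<?xml version="1.0" encoding="UTF-8"?><process version="8.1.001"><context/></process>'
print(check_process_xml(pasted))
```

If the check fails, the simplest fix is to clear the XML panel completely and paste the whole block again, from the `<?xml` line through the final `</process>`.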

 

"Never give up..."

Don't hesitate to reply if it doesn't work.

 

Regards,

 

Lionel