"Help with Word List Operator"
I am trying to add the WordList operator to this word vector process I am working on, but I cannot get it to work properly. I would appreciate any suggestions on how to implement it. I want to add it at the end of the process so that I get a list of each word with its count.
Thanks,
Ron McEwan
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.0.11" expanded="true" name="Process">
    <process expanded="true" height="431" width="413">
      <operator activated="true" class="web:get_webpage" compatibility="5.0.4" expanded="true" height="60" name="Get Page" width="90" x="55" y="46">
        <parameter key="url" value="http://seekingalpha.com/news/market_currents?source=refreshed"/>
        <list key="query_parameters"/>
      </operator>
      <operator activated="true" class="text:tokenize" compatibility="5.0.7" expanded="true" height="60" name="Tokenize" width="90" x="202" y="41"/>
      <operator activated="true" class="text:extract_length" compatibility="5.0.7" expanded="true" height="60" name="Extract Length" width="90" x="112" y="165"/>
      <operator activated="true" class="text:extract_token_number" compatibility="5.0.7" expanded="true" height="60" name="Extract Token Number" width="90" x="246" y="165"/>
      <connect from_op="Get Page" from_port="output" to_op="Tokenize" to_port="document"/>
      <connect from_op="Tokenize" from_port="document" to_op="Extract Length" to_port="document"/>
      <connect from_op="Extract Length" from_port="document" to_op="Extract Token Number" to_port="document"/>
      <connect from_op="Extract Token Number" from_port="document" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Answers
It seems you haven't created a word vector yet. You can do this simply with the "Process Documents" operator. If you only need the term occurrences, it's a very simple extension of your example code.
Regards,
Matthias
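For illustration, here is a minimal sketch of the kind of extension described above. It is not the process originally posted in the answer. It assumes the Text Processing extension exposes "Process Documents" under the class `text:process_documents`, that its `vector_creation` parameter accepts `Term Occurrences`, and that its ports are named `documents 1`, `example set`, and `word list`. These names, the `compatibility` values, and the layout coordinates are assumptions that may differ in your installed version, so check them in the operator's parameter panel before relying on them.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.0.11" expanded="true" name="Process">
    <process expanded="true" height="431" width="413">
      <!-- Unchanged from the question: fetch the page as a document -->
      <operator activated="true" class="web:get_webpage" compatibility="5.0.4" expanded="true" height="60" name="Get Page" width="90" x="55" y="46">
        <parameter key="url" value="http://seekingalpha.com/news/market_currents?source=refreshed"/>
        <list key="query_parameters"/>
      </operator>
      <!-- Assumed class name and parameter value: builds the word vector and the word list -->
      <operator activated="true" class="text:process_documents" compatibility="5.0.7" expanded="true" height="94" name="Process Documents" width="90" x="202" y="46">
        <parameter key="vector_creation" value="Term Occurrences"/>
        <process expanded="true">
          <!-- Tokenize now runs inside Process Documents, once per document -->
          <operator activated="true" class="text:tokenize" compatibility="5.0.7" expanded="true" height="60" name="Tokenize" width="90" x="45" y="30"/>
          <connect from_port="document" to_op="Tokenize" to_port="document"/>
          <connect from_op="Tokenize" from_port="document" to_port="document 1"/>
          <portSpacing port="source_document" spacing="0"/>
          <portSpacing port="sink_document 1" spacing="0"/>
          <portSpacing port="sink_document 2" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Get Page" from_port="output" to_op="Process Documents" to_port="documents 1"/>
      <!-- Assumed port names: the example set holds the term occurrences, the word list holds each word with its counts -->
      <connect from_op="Process Documents" from_port="example set" to_port="result 1"/>
      <connect from_op="Process Documents" from_port="word list" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
```

In this sketch the Tokenize operator moves inside Process Documents, and the Extract Length and Extract Token Number operators are left out to keep the example short. If the assumptions hold, the second result port should deliver the word list with one entry per word.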