SOLVED: Simple question about multiple RSS feeds

dudester Member Posts: 15 Maven
edited November 2018 in Help
I notice the Read RSS Feed operator takes no inputs.  

I ask because I am trying to load a list of RSS feed URLs from Excel or a database, loop through the list (~80 entries), read each feed, and store each feed's result in a data sheet until all feeds have been read. This should be a simple loop (Loop Values operator), but so far I can do it only by entering each RSS feed URL manually, since the Read RSS Feed operator does not seem to accept any values or parameters passed to it.

Suggestions?

Answers

  • haddock Member Posts: 849 Maven
    Hi there,

    Though the little beastie takes no inputs it has to take notice of its parameters, and that's where you can slip in your macro, in a loop, like this...
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.003">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.2.003" expanded="true" name="Process">
        <process expanded="true" height="198" width="522">
          <operator activated="true" class="retrieve" compatibility="5.2.003" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
            <parameter key="repository_entry" value="//Samples/data/Iris"/>
          </operator>
          <operator activated="true" class="loop_values" compatibility="5.2.003" expanded="true" height="76" name="Loop Values" width="90" x="179" y="30">
            <parameter key="attribute" value="label"/>
            <process expanded="true" height="216" width="540">
              <operator activated="true" class="web:read_rss" compatibility="5.1.004" expanded="true" height="60" name="Read RSS Feed" width="90" x="112" y="30">
                <parameter key="url" value="http://www.rsssearchhub.com/feeds/?search=%{loop_value}&amp;action=Search+Feeds"/>
                <parameter key="random_user_agent" value="true"/>
              </operator>
              <connect from_op="Read RSS Feed" from_port="output" to_port="out 1"/>
              <portSpacing port="source_example set" spacing="0"/>
              <portSpacing port="sink_out 1" spacing="0"/>
              <portSpacing port="sink_out 2" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Retrieve" from_port="output" to_op="Loop Values" to_port="example set"/>
          <connect from_op="Loop Values" from_port="out 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    Hope that's in the right direction anyway.
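
    Applied to the original question, the same trick might look roughly like the sketch below. It is untested and rests on several assumptions: the file name feeds.xls is hypothetical, the sheet is assumed to have a single nominal column named "url" with one feed address per row, and Read Excel may need its import wizard run first so that the column actually gets that name. Loop Values then sets the macro %{loop_value} to each URL in turn, and Read RSS Feed picks it up through its url parameter.

    ```xml
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.003">
      <operator activated="true" class="process" compatibility="5.2.003" expanded="true" name="Process">
        <process expanded="true">
          <!-- Assumption: feeds.xls is a hypothetical file with one column named "url" -->
          <operator activated="true" class="read_excel" compatibility="5.2.003" expanded="true" name="Read Excel">
            <parameter key="excel_file" value="feeds.xls"/>
          </operator>
          <!-- Iterates over the distinct values of "url"; each value lands in %{loop_value} -->
          <operator activated="true" class="loop_values" compatibility="5.2.003" expanded="true" name="Loop Values">
            <parameter key="attribute" value="url"/>
            <process expanded="true">
              <operator activated="true" class="web:read_rss" compatibility="5.1.004" expanded="true" name="Read RSS Feed">
                <!-- The macro is substituted into the parameter before the operator runs -->
                <parameter key="url" value="%{loop_value}"/>
              </operator>
              <connect from_op="Read RSS Feed" from_port="output" to_port="out 1"/>
            </process>
          </operator>
          <connect from_op="Read Excel" from_port="output" to_op="Loop Values" to_port="example set"/>
          <connect from_op="Loop Values" from_port="out 1" to_port="result 1"/>
        </process>
      </operator>
    </process>
    ```

    The loop should deliver one example set per feed as a collection; if a single table is wanted, an Append operator after the loop is probably the simplest way to merge them, provided the feeds come back with the same attributes.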

    Regards
  • dudester Member Posts: 15 Maven
    Thanks for the lead, Haddock, but I think it's probably better to use a feed mashup/aggregator for now. Yahoo Pipes and similar services can combine the feeds, and the combined feed can then be passed to RapidMiner.

    I was hoping that RM could allow input into the RSS Reader process, but perhaps this will turn out to be a new feature request.  I wish I was better at Java/XML.


    (PS Love your reference to Tintin).