How do I access/use attribute values?

mgstick Member Posts: 5 Contributor II
edited November 2018 in Help
Hi,

I'm trying to process a collection of news articles retrieved from an RSS feed. As part of the processing, I need to save the content of each article to a file with a unique file name. Since I'm using the Read RSS Feed operator, there isn't a built-in "write pages into files" parameter (like there is in the Crawl Web operator) to do this. So I'm trying to use the Write Document operator, embedded in a Process Documents sub-process, to write the individual article files to disk.

The problem is that I don't know how to provide a unique value to the "file" parameter of the Write Document operator, so that each Document in my Document Collection gets saved to disk under a unique file name, e.g. [Document ID].txt.

Is there a way to specify the value of an attribute to be used as part of the value in an operator's parameter? Or is there a better way to write a collection of documents to disk where each document gets saved in a separate file?

Thanks,
Michael

Answers

  • SebastianLoh Member Posts: 99 Contributor II
    Hi Michael,

    The Extract Macro and Loop Examples operators are your friends. You can loop over the example set, then use the Extract Macro operator inside the loop to extract the value of an attribute (e.g. "Filename") into a macro (e.g. named "filename"), and then write the document with Write Document to the location C:\Docs\%{filename}_%{example}.txt

    The second macro, %{example}, which is used in the file name, holds the running number of the loop iteration. So you'll get

    Football_1.txt
    Baseball_2.txt
    ...

    At the output of the Loop Examples operator you might have to use an Aggregate operator to keep going with the modified example set.

    Also search for "macro" and "loop" in the myExperiment view to see some examples of macros and loops. If you like, post your solution on myExperiment to share it with other users.

    P.S. With the Generate Macro operator you can even construct more complex macro values.

    Ciao Sebastian
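
For readers who want to see the naming scheme outside RapidMiner, here is a minimal Python sketch of the same idea: loop over the documents, take a per-document name (the role of the %{filename} macro) and a running counter (the role of %{example}), and combine them into a unique file name. This is only an analogy, not RapidMiner code; the function name write_documents and the (name, text) input format are assumptions made for illustration.

```python
from pathlib import Path
import tempfile


def write_documents(documents, out_dir):
    """Write each (name, text) pair to out_dir/<name>_<n>.txt and return the paths.

    Hypothetical helper: `documents` is assumed to be an iterable of
    (name, text) tuples, standing in for the examples of an example set.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # The counter n starts at 1 and plays the role of the loop macro %{example};
    # `name` plays the role of the extracted %{filename} macro.
    for n, (name, text) in enumerate(documents, start=1):
        path = out_dir / f"{name}_{n}.txt"
        path.write_text(text, encoding="utf-8")
        written.append(path)
    return written


if __name__ == "__main__":
    # Write two sample documents into a temporary directory and list the file names.
    with tempfile.TemporaryDirectory() as tmp:
        docs = [("Football", "first article"), ("Baseball", "second article")]
        for p in write_documents(docs, tmp):
            print(p.name)
```

Appending the counter keeps file names unique even when two documents share the same name value, which is the reason for combining both macros in the path.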