How do I access/use attribute values?

mgstick Member Posts: 5 Contributor II
edited November 2018 in Help
Hi,

I'm trying to process a collection of news articles retrieved from an RSS feed. As part of the processing I need to save the content of each article to a file with a unique file name for each article. Since I'm using the Read RSS Feed operator, there isn't a built-in "write pages into files" parameter (like there is in the Crawl Web operator) to accomplish this save operation. So, I'm trying to use the Write Document operator embedded in a Process Documents sub-process to write the individual article files to disk.

The problem I'm having is that I don't know how to provide a unique value to the "file" parameter of the Write Document operator so that each Document in my Document Collection gets saved to disk under a unique file name, i.e. [Document ID].txt.

Is there a way to specify the value of an attribute to be used as part of the value in an operator's parameter? Or is there a better way to write a collection of documents to disk where each document gets saved in a separate file?

Thanks,
Michael

Answers

  • SebastianLoh Member Posts: 99 Contributor II
    Hi Michael,

    The Extract Macro and Loop Examples operators are your friends. You can loop over the example set, then use the Extract Macro operator inside the loop to extract the value of an attribute (e.g. "Filename") into a macro (e.g. named "filename"), and then write the document with Write Document to the location C:\Docs\%{filename}_%{example}.txt

    The second macro, %{example}, used in the "file" parameter, holds the running number of the loop iteration. So you'll get:

    Football_1.txt
    Baseball_2.txt
    ...

    At the output of the Loop Examples operator, you might have to use an aggregate operator to keep going with the modified example set.
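
To make the naming scheme concrete, here is a minimal Python sketch of the same idea outside RapidMiner. It is only an analogy, not RapidMiner's API: the helper names `expand_macros` and `write_documents`, the "text" field, and the temporary output directory are all hypothetical. It loops over the examples, puts one attribute value into a "filename" macro, sets "example" to the 1-based iteration number, and expands %{...} placeholders in the file name template.

```python
import re
import tempfile
from pathlib import Path

# Matches %{name} placeholders, mimicking the macro syntax used above.
MACRO_PATTERN = re.compile(r"%\{(\w+)\}")


def expand_macros(template, macros):
    """Replace each %{name} with macros[name]; unknown macros are left as-is."""
    return MACRO_PATTERN.sub(
        lambda m: str(macros.get(m.group(1), m.group(0))), template
    )


def write_documents(examples, out_dir, template="%{filename}_%{example}.txt"):
    """Write one file per example (hypothetical stand-in for the loop)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # The iteration counter starts at 1, matching Football_1.txt above.
    for index, example in enumerate(examples, start=1):
        macros = {
            "filename": example["Filename"],  # value pulled from an attribute
            "example": index,                 # running loop iteration number
        }
        path = out_dir / expand_macros(template, macros)
        path.write_text(example["text"], encoding="utf-8")
        written.append(path)
    return written


if __name__ == "__main__":
    # Made-up sample data; "text" is an assumed attribute holding the article.
    articles = [
        {"Filename": "Football", "text": "First article body"},
        {"Filename": "Baseball", "text": "Second article body"},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        for p in write_documents(articles, tmp):
            print(p.name)
```

Appending the iteration number keeps the file names unique even when two examples share the same attribute value, which is presumably why both macros appear in the suggested path.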

    Also search for "macro" and "loop" in the myExperiment view to see some examples of macros and loops. If you like, post your solution on myExperiment to share it with other users.

    P.S. With the Generate Macro operator you can even construct more complex macro values.

    Ciao Sebastian