How to aggregate results
I'm processing textual data from several files and subdirectories. I need to know how frequently certain words/phrases occur and their occurrence density. I get the results as multiple objects (an IOObjectCollection from Loop Files) and don't know how to aggregate them for further processing.
My process is:
Loop files
>> Loop Zip-file entries
---->> Read Document > Process Documents (I got example set as output but no word list!!)
----------------------------------------------------------->>Tokenize > Filter Tokens
I tried to connect this to the Set Role operator, for example, but the attribute I'm looking for (the word I'm searching for) doesn't exist in some files, and I believe this is why I get "Attribute Not Found". Or maybe I'm missing something here. So, any clue how to aggregate such results?
Best Answers
sgenzer (Administrator, Community Manager)
Hi @Ayoube, the output of the Loop Files operator (and many other Loop operators) is a 'collection' of IOObjects. This is indicated by double wires at the output port.
For ExampleSets, you can combine them using Append if all the ExampleSets have exactly the same number and type of attributes. Otherwise I would use the "Union with Append" building block.
Scott
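The condition described above for Append can be illustrated outside RapidMiner. The following is only a minimal plain-Python sketch, not RapidMiner code: each ExampleSet is modeled as a list of dictionaries, and the names `schema` and `strict_append` are hypothetical helpers invented for this illustration. It shows why a collection whose members have different attributes cannot simply be concatenated.

```python
def schema(rows):
    """Map attribute name -> Python type name, taken from the first row."""
    return {name: type(value).__name__ for name, value in rows[0].items()}


def strict_append(tables):
    """Concatenate tables only if every table has the same schema."""
    reference = schema(tables[0])
    for index, table in enumerate(tables[1:], start=1):
        if schema(table) != reference:
            raise ValueError(
                f"table {index} has a different schema: {schema(table)}"
            )
    # All schemas match, so the rows can be stacked directly.
    return [row for table in tables for row in table]


# Hypothetical per-file results: word counts for the word "miner".
file_a = [{"file": "a.txt", "miner": 3}]
file_b = [{"file": "b.txt", "miner": 1}]
file_c = [{"file": "c.txt", "rapid": 2}]  # "miner" never occurred here

combined = strict_append([file_a, file_b])  # same attributes: works

try:
    strict_append([file_a, file_c])  # different attributes: rejected
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

In the word-count scenario from the question, a file that never contains the searched word produces a table without that attribute, which is the situation the third table mimics.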
tftemme (Employee, RM Research)
You can also use the new Append (SuperSet) operator from the Operator Toolbox extension (available since version 2.0). It can append ExampleSets with different attributes. It also handles the cases where the same attribute role (e.g. label) is assigned to differently named attributes, or where attributes with the same name have different types.
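To make the idea of appending tables with different attributes concrete, here is a small plain-Python sketch. It is not the operator's implementation: `superset_append` and its `fill` parameter are hypothetical names used only for illustration, and filling absent word counts with 0 is an assumption that suits this use case rather than documented operator behavior.

```python
def superset_append(tables, fill=None):
    """Append tables (lists of dicts) over the union of their attributes.

    Attributes missing from a row are filled with `fill`.
    """
    # Collect attribute names in first-seen order across all tables.
    names = []
    for table in tables:
        for row in table:
            for name in row:
                if name not in names:
                    names.append(name)
    # Rebuild every row so that it carries the full attribute set.
    return [
        {name: row.get(name, fill) for name in names}
        for table in tables
        for row in table
    ]


# Hypothetical per-file word counts; "rapid" is absent from b.txt.
file_a = [{"file": "a.txt", "rapid": 3, "miner": 1}]
file_b = [{"file": "b.txt", "miner": 2}]

merged = superset_append([file_a, file_b], fill=0)
```

After such a merge every row has the same attributes, so a later step that refers to a specific word attribute no longer fails for files in which that word never occurred.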