[Solved] Process Documents from Data
Hi,
I am retrieving data from a database table with the columns ID and TEXT,
but when I pass the data to the Process Documents from Data operator, the output contains only the row number, TEXT, and the word counts as columns.
I need the ID column as well, so that I can update the database for the same IDs.
Please let me know how to keep the ID column as an attribute throughout the process.
Regards,
Ankit
Answers
Adding a Set Role operator before Process Documents from Data and assigning the id role to the ID attribute should do the job.
Best,
Marius
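The idea behind the suggestion above can be sketched outside RapidMiner in plain Python: the identifier is treated as a special column that is carried along untouched while only the text is turned into word counts. This is an illustrative analogy, not RapidMiner code; the function name `word_vectors_with_id`, the simple tokenizer, and the sample rows are all assumptions made for the example.

```python
import re
from collections import Counter


def word_vectors_with_id(rows):
    """Build word-count records from (id, text) pairs, keeping the id.

    Hypothetical helper for illustration only; it mimics the effect of
    marking a column as an id so it survives text vectorization.
    """
    # Tokenize each text into lowercase alphabetic words, keeping the id alongside.
    tokenized = [(row_id, re.findall(r"[a-z]+", text.lower())) for row_id, text in rows]

    # The shared vocabulary becomes the set of word-count columns.
    vocabulary = sorted({token for _, tokens in tokenized for token in tokens})

    records = []
    for row_id, tokens in tokenized:
        counts = Counter(tokens)
        record = {"ID": row_id}  # the id is passed through, never vectorized
        record.update({word: counts[word] for word in vocabulary})
        records.append(record)
    return records


# Hypothetical sample rows standing in for the ID and TEXT columns.
rows = [(101, "Text mining with text"), (102, "Mining data")]
vectors = word_vectors_with_id(rows)
for record in vectors:
    print(record)
```

Because every output record still carries its ID, the results can be matched back to the original database rows for an update.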
I am reading from a database table that has an ID column using the Read Database operator. Its output goes to the Set Role operator, and the output of Set Role goes to Process Documents, in which tokenizing, stemming, filtering, and n-gram generation are performed.
I am still not getting the ID column at the output of Process Documents.
Regards,
Ankit
Please post your process setup; there must be something wrong with it. For me, the output of Process Documents definitely still contains the id column. Also, please update both RapidMiner and the Text extension to the latest versions.
Best, Marius
Regards,
Ankit