Clustering a dataset that has both structured and unstructured data

dranammari Member Posts: 13 Contributor II
edited November 2019 in Help
Hi all,

I want to build a clustering process in RapidMiner for a dataset that contains both structured and unstructured data. For example, in my database I have one table that contains the user's age, location, and gender (structured data). In another table, I have the user's comments (text, unstructured data).

I already know how to build a clustering process for each table separately. The two cases differ, of course, because the second one is document clustering, which needs some extra operators, such as "Data to Documents" to turn the table records into documents and "Process Documents" for text pre-processing. But that is not what I am after. What I want is to give the clustering operator (say, k-means) a single dataset that has both the structured and the unstructured data, where the unstructured part has of course been text pre-processed first.
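To make the goal concrete, here is a rough sketch in plain Python (not RapidMiner) of the kind of combined dataset I mean. Everything in it is an assumption made for illustration: the field names, the made-up sample users, min-max scaling for age, one-hot encoding for location and gender, a simple TF-IDF variant for the comments, and a bare-bones k-means that uses the first k rows as starting centroids.

```python
import math
import re
from collections import Counter


def min_max(values):
    """Scale numbers to [0, 1] so no single column dominates the distance."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    return [[(v - lo) / span] for v in values]


def one_hot(name, values):
    """Turn a nominal column into one 0/1 column per category."""
    cats = sorted(set(values))
    header = [f"{name}={c}" for c in cats]
    return header, [[1.0 if v == c else 0.0 for c in cats] for v in values]


def tfidf(texts):
    """Tokenize the comments and weight each term by a simple TF-IDF."""
    docs = [Counter(re.findall(r"[a-z]+", t.lower())) for t in texts]
    vocab = sorted({term for doc in docs for term in doc})
    n = len(docs)
    idf = {t: math.log(n / sum(1 for d in docs if t in d)) + 1.0 for t in vocab}
    rows = []
    for doc in docs:
        total = sum(doc.values()) or 1
        rows.append([doc[t] / total * idf[t] for t in vocab])
    return [f"text:{t}" for t in vocab], rows


def build_feature_table(users):
    """Join the structured columns and the text columns into one numeric table."""
    age = min_max([u["age"] for u in users])
    loc_names, loc = one_hot("location", [u["location"] for u in users])
    gen_names, gen = one_hot("gender", [u["gender"] for u in users])
    txt_names, txt = tfidf([u["comment"] for u in users])
    names = ["age"] + loc_names + gen_names + txt_names
    rows = [a + l + g + t for a, l, g, t in zip(age, loc, gen, txt)]
    return names, rows


def kmeans(rows, k, iterations=20):
    """Plain k-means; the first k rows serve as the initial centroids."""
    centroids = [list(r) for r in rows[:k]]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assign every row to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(r, centroids[c])))
            for r in rows
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up sample users, already joined per user across the two tables.
users = [
    {"age": 23, "location": "Cairo", "gender": "F", "comment": "Great product, fast delivery"},
    {"age": 54, "location": "Amman", "gender": "M", "comment": "Delivery was slow and support was poor"},
    {"age": 25, "location": "Cairo", "gender": "F", "comment": "Fast delivery, great price"},
    {"age": 51, "location": "Amman", "gender": "M", "comment": "Poor support, slow refund"},
]

names, rows = build_feature_table(users)
labels = kmeans(rows, k=2)
print(names)
print(labels)
```

So each user becomes one row: the structured attributes and the pre-processed text features sit side by side, and the clustering step sees them as a single example set. How to weight the text columns against the structured ones is an open choice in this sketch.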

Your help is highly appreciated.