How to structure data for cluster analysis

Lottew44 Member Posts: 2 Learner I
edited April 2020 in Help
Hi everyone! 

I am an MBA student and I would like to cluster companies based on text taken from their websites, to see which companies are most similar. However, I don't know how to structure the data.

Would it be best to copy those texts into Excel cells (one cell per text)? Or how should I do this? I want to be able to tokenize and stem the text later on and then generate TF-IDF vectors.
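To make the question concrete, here is a minimal Python sketch of the "one row per company, one text field per row" layout, followed by tokenizing, stemming, and TF-IDF. It is an illustration under assumptions, not a recommended pipeline: the company names and texts are made up, `crude_stem` is a hypothetical stand-in for a real stemmer, and the TF-IDF weighting is the plain textbook form (term frequency times log of inverse document frequency). It uses cosine similarity to show which companies end up close to each other, which is the input a clustering step would work from.

```python
import math
import re
from collections import Counter

# Hypothetical data: one row per company, one text field per row.
# The same layout would work as two columns in a spreadsheet or CSV file.
rows = [
    {"company": "A", "text": "We build cloud software for banks."},
    {"company": "B", "text": "Cloud software and banking platforms."},
    {"company": "C", "text": "Organic coffee roasted fresh daily."},
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def crude_stem(token):
    """Rough stand-in for a real stemmer: strip a few common suffixes."""
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        term_freq = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

docs = [[crude_stem(t) for t in tokenize(row["text"])] for row in rows]
vectors = tfidf_vectors(docs)
names = [row["company"] for row in rows]

# Pairwise similarity between companies, keyed by (name, name).
similarity = {
    (a, b): cosine(vectors[i], vectors[j])
    for i, a in enumerate(names)
    for j, b in enumerate(names)
    if i < j
}

for pair, score in similarity.items():
    print(pair, round(score, 3))
```

In this toy data, companies A and B share the stemmed terms "cloud", "software", and "bank", so their similarity is above zero, while C shares no terms with either and scores zero against both.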

I also couldn't find an instructional video that runs a cluster analysis on text files. The ones I found only use Excel files with numerical and categorical variables, so if anyone knows a good tutorial, that would help too.

Thanks in advance!