Predefined Topic lists
Hello everyone,
I have a question.
1- I have predefined topic lists, each containing some words, and I want to use them to extract the most suitable topic for each Arabic document.
Is cosine similarity considered a good solution for this problem, or would latent Dirichlet allocation (LDA) be a better fit?
Could you please guide me on how to do that in RapidMiner?
Thanks.
Answers
@mschmitz is the resident expert on LDA (well, at least he wrote the operator), but I am pretty sure LDA is not going to help you here, because I don't think you can feed the LDA algorithm a predefined set of topics.
So I am not actually sure what the best way to accomplish this would be. I guess you could put together a wordlist with the words for each predefined cluster and then try to build a polynominal classification model, but that might not give you the output you really want. @mschmitz, do you have another approach you would recommend here?
P.S. I don't think the language is really an issue; it has more to do with the structure of the problem.
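For reference, the cosine similarity idea from the question can be sketched outside RapidMiner in a few lines of plain Python. This is only a rough illustration, not a RapidMiner process and not a recommendation from anyone in this thread: the topic lists, the `best_topic` helper, and the whitespace tokenization are all assumptions made up for the example. The words are English placeholders; real Arabic text would likely need proper tokenization and stemming first.

```python
import math
from collections import Counter

# Hypothetical predefined topic lists (illustrative only, not from the thread).
TOPIC_WORDS = {
    "sports": ["match", "team", "goal", "player"],
    "economy": ["bank", "market", "price", "trade"],
}


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    shared = a.keys() & b.keys()
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector has no direction to compare
    return dot / (norm_a * norm_b)


def best_topic(text, topic_words=TOPIC_WORDS):
    """Return (topic, scores); topic is None if no topic word occurs in the text."""
    # Naive whitespace tokenization: an assumption for this sketch only.
    doc_vec = Counter(text.lower().split())
    scores = {
        topic: cosine(doc_vec, Counter(words))
        for topic, words in topic_words.items()
    }
    top = max(scores, key=scores.get)
    return (top if scores[top] > 0 else None), scores


if __name__ == "__main__":
    label, scores = best_topic("the team scored a late goal in the match")
    print(label, scores)
```

Each document is compared against every topic's wordlist and assigned to the topic with the highest score. A document that shares no words with any list gets no topic, which is one way to see how much the result depends on the quality of the predefined lists.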
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
Dortmund, Germany