Predefined Topic lists

Manar · April 2019

Hello everyone,
I have a question.
I have predefined topic lists, each containing some words for one topic, and I want to use them to extract the most suitable topic for each Arabic document.

Is cosine similarity considered a good solution for this problem?

Or latent Dirichlet allocation (LDA)?

Could you please guide me on how to do that in RapidMiner?
Thanks.

Telcontar120 · April 2019

This is an interesting question.
@mschmitz is the resident expert on LDA (well, at least he wrote the operator), but I am pretty sure it is not going to help you here, because I don't think you can feed the LDA algorithm a predefined set of topics.

So I am not actually sure what the best way to accomplish this would be. I guess you could put together a wordlist with the words for each predefined cluster and then try to build a polynominal classification model, but that might not give you the output you really want. @mschmitz, do you have another approach you would recommend here?

P.S. I don't think the language is really an issue; it has more to do with the structure of the problem.
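
For readers wondering what the cosine similarity option from the question could look like against a wordlist per predefined topic, here is a minimal sketch in plain Python, outside RapidMiner. The topic names, the word lists, the whitespace tokenization, and the function names `cosine_similarity` and `best_topic` are all assumptions made for illustration; real Arabic text would likely need proper tokenization and stemming first.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document or empty word list matches nothing
    return dot / (norm_a * norm_b)


def best_topic(text: str, topic_words: dict) -> tuple:
    """Score a document against every topic word list and pick the highest."""
    # Assumption: lowercasing and splitting on whitespace is good enough here.
    doc_vec = Counter(text.lower().split())
    scores = {
        topic: cosine_similarity(doc_vec, Counter(words))
        for topic, words in topic_words.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical predefined topic lists.
topic_words = {
    "sports": ["match", "team", "goal", "player"],
    "economy": ["market", "bank", "price", "trade"],
}

topic, scores = best_topic("the team scored a late goal in the match", topic_words)
print(topic, scores)
```

Each document is simply assigned the topic whose word list it is most similar to, so no model has to be trained; whether that is accurate enough depends on how well the word lists separate the topics.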

MartinLiebig · April 2019

Hi,

So you have a word list that contains keywords, and the more of those keywords appear in a text, the more likely the text belongs to the topic?

That's not LDA.

Best,

Martin
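
The keyword-counting idea described in the post above (more matching keywords means a more likely topic) can be sketched in a few lines of Python. This is only an illustration outside RapidMiner; the keyword lists and the helper names `keyword_hits` and `rank_topics` are hypothetical, and whitespace tokenization is an assumption.

```python
def keyword_hits(text: str, keywords: list) -> int:
    """Count how many of the topic's keywords occur in the text."""
    tokens = set(text.lower().split())  # assumption: whitespace tokenization
    return sum(1 for keyword in keywords if keyword in tokens)


def rank_topics(text: str, topic_keywords: dict) -> list:
    """Return (topic, hit count) pairs, most keyword hits first."""
    hits = {topic: keyword_hits(text, kws) for topic, kws in topic_keywords.items()}
    return sorted(hits.items(), key=lambda item: item[1], reverse=True)


# Hypothetical keyword lists per predefined topic.
topic_keywords = {
    "sports": ["match", "team", "goal", "player"],
    "economy": ["market", "bank", "price", "trade"],
}

ranking = rank_topics("bank raises price as market falls", topic_keywords)
print(ranking)
```

Unlike LDA, nothing is learned here: the topics and their keywords are fixed in advance, and a document is only scored against them.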

Manar · April 2019

OK, thank you.

Altair RapidMiner Community