"LDA Topic Modeling - Topics description after Applying LDA Model on new documents"

svtorykh Member Posts: 35 Guru
edited June 2019 in Help

Happy Friday RM team!

 

I have the following issue: I run the LDA Topic Modeling process and can get topic descriptions from the Top output (words for topics) of the LDA process. I can also save the generated LDA model. However, once I apply this LDA model to a set of new documents, I cannot tell which topic means what. I suspect that the topic IDs in my original topic descriptions, produced when the model was generated, do not match the topic IDs produced when the model is applied to the new document set. Can you please clarify? Is it possible to get the same topic IDs? I also use the reproducible and random seed options when the model is generated in the original LDA process. Thanks!

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,510 RM Data Scientist

    Hi @svtorykh,

     

    I think applying the model should yield exactly the same topic ID -> topic "meaning" relationships. Anything else would really surprise me. Could you create an example that demonstrates this?
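
If the topic IDs do stay stable, the descriptions saved at training time can simply be looked up by topic ID for each new document. A minimal Python sketch of that lookup, not RapidMiner code: the topic words, document names, confidence values, and the `label_documents` helper are all hypothetical and only illustrate the idea.

```python
# Hypothetical top words per topic ID, as saved when the model was trained.
topic_descriptions = {
    0: ["price", "market", "stock"],
    1: ["team", "game", "season"],
    2: ["patient", "treatment", "doctor"],
}

# Hypothetical result of applying the saved model to new documents:
# one row per document, one confidence value per topic ID.
applied = [
    {"doc": "new_doc_1", "confidences": {0: 0.08, 1: 0.81, 2: 0.11}},
    {"doc": "new_doc_2", "confidences": {0: 0.62, 1: 0.05, 2: 0.33}},
]


def label_documents(rows, descriptions):
    """Attach the training-time description of the most likely topic to each row."""
    labeled = []
    for row in rows:
        confidences = row["confidences"]
        # Pick the topic ID with the highest confidence for this document.
        best_topic = max(confidences, key=confidences.get)
        labeled.append(
            {
                "doc": row["doc"],
                "topic_id": best_topic,
                "top_words": descriptions[best_topic],
            }
        )
    return labeled


for item in label_documents(applied, topic_descriptions):
    print(item["doc"], item["topic_id"], item["top_words"])
```

This only gives meaningful labels if topic ID 1 in the applied output is the same topic as ID 1 in the saved descriptions, which is exactly the relationship in question here.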

    Note: If you apply the model to the training data, it is normal for the derived probabilities to differ slightly from the original ones. This is due to the Gibbs sampling algorithm, which is randomized.
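
To illustrate why the probabilities can shift while the topic IDs keep their meaning, here is a small Python sketch of Gibbs sampling for a single document with the topics held fixed. It is not RapidMiner's implementation: the two topics, their word probabilities, the `ALPHA` prior, and the `infer_topic_mixture` helper are assumptions made up for this example.

```python
import random

# Hypothetical trained model: P(word | topic) for two topics, fixed after training.
TOPIC_WORD = [
    {"stock": 0.45, "market": 0.45, "game": 0.05, "team": 0.05},  # topic 0
    {"stock": 0.05, "market": 0.05, "game": 0.45, "team": 0.45},  # topic 1
]
ALPHA = 0.5  # assumed symmetric Dirichlet prior on a document's topic mixture


def infer_topic_mixture(words, seed, sweeps=200, burn_in=50):
    """Estimate a document's topic proportions by Gibbs sampling, topics held fixed."""
    rng = random.Random(seed)
    n_topics = len(TOPIC_WORD)
    # Start from a random topic assignment for every token.
    assignments = [rng.randrange(n_topics) for _ in words]
    counts = [assignments.count(k) for k in range(n_topics)]
    totals = [0.0] * n_topics
    for sweep in range(sweeps):
        for i, word in enumerate(words):
            # Remove this token's current assignment before resampling it.
            counts[assignments[i]] -= 1
            # Conditional weight of each topic for this token.
            weights = [
                (counts[k] + ALPHA) * TOPIC_WORD[k][word] for k in range(n_topics)
            ]
            assignments[i] = rng.choices(range(n_topics), weights=weights)[0]
            counts[assignments[i]] += 1
        if sweep >= burn_in:
            # Accumulate smoothed counts after the burn-in sweeps.
            for k in range(n_topics):
                totals[k] += counts[k] + ALPHA
    norm = sum(totals)
    return [t / norm for t in totals]


doc = ["stock", "market", "market", "stock", "game", "market"]
run_a = infer_topic_mixture(doc, seed=1)
run_b = infer_topic_mixture(doc, seed=2)
print("seed 1:", run_a)
print("seed 2:", run_b)
```

In both runs topic 0 stays the "stock/market" topic because the word distributions never change; only the sampled proportions typically differ a little between seeds. Using the same seed twice reproduces the same proportions in this sketch.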

     

    Best,

    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany