Extract Topic from Data (LDA) operator and optimize hyperparameter settings

Chidi_Opara1 Member Posts: 1 Newbie
I am working on a text mining project using the Extract Topic from Data (LDA) operator, and I am finding it difficult to understand how to use the optimize hyperparameter settings.
Specifically, how do "Optimize interval for hyperparameter" and "iterations" affect the results?
I am currently using the default alpha and beta heuristics. What is the effect of changing these values?

Answers

  • jwpfau Employee, Member Posts: 264 RM Engineering
    In general, the number of sampling iterations should correlate with both model quality and runtime: more iterations tend to give a better model but take longer to run.

    Optimize interval for hyperparameter defines the number of iterations between hyperparameter optimizations. If you optimize too often (a low value), you might end up with instabilities due to alpha hyperparameters going to zero.
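To make the interaction between the two settings concrete, here is a minimal Python sketch that counts how often optimization would run for a given number of iterations. It rests on an assumption that is not confirmed for the operator itself: that re-estimation runs whenever the iteration count is a multiple of the interval, with no burn-in period. The function name `optimization_steps` is hypothetical.

```python
def optimization_steps(num_iterations, optimize_interval):
    """Return the iterations at which hyperparameters would be re-estimated.

    Assumption (not taken from the operator's documentation): re-estimation
    runs whenever the 1-based iteration count is a multiple of
    optimize_interval, and an interval of 0 or less disables it.
    """
    if optimize_interval <= 0:
        return []
    return [i for i in range(1, num_iterations + 1) if i % optimize_interval == 0]


# Compare a few intervals for a run of 1000 sampling iterations.
for interval in (10, 50, 200):
    steps = optimization_steps(1000, interval)
    print(f"interval={interval}: {len(steps)} optimizations, first at iteration {steps[0]}")
```

Under this assumption, a small interval means alpha and beta are re-estimated many times from only a few fresh sampling sweeps each, which is the situation the instability warning above refers to.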

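    α influences the number of topics per document (a lower α concentrates each document on fewer topics)
    β influences the number of words per topic (a lower β concentrates each topic on fewer words)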
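The effect of α can be illustrated with a small simulation that needs only the Python standard library. This is a sketch of the underlying Dirichlet prior, not of the operator's implementation: it draws per-document topic weights from a symmetric Dirichlet and counts how many topics receive a noticeable share. The helper names, the 10 topics, and the 5% threshold are arbitrary choices for illustration. The same reasoning applies to β and the words within a topic.

```python
import random


def sample_dirichlet(concentration, size, rng):
    """Draw one sample from a symmetric Dirichlet via normalized gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    if total == 0.0:
        # Guard against numerical underflow for very small concentrations.
        return [1.0 / size] * size
    return [d / total for d in draws]


def mean_active_components(concentration, size=10, threshold=0.05, trials=2000, seed=0):
    """Average number of components whose weight exceeds the threshold."""
    rng = random.Random(seed)
    active = 0
    for _ in range(trials):
        weights = sample_dirichlet(concentration, size, rng)
        active += sum(1 for w in weights if w > threshold)
    return active / trials


# Smaller alpha -> each simulated document is dominated by fewer topics.
for alpha in (0.1, 1.0, 10.0):
    avg = mean_active_components(alpha)
    print(f"alpha={alpha}: about {avg:.1f} of 10 topics above 5% weight")
```

The average number of active topics grows as alpha grows, which is why a low alpha yields documents that each focus on a handful of topics, while a high alpha spreads every document across most topics.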

    If you want to have some more sound knowledge about LDA:
    Or maybe @mschmitz can correct me
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,453 RM Data Scientist
    edited March 2020
    Wow, @jwpfau. I didn't know that you, as an engineer, are so much into DS. That's of course great!
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany