Is there a way to measure the performance of a Word2Vec model?

Christos_Karapapas Member Posts: 25 Contributor II
edited January 2020 in Help
I am using the word2vec extension to train a model for polynominal (multi-class) text classification.
None of the standard Performance operators seem to "stack" with the word2vec model (RMWord2VecModel).

Is there any way that I could measure the performance of the model on the training dataset?

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist
    Good question! Word2Vec is usually evaluated either empirically (words that are close in the embedding space should be close in meaning, e.g. synonyms) or through downstream tasks such as entity recognition.

    Cheers,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • Christos_Karapapas Member Posts: 25 Contributor II
    So, if I understand correctly, it's just another way of finding synonyms, much like grouping words by their root (lemmas)?

    And if so, why does it export a model? So that it can be used in later processes with different datasets and still find the lemma of a word?
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Yes, it is more of a processing model than a predictive model. Most unsupervised ML approaches don't have the same kind of performance measurement that you are used to from standard predictive models.

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
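
To make the two evaluation routes mentioned above concrete, here is a minimal sketch in plain Python/NumPy, outside RapidMiner. Everything in it is an assumption for illustration: the toy `vectors` dictionary stands in for embeddings exported from a trained Word2Vec model, the helper names (`nearest`, `doc_vector`, `evaluate`) are hypothetical, and the nearest-centroid classifier is only a stand-in for whatever classifier the downstream task would actually use.

```python
import numpy as np

# Hypothetical toy embeddings; in practice these would be exported
# from the trained Word2Vec model.
vectors = {
    "good":  np.array([0.9, 0.1, 0.0]),
    "great": np.array([0.8, 0.2, 0.1]),
    "bad":   np.array([-0.7, 0.3, 0.1]),
    "awful": np.array([-0.8, 0.2, 0.0]),
}
DIM = 3


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0


def nearest(word, k=1):
    """Empirical check: the k words closest to `word` should be close in meaning."""
    scores = [(other, cosine(vectors[word], vec))
              for other, vec in vectors.items() if other != word]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


def doc_vector(tokens):
    """Represent a document as the mean of its known word vectors."""
    known = [vectors[t] for t in tokens if t in vectors]
    return np.mean(known, axis=0) if known else np.zeros(DIM)


def evaluate(train, test):
    """Downstream check: accuracy of a nearest-centroid classifier
    built on averaged word vectors."""
    labels = sorted({label for _, label in train})
    centroids = {
        label: np.mean([doc_vector(toks) for toks, lab in train if lab == label], axis=0)
        for label in labels
    }
    correct = 0
    for tokens, label in test:
        vec = doc_vector(tokens)
        predicted = max(labels, key=lambda lab: cosine(vec, centroids[lab]))
        correct += int(predicted == label)
    return correct / len(test)


train_docs = [(["good", "great"], "pos"), (["bad", "awful"], "neg")]
test_docs = [(["great"], "pos"), (["awful"], "neg"), (["good", "unknown"], "pos")]

print(nearest("good"))                    # inspect neighbours by eye
print(evaluate(train_docs, test_docs))    # accuracy on the held-out documents
```

The first call corresponds to the empirical check (do the neighbours look like related words?), and the second to the downstream check, where the embeddings are judged by how well a classifier built on top of them performs on held-out labelled text.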