"Word Vectors"

Legacy User Member Posts: 0 Newbie
edited June 2019 in Help
I guess I am using the text plugin to create word vectors. I have the tutorial by Michael Wurst called "The Word Vector Tool and the RapidMiner Text Plugin". I can output a file that looks like the following:
Text                                                                                                  Trailer  Lights
TRAILER MARKER LIGHTS WILL NOT WORK WHEN TRUCK LIGHTS ARE ONDEFECTIVE TRAILER CORD  R&R TRAILER CORD        3       1
CHECK TRAILER PLUGS FOR PROPER WIRING  TRACE PROBLE TO LOCK ON PLUG CONNECTOR AT BACK OF SLEEPER            1       0
CONDITION:NO POWER TO TRAILER LIGHTS  CAUSE:CHECKED TRAILER CORD (OK) TRACED WIRING FROM BACK OF            2       1

My question: from what I understand, a true word vector should also have the words in the first column, so that words that occur next to each other are counted as well. Is that right?

An example based on the comment lines above would be:

           Trailer  Lights  Cord  Plug  Sleeper
Trailer          6       1     2     0        0
Lights           6       3     0     0        0
Cord             3       0     3     0        0
Plug             1       0     0     2        0
Sleeper          0       0     0     0        1


Has anybody done this?
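
A word-by-word table like the one above is usually called a co-occurrence matrix. The following is only a minimal Python sketch of one way to build such a matrix outside RapidMiner. It is not the Text Plugin's API: the function names, the `window` parameter, and the choice to count two words as "next to each other" when they fall within `window` tokens are all assumptions made for illustration. No stemming is applied, so "PLUGS" and "PLUG" are treated as different words, and the numbers it prints are not expected to match the hand-written example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cooccurrence(texts, vocabulary, window=2):
    """Count how often two vocabulary words occur within `window` tokens
    of each other. The result is symmetric: (a, b) and (b, a) are equal."""
    vocab = set(vocabulary)
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        for i, left in enumerate(tokens):
            if left not in vocab:
                continue
            # Look only forward, then mirror the pair to keep symmetry.
            for right in tokens[i + 1 : i + 1 + window]:
                if right in vocab:
                    counts[(left, right)] += 1
                    if left != right:
                        counts[(right, left)] += 1
    return counts


# The three comment lines from the post, copied as they appear.
texts = [
    "TRAILER MARKER LIGHTS WILL NOT WORK WHEN TRUCK LIGHTS ARE ONDEFECTIVE TRAILER CORD  R&R TRAILER CORD",
    "CHECK TRAILER PLUGS FOR PROPER WIRING  TRACE PROBLE TO LOCK ON PLUG CONNECTOR AT BACK OF SLEEPER",
    "CONDITION:NO POWER TO TRAILER LIGHTS  CAUSE:CHECKED TRAILER CORD (OK) TRACED WIRING FROM BACK OF",
]
vocabulary = ["trailer", "lights", "cord", "plug", "sleeper"]

counts = cooccurrence(texts, vocabulary, window=2)

# Print the matrix with one row and one column per vocabulary word.
print(" " * 10 + "".join(f"{word:>10}" for word in vocabulary))
for row in vocabulary:
    print(f"{row:<10}" + "".join(f"{counts[(row, col)]:>10}" for col in vocabulary))
```

A larger `window` counts words that are further apart; `window=1` counts only directly adjacent words.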

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    I'm not sure I understood what you are trying to explain. But I'm quite sure that a word vector is a vector and hence provides only one number per word. What you showed as an example looks more like a matrix.

    In principle, each word of a text is counted within the word vector. But some words might be too infrequent and are discarded entirely because they occur in only one text.

    Greetings,
      Sebastian
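
To illustrate the point about one number per word and the pruning of infrequent words, here is a small Python sketch. It is an assumption-laden illustration, not how the RapidMiner Text Plugin is implemented: the function names and the `min_docs` threshold are hypothetical, and the three short texts are made up for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def word_vectors(texts, min_docs=2):
    """Return (vocabulary, vectors) with one count per kept word for each text.

    Words that occur in fewer than `min_docs` texts are discarded
    (`min_docs` is a hypothetical pruning threshold for this sketch).
    """
    token_lists = [tokenize(text) for text in texts]
    # Document frequency: in how many texts does each word occur?
    doc_freq = Counter(word for tokens in token_lists for word in set(tokens))
    vocabulary = sorted(word for word, n in doc_freq.items() if n >= min_docs)
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        # One number per vocabulary word: its count in this text.
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


# Made-up example texts (not taken from real data).
texts = [
    "trailer lights trailer cord",
    "trailer plug",
    "lights cord sleeper",
]
vocabulary, vectors = word_vectors(texts, min_docs=2)
print(vocabulary)
for vector in vectors:
    print(vector)
```

Each text gets a single row of counts, so all texts together form a text-by-word table, while a word-by-word table is a different structure. Here "plug" and "sleeper" are dropped because each occurs in only one text.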