Using a different algorithm for modeling than the one nested in the GA for feature selection

Iatii Member Posts: 8 Contributor I
edited September 3 in Help

Answers

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,107 RM Data Scientist

    Hi,

    can you maybe post your process?

    ~Martin

    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • Iatii Member Posts: 8 Contributor I

    Sure

     

    image

     

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,107 RM Data Scientist

    Dear Iatii,

    To be honest, your process looks a bit odd in general. Inside the GA you apply the k-NN to the same data it was trained on, which leads to overtraining. You also did a split validation by hand without taking the GA into account, which again leads to overtraining. I would suggest putting a cross validation around everything, so that the feature selection and the final model are both evaluated on data they have not seen. It is also a bit strange to use a Naive Bayes to generate features and a k-NN for classification, but if it works, it works.
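
    To illustrate what "a cross validation around everything" means, here is a minimal sketch in plain Python rather than a RapidMiner process. Everything in it is an assumption made for illustration: the synthetic data, the helper names (`knn_predict`, `select_features`, `nested_cv`), and above all the exhaustive subset search, which only stands in for the GA. The point it tries to show is the structure: feature selection runs inside each outer training fold, and the outer test fold is used once, for scoring only.

```python
import itertools
import random

def knn_predict(train_X, train_y, x, cols, k=3):
    """Majority vote among the k nearest training rows, using only columns `cols`."""
    nearest = sorted(
        (sum((row[c] - x[c]) ** 2 for c in cols), label)
        for row, label in zip(train_X, train_y)
    )[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)

def accuracy(train_X, train_y, test_X, test_y, cols):
    """Fraction of test rows classified correctly by k-NN fitted on the train rows."""
    hits = sum(knn_predict(train_X, train_y, x, cols) == y
               for x, y in zip(test_X, test_y))
    return hits / len(test_y)

def folds(n, n_folds):
    """Yield (train_idx, test_idx) pairs for a simple interleaved k-fold split."""
    for f in range(n_folds):
        train = [i for i in range(n) if i % n_folds != f]
        test = [i for i in range(n) if i % n_folds == f]
        yield train, test

def take(rows, idx):
    return [rows[i] for i in idx]

def cv_accuracy(X, y, cols, n_folds=3):
    """Inner cross validation: the fitness of one candidate feature subset."""
    scores = [accuracy(take(X, tr), take(y, tr), take(X, te), take(y, te), cols)
              for tr, te in folds(len(X), n_folds)]
    return sum(scores) / len(scores)

def select_features(X, y):
    """Wrapper selection. Exhaustive search stands in for the GA (assumption)."""
    n_features = len(X[0])
    subsets = [cols for r in range(1, n_features + 1)
               for cols in itertools.combinations(range(n_features), r)]
    return max(subsets, key=lambda cols: cv_accuracy(X, y, cols))

def nested_cv(X, y, outer_folds=5):
    """Outer cross validation wrapped around selection plus modeling."""
    scores = []
    for tr, te in folds(len(X), outer_folds):
        X_tr, y_tr = take(X, tr), take(y, tr)
        cols = select_features(X_tr, y_tr)  # sees the outer training fold only
        scores.append(accuracy(X_tr, y_tr, take(X, te), take(y, te), cols))
    return sum(scores) / len(scores)

# Synthetic data (assumption): column 0 separates the classes, columns 1-3 are noise.
random.seed(0)
y = [i % 2 for i in range(60)]
X = [[random.gauss(3.0 * label, 0.3)] + [random.random() for _ in range(3)]
     for label in y]

score = nested_cv(X, y)
print(f"nested CV accuracy: {score:.2f}")
```

    If the subsets were instead scored on the same rows the k-NN was fitted on, or if the final hold-out split had also been used during selection, the reported accuracy would tend to be too optimistic, which is the overtraining described above.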

     

    ~Martin

    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • Iatii Member Posts: 8 Contributor I
    Thanks for your guidance.

    If I delete the whole k-NN part inside the GA and move the modeling process into the GA instead (removing the splitting operator and using a split validation), would that be correct?

    As for the model we discussed: you are saying that if my model works, then it is fine. In that case, is the feature selection done using k-NN or NB? I am a little confused. I am a new user of RM, so sorry for these questions.