My problem is that whenever I try to apply the learned Naive Bayes model, the original data from export.csv ends up with corrupted records. I have linked a picture of the problem, showing the specific blocks and results. It is always the same column that goes bad (UserReplyTime, which is used as the label), and the question marks and bad numbers always appear at the same positions. I am using windows-1250 encoding. Thank you for your time.
Update: Narrowing down the problem: if I manually copy the ReadCSV "export.csv" object and connect the copy, instead of the original's Multiply, to the "unl" node, the original export.csv "goes back to normal". However, anything I connect to that node shows the same problem, with even more "?" values in that data, so I am basically still in the same place.
Update 2: The same problem happens if I use, for example, the k-NN model. The same column has bad values, but in this case the question marks disappear; instead, some of the 0s change to seemingly random numbers such as 3 or 4, and some larger numbers, such as 303933, change to 0. Please, somebody help me with this, I am really stuck.
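One way to pin down exactly which rows change is to export the table before and after the model is applied and compare the label column outside RapidMiner. The following is only a rough Python sketch of that check. The helper names `read_rows` and `diff_column` are made up for illustration, the semicolon delimiter is an assumption about how the export is written, and the in-memory values are illustrative rather than taken from the real data.

```python
import csv

def read_rows(path, encoding="cp1250", delimiter=";"):
    """Read a CSV export into a list of dicts, one per row.
    cp1250 is Python's name for windows-1250; the delimiter is an assumption."""
    with open(path, newline="", encoding=encoding) as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))

def diff_column(before, after, column):
    """Return (row_number, old_value, new_value) for every row whose value changed."""
    changes = []
    for row_number, (old, new) in enumerate(zip(before, after), start=1):
        if old[column] != new[column]:
            changes.append((row_number, old[column], new[column]))
    return changes

# Tiny in-memory example with illustrative values (not the real export)
before = [{"UserReplyTime": "0"}, {"UserReplyTime": "303933"}, {"UserReplyTime": "12"}]
after = [{"UserReplyTime": "3"}, {"UserReplyTime": "0"}, {"UserReplyTime": "12"}]
print(diff_column(before, after, "UserReplyTime"))
```

With real files, the two lists would come from `read_rows` called on the two exports. If the changed rows line up with particular label values, that would suggest the values are being remapped rather than the file being misread.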
As far as I can see from your pictures, you take one sample of the data to train the model and a different sample to apply it. You have no way of knowing what your algorithm has learned, so applying such a model may give results like these. I would strongly suggest using a validation operator together with a performance operator to see whether you are on the right track. You could also post your process as XML next time.
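To illustrate what validation plus a performance measure amounts to, here is a minimal sketch in plain Python rather than RapidMiner: train on one part of the data, then measure accuracy on a part the model never saw. The function names and the toy data are invented for this example, and the tiny Naive Bayes uses add-one smoothing as a simplification, so it is not a reproduction of RapidMiner's operators.

```python
import math
import random
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count label frequencies and per-attribute value frequencies for each label."""
    label_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, label) -> value counts
    for row, label in zip(rows, labels):
        for index, value in enumerate(row):
            value_counts[(index, label)][value] += 1
    return label_counts, value_counts

def predict(model, row):
    """Pick the label with the highest log-probability (add-one smoothing)."""
    label_counts, value_counts = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)
        for index, value in enumerate(row):
            seen = value_counts[(index, label)]
            # The +1 terms keep unseen values from producing log(0)
            score += math.log((seen[value] + 1) / (count + len(seen) + 1))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def holdout_accuracy(rows, labels, test_fraction=0.3, seed=0):
    """Shuffle once, train on one part, and score on the held-out part."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * (1 - test_fraction))
    train, test = indices[:cut], indices[cut:]
    model = train_naive_bayes([rows[i] for i in train], [labels[i] for i in train])
    hits = sum(predict(model, rows[i]) == labels[i] for i in test)
    return hits / len(test)

# Toy data (made up): the label is fully determined by the attributes
rows = [("a", "x"), ("b", "y")] * 10
labels = ["fast", "slow"] * 10
accuracy = holdout_accuracy(rows, labels)
print(f"hold-out accuracy: {accuracy:.2f}")
```

If the accuracy on held-out data is no better than always guessing the most frequent label, the model has not learned anything useful, and its predictions on new data should not be trusted.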
Thank you for your reply! I did as you said and now use a Validation process. However, the same problem persists: if I connect the store to the Validation process, my original data appears corrupted at the "Eredeti Tábla" Multiply operator. It is odd that only the label attribute gets corrupted. (Inside the Validation I use Naive Bayes, without Laplace correction.)