Totally different results with model depending on storage method
Hi there,
are there limitations on which models you can safely store using the 'store model' operator? I noticed huge differences between a model stored as XML and the same model stored directly in the server repository.
The model I used was the Weka MultiBayes, and originally I stored it as XML. Unfortunately, while my training / test results were more than OK, applying the model to new data resulted in almost no matches at all. When I did the same thing with the same model stored in the repo, the results were as expected. So it seems there is a huge difference between how models are read from XML and how they are read from the repo.
I also wanted to try saving it as a binary rather than XML, but it won't even load, since it seems to expect XML by default. Is there a special extension that needs to be used when saving as binary?
I can help myself for the time being by storing the model in the database, but since this is not always the best approach for really heavy models, I'd like to understand where I am going wrong with the 'store model' option.
Best Answer
sgenzer, Community Manager
Yes, I too have always had trouble storing Weka models (a decision tree in my case). Let me ask around and see if there's any progress.
Scott
Answers
While I can't comment on the XML vs. repo storage part, what I can say is that I've always had problems storing models derived from the Weka extension. I recall that @Telcontar120 may have chimed in on this topic before.
Agreed, I have had issues both in viewing results and in storing models from various Weka modeling operators before. I don't know what the root cause is, although I suspect it is simply some kind of deep software bug stemming from incompatibilities between Weka's and RapidMiner's internal configuration. This has been true for many recent versions of RapidMiner. Unfortunately I haven't ever really heard an explanation from tech support of the issue or a plan to remedy it either.
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
I think there was some discussion about updating that extension when I left RM back in July.
Fair enough. I was just wondering whether I had done something wrong, so I'm glad to have that covered :-)
OK, I received confirmation that Weka models do not read/write well. It's a known issue. You either need to deal with them on the fly or move to built-in RM models.
Scott
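For anyone who wants to check whether a stored model survived the round trip before trusting it on new data, here is a minimal, generic sketch in Python. It is not RapidMiner or Weka code: `ThresholdModel`, `to_json`, `from_json` and `agreement` are hypothetical names made up for illustration, and the toy model just stands in for a real trained one. The idea is simply to score the same rows with the in-memory model and with the reloaded copy, then compare.

```python
import json


class ThresholdModel:
    """Hypothetical toy model standing in for a real trained model."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, rows):
        # Label each value "yes" if it exceeds the threshold, otherwise "no".
        return ["yes" if value > self.threshold else "no" for value in rows]

    def to_json(self):
        # Serialize the model parameters to a text format.
        return json.dumps({"threshold": self.threshold})

    @classmethod
    def from_json(cls, text):
        # Rebuild the model from its serialized form.
        return cls(json.loads(text)["threshold"])


def agreement(original, restored, rows):
    """Return the fraction of rows on which both models predict the same label."""
    first = original.predict(rows)
    second = restored.predict(rows)
    matches = sum(a == b for a, b in zip(first, second))
    return matches / len(rows)


model = ThresholdModel(0.5)
restored = ThresholdModel.from_json(model.to_json())
sample = [0.1, 0.4, 0.6, 0.9]

# A faithful round trip should agree on every row.
print(agreement(model, restored, sample))
```

If the agreement on a held-out sample is well below 1.0, the stored copy is not the same model that was trained, which matches the symptom described in the original question.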