RapidMiner

Totally different results with model depending on storage method

SOLVED
Maven

Hi there,

Are there limitations on which models you can safely store using the 'store model' operator? I noticed huge differences between a model stored as XML and the same model stored directly in the server repository.

The model I used was the Weka MultiBayes, and originally I stored it as XML. Unfortunately, while my training / test results were more than OK, applying the model to new data resulted in almost no matches at all. When I did the same thing with the same model stored in the repository, the results were as expected. So there seems to be a huge difference between how models are read from XML and how they are read from the repository.

I also wanted to try saving it as binary rather than XML, but the model won't even load, since the reader seems to expect XML by default. Is there a special extension that needs to be used when saving as binary?

I can get by for the time being by storing the model in the database, but since this is not always the best approach for really heavy models, I'd like to understand where I'm going wrong with the 'store model' option.

6 REPLIES
RM Certified Expert

Re: Totally different results with model depending on storage method

While I can't comment on the XML vs. repository storage part, what I can say is that I've always had problems storing models derived from the Weka extension. I recall that @Telcontar120 may have chimed in on this topic before.

RM Certified Expert

Re: Totally different results with model depending on storage method

Agreed, I have had issues both in viewing results and in storing models from various Weka modeling operators before. I don't know what the root cause is, although I suspect it is some kind of deep software bug stemming from incompatibilities between Weka's and RapidMiner's internal configuration. This has been true for many recent versions of RapidMiner. Unfortunately, I have never heard an explanation of the issue from tech support, or a plan to remedy it.

Brian T., Lindon Ventures - www.lindonventures.com
Analytics Consulting by Certified RapidMiner Analysts
RM Certified Expert

Re: Totally different results with model depending on storage method

I think there was some discussion about updating that extension when I left RM back in July. 

Community Manager
Solution

Re: Totally different results with model depending on storage method

Yes, I too have always had trouble storing Weka models (a decision tree in my case). Let me ask around and see if there's progress.

Scott

Scott Genzer
Senior Community Manager
RapidMiner, Inc.
Maven

Re: Totally different results with model depending on storage method

Fair enough. I was just wondering whether I did something wrong, so I'm glad to have that covered :-)

Community Manager

Re: Totally different results with model depending on storage method

OK, I received confirmation that Weka models do not read/write well. It's a known issue. You either need to deal with them on the fly (train and apply in the same process, without storing) or move to built-in RM models.
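
For anyone who wants to check whether a given model type survives storage, one quick sanity test is to compare predictions from the in-memory model with those from a stored-and-reloaded copy on the same rows. Below is a minimal sketch in Python. It is not RapidMiner or Weka code: `ThresholdModel`, `to_text`, `from_text`, and `predictions_match` are hypothetical names made up purely for illustration, and the JSON text stands in for whatever storage format is being tested.

```python
import json


class ThresholdModel:
    """Toy stand-in for a trained model (hypothetical, not a RapidMiner/Weka class)."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, rows):
        # Label each numeric value by comparing it with the learned threshold.
        return ["yes" if value >= self.threshold else "no" for value in rows]

    def to_text(self):
        # Serialize the learned parameters (stands in for "store model").
        return json.dumps({"threshold": self.threshold})

    @classmethod
    def from_text(cls, text):
        # Rebuild the model from stored text (stands in for "retrieve model").
        return cls(json.loads(text)["threshold"])


def predictions_match(original, restored, rows):
    """Return True when both models label every row identically."""
    return original.predict(rows) == restored.predict(rows)


model = ThresholdModel(threshold=0.5)
restored = ThresholdModel.from_text(model.to_text())
sample = [0.1, 0.5, 0.9]
round_trip_ok = predictions_match(model, restored, sample)
```

If `round_trip_ok` comes out false on data the in-memory model handles correctly, the storage step is the likely culprit rather than the training setup.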

 

Scott