Totally different results with model depending on storage method

kayman Member Posts: 368   Unicorn
edited December 2018 in Help

Hi there,

Are there limitations on which models you can safely store using the 'store model' operator? I noticed huge differences between a model stored as XML and the same model stored directly in the server repository.

The model I used was the Weka MultiBayes, and originally I stored it as XML. Unfortunately, while my training and test results were more than OK, applying the model to new data produced almost no matches at all. When I did the same thing with the same model stored in the repo, the results were as expected. So it seems there is a huge difference between how models are read from XML and how they are read from the repo.

I also wanted to try saving it as a binary rather than XML, but it won't even load, since it seems to expect XML by default. Is there a special extension that needs to be used when saving as binary?

I can help myself for the time being by storing the model in the database, but since this is not always the best approach for really heavy models, I'd like to understand where I'm going wrong with the 'store model' option.

Best Answer

  • sgenzer Posts: 2,446  Community Manager
    Solution Accepted

    Yes, I too have always had trouble storing Weka models (a decision tree in my case). Let me ask around and see if there's progress.

     

    Scott

     

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761   Unicorn

    While I can't comment on the XML vs. repo storage part, what I can say is that I've always had problems storing models derived from the Weka extension. I seem to remember that @Telcontar120 might have chimed in on this topic before.

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,235   Unicorn

    Agreed, I have had issues both in viewing results and in storing models from various Weka modeling operators before.  I don't know what the root cause is, although I suspect it is simply some kind of deep software bug stemming from incompatibilities between Weka's and RapidMiner's internal configuration.  This has been true for many recent versions of RapidMiner.  Unfortunately I haven't ever really heard an explanation from tech support of the issue or a plan to remedy it either.

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761   Unicorn

    I think there was some discussion about updating that extension when I left RM back in July. 

  • kayman Member Posts: 368   Unicorn

    Fair enough, I was just wondering if I did something wrong so glad to have that covered :-)

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,446  Community Manager

    OK, I received confirmation that Weka models do not read/write well. It's a known issue. You either need to deal with them on the fly or move to built-in RM models.

     

    Scott
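
For anyone who wants to check whether a particular storage path is the culprit, one option is a round-trip comparison: score the same rows with the in-memory model and with the reloaded copy, then measure how often the predictions agree. The sketch below is plain Python, not RapidMiner or Weka code. `ThresholdModel`, `to_json`, `from_json`, and `lossy_to_json` are hypothetical stand-ins that only illustrate the idea; a real check would swap in whatever save and load steps are being tested.

```python
import json


class ThresholdModel:
    """Hypothetical stand-in for a trained model (not a Weka or RapidMiner class)."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, rows):
        # Label each numeric row "yes" if it reaches the threshold, else "no".
        return ["yes" if r >= self.threshold else "no" for r in rows]


def to_json(model):
    """Faithful serializer: keeps the threshold exactly."""
    return json.dumps({"threshold": model.threshold})


def lossy_to_json(model):
    """Deliberately broken serializer: truncates the threshold to an integer."""
    return json.dumps({"threshold": int(model.threshold)})


def from_json(text):
    """Rebuild a model from its JSON form."""
    return ThresholdModel(json.loads(text)["threshold"])


def agreement(a, b):
    """Fraction of positions at which two prediction lists agree."""
    if len(a) != len(b):
        raise ValueError("prediction lists differ in length")
    if not a:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def round_trip_agreement(model, rows, save, load):
    """Compare predictions before and after a save/load round trip."""
    before = model.predict(rows)
    restored = load(save(model))
    after = restored.predict(rows)
    return agreement(before, after)


if __name__ == "__main__":
    model = ThresholdModel(0.5)
    rows = [0.1, 0.4, 0.5, 0.9]
    print(round_trip_agreement(model, rows, to_json, from_json))        # 1.0
    print(round_trip_agreement(model, rows, lossy_to_json, from_json))  # 0.5
```

An agreement well below 1.0 on rows the model has already scored suggests the storage step, rather than the new data, is changing the model.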

     
