Working with SPSS & RapidMiner

JEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 578 Unicorn
edited November 2018 in Help
Hi all,

I was wondering if it is possible to work with a combination of SPSS & RapidMiner.
Is it possible to create a model in SPSS, export it as PMML (or another format), and then open this model in RapidMiner to be worked on, or vice versa?

It would certainly save time by not having to rebuild historically created models from scratch in RM, and could also enable colleagues using the different systems to collaborate.

Answers

  • earmijo Member Posts: 270 Unicorn
    I may be wrong, but as far as I know RapidMiner in its current state is not a scoring engine. It can write PMML but it cannot interpret it. You might want to take a look at Augustus (open source, at http://code.google.com/p/augustus/) or Adapa (commercial, at http://adapasupport.zementis.com/). Hope this helps,

    Ernesto
  • wessel Member Posts: 537 Maven
    I did not know the meaning of PMML.
    Here is what the wiki says:

    ----------
    The Predictive Model Markup Language (PMML) is an XML-based markup language developed by the Data Mining Group (DMG) to provide a way for applications to define models related to predictive analytics and data mining and to share those models between PMML-compliant applications.

    PMML provides applications a vendor-independent method of defining models so that proprietary issues and incompatibilities are no longer a barrier to the exchange of models between applications. It allows users to develop models within one vendor's application and use other vendors' applications to visualize, analyze, evaluate or otherwise use the models. Previously, this was very difficult, but with PMML, the exchange of models between compliant applications is straightforward.

    Since PMML is an XML-based standard, the specification comes in the form of an XML schema.
    ----------

    Sounds cool, but I'm not sure I can think of a project where I would use PMML.
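
To make the quoted description concrete, here is a minimal sketch of what "XML-based" means in practice. The toy document below is a hand-written assumption, not output from SPSS or RapidMiner; the field names `x` and `y`, the coefficients, and the PMML 4.4 namespace are chosen only for illustration. The helper functions `field_names` and `score_regression` are hypothetical and handle just this one simple regression layout, which is nowhere near what a real scoring engine such as ADAPA covers.

```python
import xml.etree.ElementTree as ET

# Hand-written toy PMML document (an assumption for illustration only).
PMML_TEXT = """
<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header description="toy linear regression"/>
  <DataDictionary numberOfFields="2">
    <DataField name="x" optype="continuous" dataType="double"/>
    <DataField name="y" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression" modelName="toy">
    <MiningSchema>
      <MiningField name="x"/>
      <MiningField name="y" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="x" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

# Namespace prefix mapping used for the element lookups below.
NS = {"p": "http://www.dmg.org/PMML-4_4"}


def field_names(pmml_text):
    """Return the field names declared in the DataDictionary."""
    root = ET.fromstring(pmml_text.strip())
    return [f.get("name") for f in root.findall("p:DataDictionary/p:DataField", NS)]


def score_regression(pmml_text, row):
    """Score one row against the first RegressionTable.

    Only the intercept and plain NumericPredictor terms are handled;
    exponents, categorical predictors and transformations are ignored.
    """
    root = ET.fromstring(pmml_text.strip())
    table = root.find("p:RegressionModel/p:RegressionTable", NS)
    result = float(table.get("intercept"))
    for pred in table.findall("p:NumericPredictor", NS):
        result += float(pred.get("coefficient")) * row[pred.get("name")]
    return result


if __name__ == "__main__":
    print(field_names(PMML_TEXT))
    print(score_regression(PMML_TEXT, {"x": 3.0}))  # 1.5 + 2.0 * 3.0
```

Because the model is plain XML, any tool that understands the schema can read the same file, which is the portability the wiki text describes.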
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    unfortunately we are currently not able to import PMML models. Actually, I don't think it makes a lot of sense in the current state, because most of the effort always goes into preprocessing the data. At least in my experience, it is almost never the case that you can apply a PMML-supported model directly to the data.
    If you want to use RapidMiner in this fashion anyway, you might contact us to find out whether we could make it possible.

    Greetings,
    Sebastian
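
The preprocessing point can be sketched in a few lines. This is an illustrative assumption, not RapidMiner code: the `ZScore` and `LinearModel` classes and all numbers are made up. It only shows why a model trained on transformed data gives a meaningless score when it is applied to raw values without the same transformation.

```python
from dataclasses import dataclass


@dataclass
class ZScore:
    """Hypothetical preprocessing step fitted on the training data."""
    mean: float
    std: float

    def apply(self, x: float) -> float:
        return (x - self.mean) / self.std


@dataclass
class LinearModel:
    """Hypothetical model trained on z-scored input."""
    intercept: float
    slope: float

    def predict(self, z: float) -> float:
        return self.intercept + self.slope * z


def score(raw: float, prep: ZScore, model: LinearModel) -> float:
    """Apply the preprocessing first, then the model."""
    return model.predict(prep.apply(raw))


prep = ZScore(mean=50.0, std=10.0)
model = LinearModel(intercept=1.0, slope=2.0)

if __name__ == "__main__":
    raw_value = 70.0
    print("with preprocessing:   ", score(raw_value, prep, model))
    print("without preprocessing:", model.predict(raw_value))
```

Exporting only the model part leaves the receiving tool to reproduce `prep` on its own, which is where most of the work tends to sit.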
  • JEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 578 Unicorn
    Thanks guys, it took me on a bit of a tangent that I didn't expect, but great info from all. :D
    I think I'll definitely look into Adapa for scoring some more. Their cloud-based model sounds appealing to me.

    One question: although RapidMiner isn't a scoring engine, would I be correct in saying that RapidAnalytics is suited for scoring using models created with RapidMiner?

    Sebastian, I'll also send you an email regarding what we can 'make possible'. 
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    well, RapidMiner is not a scoring engine in the sense of "Load in arbitrary PMML models and score our database", but of course you can use RapidMiner - as well as RapidAnalytics - to score your database with models natively created by RapidMiner (or Weka or R). So if you already have your models (in PMML), then going for ADAPA or other engines which can only be used for scoring might indeed be the simplest solution.

    But things change of course if you also want to create such models or apply data transformations as well, as Sebastian has pointed out. In that case the models and preprocessing models can best be applied directly from RapidMiner or RapidAnalytics, since we support many more preprocessing steps than any other solution available.

    So it really depends on what you have already and what your data analysis looks like. But in simple scenarios with little data transformation and ETL, where you only create a model and apply it to large data sets, a combination of RapidMiner / RapidAnalytics with a dedicated scoring engine like ADAPA might indeed be the best choice. From my experience the world is unfortunately not so simple in many cases ;)

    Just my 2c. Cheers,
    Ingo