SQL pushback

Preko Member Posts: 21 Contributor II
edited June 2019 in Help
Hi,

At RCOMM, I have heard (or maybe just wanted to hear) that SQL pushback will be included in a future version of RapidMiner.
It was mentioned that RapidMiner will be able to use Ingres VectorWise's analytical features to speed up the analysis process. I guess a standard way to do that is to generate an SQL query from the operator flow and run it in the database engine. I would like to know how and when it will be implemented, because we are considering a similar feature in our extension, and I would like to avoid duplicating work.
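
To make the idea concrete, here is a minimal sketch of generating one SQL query from a linear operator flow and running it in the database. Everything in it is an assumption for illustration: the operator names (`read_table`, `filter`, `aggregate`), the `flow_to_sql` helper, and the `sales` table are hypothetical and are not RapidMiner's operators or API. SQLite stands in for the database engine.

```python
import sqlite3

# Hypothetical operator flow: each step is (operator name, parameters).
# These names are illustrative only, not real RapidMiner operators.
flow = [
    ("read_table", {"table": "sales"}),
    ("filter", {"condition": "amount > 100"}),
    ("aggregate", {"group_by": "region", "expr": "SUM(amount) AS total"}),
]


def flow_to_sql(flow):
    """Fold a linear operator flow into a single SELECT statement.

    Assumes filters come before the aggregation; a filter placed after
    it would need a HAVING clause, which this sketch does not handle.
    """
    table, where, group_by, columns = None, [], None, "*"
    for name, params in flow:
        if name == "read_table":
            table = params["table"]
        elif name == "filter":
            where.append(params["condition"])
        elif name == "aggregate":
            group_by = params["group_by"]
            columns = f"{group_by}, {params['expr']}"
        else:
            # Operators without an SQL equivalent cannot be pushed back.
            raise ValueError(f"operator {name!r} cannot be pushed back")
    sql = f"SELECT {columns} FROM {table}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    if group_by:
        sql += f" GROUP BY {group_by}"
    return sql


# Run the generated query inside the database instead of loading the rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 150.0), ("north", 50.0), ("south", 200.0), ("south", 300.0)],
)
sql = flow_to_sql(flow)
rows = sorted(conn.execute(sql).fetchall())
print(sql)
print(rows)
```

Only the aggregated rows leave the database, which is where the speed-up would come from on a large table.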

Thanks, Zoltan

Answers

  • jackofalltrades Member Posts: 2 Contributor I
    Is this similar to SPSS Modeler SQL Pushback:
    SQL Pushback

    Where an IBM® SPSS® Modeler stream reads data from a SQL database and performs processing on the data, advanced users can improve the efficiency of this operation by pushing back the SQL instructions to execute in the database itself.

    Several standard SPSS Modeler nodes support SQL pushback, and the server-side API includes function calls to make this possible for CLEF nodes as well.

    The clemext_peer_getSQLGeneration service function generates SQL from a peer instance and is used to push back SQL execution to the database. For a data reader node, the generated SQL must be sufficient on its own to create the peer result set. For any other type of node, the generated SQL will most likely depend on the SQL generated for upstream nodes that provide input to the peer. A peer can obtain the upstream SQL by calling the clemext_node_getSQLGeneration callback function on its associated node handle.
    (http://pic.dhe.ibm.com/infocenter/spssmodl/v15r0m0/index.jsp?topic=%2Fcom.ibm.spss.modeler.help%2Fclef_prog_ssapi_features_sqlpush.htm)
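
The composition described in the quoted passage, where a node's SQL depends on the SQL of its upstream node, might be sketched as follows. This is not the CLEF API: the `Node`, `TableReader` and `Filter` classes and the `get_sql` method are hypothetical stand-ins for the peer and node callbacks, and SQLite stands in for the database.

```python
import sqlite3


class Node:
    """Hypothetical stream node; `upstream` is the node feeding it."""

    def __init__(self, upstream=None):
        self.upstream = upstream

    def get_sql(self):
        raise NotImplementedError


class TableReader(Node):
    """Data reader: its SQL is sufficient on its own."""

    def __init__(self, table):
        super().__init__()
        self.table = table

    def get_sql(self):
        return f"SELECT * FROM {self.table}"


class Filter(Node):
    """Downstream node: wraps the upstream SQL as a subquery."""

    def __init__(self, upstream, condition):
        super().__init__(upstream)
        self.condition = condition

    def get_sql(self):
        inner = self.upstream.get_sql()
        return f"SELECT * FROM ({inner}) AS t WHERE {self.condition}"


# Build a small stream: reader -> filter -> filter.
stream = Filter(
    Filter(TableReader("customers"), "age >= 18"),
    "country = 'HU'",
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, age INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("Anna", 30, "HU"), ("Bela", 15, "HU"), ("Carl", 40, "DE")],
)
result = conn.execute(stream.get_sql()).fetchall()
print(stream.get_sql())
print(result)
```

Nesting subqueries keeps each node independent of the others; flattening the result into a single query is left to the database's optimizer.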

    This seems similar to RapidAnalyzer's in-database mining:
    In-database-mining: Instead of taking the data to the algorithm, this extension supports taking the algorithms to the data. Thus the execution of analyses, and in particular of a scoring, is directly supported within databases. Until now, such a solution has only been available from individual database providers such as Oracle and IBM DB2 and on a very limited basis. Rapid-I now offers this solution for numerous analysis procedures and database-wide.
    (RapidAnalyzer Factsheet)

    Am I completely off-the-mark with the analogy?