Can RapidMiner work like a data warehouse, or can we get a data mart?

bajhameth Member Posts: 2 Newbie
Or do we need a data warehouse for the best experience?

Answers

  • rfuentealba Moderator, RapidMiner Certified Analyst, Member, University Professor Posts: 568 Unicorn

    Let's review a few concepts here.

    Data, especially structured data, can be stored or manipulated.

    Data can be stored in repositories, and those come in several types (the classification is mine, BTW; you may find some of this in textbooks):
    • Spreadsheets, like Microsoft Excel, Apple Numbers, Gnumeric, LibreOffice Calc, etc.
    • Database Files, like dBase/Clipper files.
    • Relational Databases, normally queried with SQL or something similar: PostgreSQL, MySQL, Oracle...
    • Graph Databases, normally queried with Cypher or something similar: Neo4j.
    • Data Warehouses, large relational databases that consolidate information from other databases.
    • Data Marts, slices of a data warehouse intended to provide focused insights to a specific group.
    • Data Lakes, normally a combination of all of the above. Big Data environments usually are, or contain, a number of data lakes.
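    To make the warehouse versus data mart distinction concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, the view name, and the sample rows are all hypothetical, and a real warehouse would be far larger; the point is only that a data mart can be as simple as a filtered view over consolidated data.

```python
import sqlite3

# Hypothetical in-memory "warehouse" with one consolidated sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("EMEA", "widget", 120.0),
        ("EMEA", "gadget", 80.0),
        ("APAC", "widget", 200.0),
    ],
)

# The "data mart": a view exposing only the slice one team cares about.
conn.execute(
    "CREATE VIEW emea_sales_mart AS "
    "SELECT product, amount FROM sales WHERE region = 'EMEA'"
)

# Consumers of the mart query the view, not the full warehouse table.
mart_rows = conn.execute(
    "SELECT product, amount FROM emea_sales_mart ORDER BY product"
).fetchall()
print(mart_rows)
```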
    RapidMiner lets you manipulate data that has been stored in any of these repositories. It does store data itself, but its own storage looks more like the database files described above. RapidMiner can help you build a new data warehouse, interact with your data lake, or create a data mart for your own consumption, for example by creating ETL processes, imputing data, replacing data, or creating new features.
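    In RapidMiner these steps are normally built visually with operators, so the following is not RapidMiner code. It is only a rough plain-Python sketch of what "imputing data" and "creating new features" mean in an ETL step. The records, the field names, and the helper functions impute_age and add_spend_per_year are all made up for illustration.

```python
from statistics import mean

# Hypothetical raw records; None marks a missing value.
raw = [
    {"customer": "a", "age": 30, "spend": 100.0},
    {"customer": "b", "age": None, "spend": 250.0},
    {"customer": "c", "age": 50, "spend": 50.0},
]


def impute_age(rows):
    """Replace missing ages with the mean of the known ages."""
    known = [r["age"] for r in rows if r["age"] is not None]
    fill = mean(known)
    return [{**r, "age": fill if r["age"] is None else r["age"]} for r in rows]


def add_spend_per_year(rows):
    """Derive a new feature: spend divided by age."""
    return [{**r, "spend_per_year": r["spend"] / r["age"]} for r in rows]


# Transform step of the ETL: impute first, then derive the new feature.
clean = add_spend_per_year(impute_age(raw))
for row in clean:
    print(row)
```

    Mean imputation is only one simple choice here; the right strategy depends on the data.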

    You don't need an expensive data warehouse to take advantage of RapidMiner; you can work with whatever you have. RapidMiner is not in the business of keeping stored data, but of providing a large number of useful data science methods to analyze that data, so you shouldn't be afraid of using it.

    All the best,

    Rodrigo.


  • SGolbert RapidMiner Certified Analyst, Member Posts: 344 Unicorn
    The short answer is no. RM is a tool for machine learning with good ETL and data exploration capabilities. It is not a database or a data warehouse.

    Regards,
    Sebastian