
The Sudden Interest in Data Science Platforms

Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
edited November 2018 in Knowledge Base

I've been at this startup thing for a few years now and I've seen a thing or two. If you read KDnuggets, you'll stumble across the Gartner Hype Cycle. Right now Big Data is entering the trough of disillusionment. While that sounds sad, it kinda makes sense.

For years we've been hearing how Big Data will unlock all kinds of insights in a corporation's data. Everyone raced to stand up clusters, jammed all kinds of data into them, and then stumbled when it came to extracting insight. The cluster became hard to tame, hard to use, and seemed like a big waste of money.

Of course, RapidMiner Radoop came along and actually delivered on this promise, but many companies decided to use a single tool to extract their insight. Maybe it was PySpark or Pig scripts? Maybe something else completely. They married themselves to one or two ways of getting insight.

Now many companies are realizing they're not just an R shop, they're an R, Python, and Spark shop. Now they need to use all three or more tools in the Data Science toolkit to get anything done. Now they're looking around for a platform to bring all these tools together.

Imagine their surprise when they find RapidMiner. We've been a Data Science platform from day 1. Ninety percent of the time you can do all your data science and model building right in the Studio platform. The rest of the time you might need some esoteric algorithm to finish your work. So, if you married yourself to one tool and that esoteric algorithm wasn't available, you were SOL.

With RapidMiner it's always been different. Need that Tweedie algorithm in R? Use the R Scripting extension and pull it in. Need to do some PySpark on your cluster? Put that script right inside Radoop's Spark Script operator.

It's that easy. After all, isn't that what a real Data Science platform is supposed to do?
