Nowadays many corporate Excel files live online. This seems quite natural when you consider the wide range of benefits: ease of sharing and collaboration, access from any PC with an Internet connection, user rights management, and constant backups, just to name a few. Wouldn't it be great to include data from those files directly in your Data Mining process, without even downloading the whole file? Now you can! Introducing the new Read Excel Online operator. It is part of the latest update of the Spreadsheet Table Extraction extension (download link).
Companies and organizations often store and share information via Microsoft SharePoint sites. They are a great way of collecting and sharing information around a given topic. Many sites therefore contain lots of Office documents and files in other formats. Integrating this information into a Data Mining process often involves manually searching through sites and folders as well as downloading files by hand. This is neither fast nor simple. We therefore created the SharePoint Connector extension to speed things up. You can download it through the RapidMiner Marketplace. It consists of the List SharePoint Files operator, which creates a list of all available files and folders, and the Download from SharePoint operator, which downloads the files of interest.
We are currently working on some new data integration and enrichment operators to aid your data mining journey, and we are running a case study to test our latest findings. In this study you get a short introduction and some guiding material to help you test some new RapidMiner operators. That's it! Just test the operators in your current environment and tell us your findings and ideas.
One of the cool things about RapidMiner is the extension ecosystem. The default installation of RapidMiner Studio comes with a complete suite that covers 90% of the ETL, modeling, and testing you need to do on a daily basis. Sometimes you'll need that extra 10% to do something special, like Text Mining!
A few weeks ago the RapidMiner Research Team published two new extensions to the Marketplace that are making a splash: the Operator Toolbox and Converters! We didn't stop there! Today I'm happy to announce the release of version 0.2.0 of both extensions! Here's a quick preview of the new enhancements you'll find!
As RapidMiner users we are used to one-operator solutions. Want to add a PCA? Add the operator. Want to do an ensemble? Add the operator. Over time the RapidMiner ecosystem has evolved so that most tasks are easy to handle like this. However, doing data science every day, I have run into a few tasks for which RapidMiner has no one-operator solution. How do we solve that?