Finding the right data is crucial – Help us improve new Data Discovery Techniques
We are currently working on new data integration and enrichment operators to aid your data mining journey, and we are running a case study to test our latest findings. In this study you receive a short introduction and some guiding material to help you test several new RapidMiner operators. That’s it! Just test the operators in your current environment and tell us your findings and ideas.
Objectives to test
- Data Enrichment via Data Search Extension
- PDF Table Extraction
- Google Spreadsheet Extraction
- Web Table Extraction
If you are interested, contact us via [email protected] and we will get in touch with you. Participants are asked to hand in their observations within four weeks.
Finding the right data is a crucial step in every data mining project. Data is often distributed across different places, and obtaining it can be difficult. Hence, integrating various formats and sources is key. In the research project ‘Data Search for Data Mining’ we are investigating new ways of aiding this process by making data from previously unavailable sources easily accessible within RapidMiner. Possible sources include tables stored in PDFs or Google Spreadsheets.
And what about the data sets you already have access to but are unaware of? For those, we are working together with the University of Mannheim to enrich data sets with existing data from internet and intranet sources in a (semi-)automatic way.
Philipp for the RapidMiner Research Team