JSON file rotation
Raw JSON files often contain data for multiple examples in a repeated array format. However, the current "JSON to Data" operator ignores that and simply imports all fields into a single row. In effect, it discards the array structure and treats each JSON file as if it contained only a single record.
It is possible, with a lot of extra post-processing effort, to turn that into a typical dataset, with separate rows for each example and the same attribute set for all examples, using a combination of Pivot, Transpose, Generate Attributes, Split, etc. However, this transformation should really be an automatic part of the initial import process, or at least an option.
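To make the requested behavior concrete, here is a rough sketch in plain Python (standard library only) of what the import could do: one row per element of the top-level array, with the same attribute set for every row. This is not how RapidMiner works internally. The function names `flatten` and `json_to_rows`, the dotted naming of nested fields, and the sample data are all made up for illustration.

```python
import json


def flatten(value, prefix=""):
    """Flatten nested objects into one level, joining keys with dots."""
    if not isinstance(value, dict):
        # A bare value (not an object) becomes a single column.
        return {prefix or "value": value}
    flat = {}
    for key, item in value.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(item, dict):
            flat.update(flatten(item, name))
        else:
            # Scalars and nested arrays are kept as-is in this sketch.
            flat[name] = item
    return flat


def json_to_rows(text):
    """Return (attributes, rows) with one row per top-level array element."""
    data = json.loads(text)
    # A file holding a single object is treated as a one-element array.
    records = data if isinstance(data, list) else [data]
    flat_records = [flatten(record) for record in records]

    # Union of all keys, in first-seen order, so every row shares one schema.
    attributes = []
    for record in flat_records:
        for key in record:
            if key not in attributes:
                attributes.append(key)

    # Fields missing from a record become None (a missing value).
    rows = [[record.get(name) for name in attributes] for record in flat_records]
    return attributes, rows


# Hypothetical sample: two examples, the second lacking the nested field.
sample = '[{"id": 1, "name": "a", "geo": {"lat": 1.5}}, {"id": 2, "name": "b"}]'
attributes, rows = json_to_rows(sample)
```

With the sample above, `attributes` holds the three column names and `rows` holds two rows, the second with a missing value for `geo.lat`. That is the shape that currently takes a chain of Pivot, Transpose, and Split operators to reach.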
JSON files are becoming more and more popular as the returned format for API calls and web services, and it is a shame that RapidMiner handles them so poorly in its current implementation. Enhancing the Read JSON operator would go a long way toward making it more functional for working with that type of semi-structured data.