Why not edit metadata directly instead of requiring run-time postprocessing

crunch · December 2012

1) It is VERY frustrating to be forced to manually enter attribute labels when they are so often contained in the file itself or in the first row of a spreadsheet. Why can't RapidMiner read in the column names from a CSV or similar file and offer a check box indicating that the first row consists solely of labels (rather than data)? It shouldn't burp when the first row contains column names instead of data; it should automatically recognize that the first row has way more type errors than other rows and treat it as a row of labels instead. The MetaData View is darn close to a spreadsheet, so why not allow users to change types and rename attributes right there? As far as I can tell (still getting my sea legs with RapidMiner), screwing up a single attribute name or type can force you either to reload the entire dataset and retype all of the labels from scratch, or to climb the learning curve and run a renaming filter in a process. Talk about overkill; that learning curve is way out of proportion to the complexity of the task. A smarter interface with Excel would also ameliorate this pain: just edit the values in Excel and reimport the data set.
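To make the "way more type errors than other rows" idea concrete, here is a minimal sketch in plain Python. It is only an illustration of the heuristic, not how RapidMiner works: the function names (`is_number`, `first_row_looks_like_header`, `read_table`) and the fallback names `att1`, `att2`, ... are made up for this example, and it assumes that "type error" simply means "does not parse as a number".

```python
import csv
import io


def is_number(cell):
    """Return True if the cell text parses as a float."""
    try:
        float(cell)
        return True
    except ValueError:
        return False


def first_row_looks_like_header(rows):
    """Guess whether rows[0] holds column names rather than data.

    Assumption for this sketch: a column counts as numeric when every
    cell below the first row parses as a number. The first row is taken
    to be a header when it fails to parse in all of those columns.
    """
    if len(rows) < 2:
        return False
    first, body = rows[0], rows[1:]
    numeric_cols = [
        i for i in range(len(first))
        if all(i < len(r) and is_number(r[i]) for r in body)
    ]
    if not numeric_cols:
        return False  # nothing to compare against, so do not guess
    mismatches = sum(1 for i in numeric_cols if not is_number(first[i]))
    return mismatches == len(numeric_cols)


def read_table(text):
    """Split CSV text into (attribute names, data rows)."""
    rows = list(csv.reader(io.StringIO(text)))
    if first_row_looks_like_header(rows):
        return rows[0], rows[1:]
    # Hypothetical placeholder names when no header row is detected.
    names = [f"att{i + 1}" for i in range(len(rows[0]))] if rows else []
    return names, rows


sample = "age,income,city\n34,52000,Boston\n29,48000,Denver\n"
names, data = read_table(sample)
print(names)
print(len(data))
```

A file with only text columns gives this heuristic nothing to go on, which is one reason an explicit check box is still worth having next to any automatic guess.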

MariusHelf · January 2013

Hi,

We are aware that the CSV import in its current state is neither optimal nor intuitive, and we are working on a better solution. That said, everything you request is already possible:

In the second step of the Import CSV wizard (actually of most wizards), you can set the Name annotation for the first row. That way it will be recognized as what it is: column labels.

In the next step you can additionally rename the attributes manually to fit your needs.

If you have further problems, please let us know.

Best regards,
Marius
