Why not edit metadata directly instead of requiring run-time postprocessing?

crunchcrunch Member Posts: 3 Contributor I
edited November 2018 in Help
1) It is VERY frustrating to be forced to manually enter attribute labels when they are so often contained in a file or in the first row of a spreadsheet. Why can't RapidMiner read in the column names from a CSV or similar file and offer a check box indicating that the first row consists solely of labels (rather than data)? It shouldn't choke when the first row contains column names instead of data; it should automatically recognize that the first row has far more type errors than the other rows and treat it as a row of labels. The MetaData View is darn close to a spreadsheet, so why not allow users to change types and rename attributes right there? As far as I can tell (still getting my sea legs with RapidMiner), getting a single attribute name or type wrong can force you to reload the entire dataset and retype all of the labels from scratch, or to climb the learning curve and run a renaming filter in a process. Talk about overkill; that learning curve is way out of proportion to the complexity of the task. A smarter interface with Excel would also ease this pain: just edit the values in Excel and reimport the data set.

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi,

    We are aware that the CSV import in its current state is neither optimal nor intuitive, and we are working on a better solution. That said, everything you request is already possible:

    In the second step of the Import CSV wizard (and of most other import wizards), you can set the Name annotation on the first row. That way the row is recognized as what it is: column labels.

    Furthermore, in the next step you can additionally rename the attributes manually to fit your needs.

    If you have further problems, please let us know.

    Best regards,
    Marius
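
For readers who want to see the idea outside the wizard, here is a minimal Python sketch of the two steps discussed in this thread: guessing whether the first row holds labels (the "more type errors than other rows" heuristic from the question) and then renaming attributes. It is not RapidMiner code and does not reflect how the Import CSV wizard is implemented. The function names `first_row_is_header` and `read_table`, the `renames` mapping, and the `att1`, `att2`, ... placeholder names are all assumptions made for illustration.

```python
import csv
import io


def is_number(cell):
    """Return True if the cell parses as a float."""
    try:
        float(cell)
        return True
    except ValueError:
        return False


def first_row_is_header(rows):
    """Heuristic (illustrative only): treat the first row as labels if it has
    a non-numeric cell in a column where every later row is numeric."""
    if len(rows) < 2:
        return False
    header, body = rows[0], rows[1:]
    for col, cell in enumerate(header):
        column_numeric = all(col < len(r) and is_number(r[col]) for r in body)
        if column_numeric and not is_number(cell):
            return True
    return False


def read_table(text, renames=None):
    """Parse CSV text, use the first row as names when it looks like a header,
    then apply an optional {old_name: new_name} mapping (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(text)))
    if first_row_is_header(rows):
        names, data = rows[0], rows[1:]
    else:
        # No header detected: fall back to placeholder names.
        names = [f"att{i + 1}" for i in range(len(rows[0]))] if rows else []
        data = rows
    renames = renames or {}
    names = [renames.get(name, name) for name in names]
    return names, data


sample = "id,temp,city\n1,20.5,Berlin\n2,18.0,Paris\n"
names, data = read_table(sample, renames={"temp": "temperature"})
```

The heuristic is deliberately crude: a file whose columns are all text gives it nothing to compare against, which is one reason an explicit "first row is a header" setting, like the Name annotation described above, is the more reliable choice.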