Import csv: missing

frankiefrankie Member Posts: 26 Contributor II
edited May 2019 in Help

I have RapidMiner installed on three computers. Two of them run the very latest version, and on those the option to use the first row of a data file as the variable names is missing. When I import a file, the names in the first row are simply added to the data, and there is nothing I can do about it. There is no check box to toggle the use of variable names, a feature that existed in earlier versions and still exists on the third computer (running 5.0.1, I think). The toggle isn't just hidden either: enlarging the import wizard window does not make it visible.




  • haddockhaddock Member Posts: 849 Maven
    Hi Frankie,

    I mainly mine databases on the latest version of RM, but occasionally use it to import CSVs into my data heap; I seem to remember the crafty camouflage on 'first row as names' tick-box in the 'Read CSV' operator - it isn't in the wizard, but you need to scroll down the parameters to see it!

    Have fun!

  • frankiefrankie Member Posts: 26 Contributor II
    Thanks, but it was in the wizard and it still is in the wizard on my 5.0 installation of RapidMiner.
    I seldom use the operator so I prefer the wizard for importing new data.
  • colocolo Member Posts: 236 Maven
    Hi frankie,

    the wizard should allow you to set "Annotation" to "name" in step 3 (just click the table cell). If you do this for the first row, that row will be excluded from the data and used to name all the attributes.

    Btw: It seems that the parameter checkbox "use first row as names" for "Read CSV" ignores changes after using the wizard - which can be really annoying.

  • frankiefrankie Member Posts: 26 Contributor II
    Thanks, that clears it up.
  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Yes, this can be annoying, but it should at least post a warning that the setting will be ignored. Does it?

  • colocolo Member Posts: 236 Maven
    Hi Sebastian,

    I just tested it to be sure. Neither the "Problems" nor the "Log" tab provides any information or warning about this.
    Perhaps you could improve the behavior of the parameter in combination with the use of the wizard.

    If I do not set the name annotation for the first row in the wizard and then activate the parameter "use first row as names" afterwards, the first row is excluded from the example set but the attribute keeps the old name "att1". This name seems to be set by the wizard inside the "data set meta data information" parameter list and is not updated when the other parameter is changed.

    The other way around, if I set the name annotation inside the wizard, the "use first row as names" parameter (which is not activated from the wizard's settings automatically) has absolutely no effect. Checking or unchecking it doesn't make any difference for the resulting example set.

    So the only way to change the resulting data is to run the wizard again. In my opinion, wizards should be a help or guide for setting a bunch of parameters more easily. But all the choices made inside the wizard should be reflected in the parameters and allow manual adjustment afterwards.
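
For anyone who wants to see the two outcomes discussed in this thread outside of RapidMiner, here is a minimal Python sketch. It is only an illustration of the expected behavior, not RapidMiner's implementation: the function `read_csv_table` and its `first_row_as_names` flag are hypothetical names made up for this example, and the generic "att1", "att2", ... names simply mirror the default attribute name mentioned above.

```python
import csv
import io


def read_csv_table(text, first_row_as_names=True):
    """Return (attribute_names, data_rows) parsed from CSV text.

    Hypothetical helper for illustration only; not a RapidMiner API.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return [], []
    if first_row_as_names:
        # The first row names the attributes and is excluded from the data.
        return rows[0], rows[1:]
    # Otherwise every row is data and the attributes get generic names.
    names = [f"att{i}" for i in range(1, len(rows[0]) + 1)]
    return names, rows


sample = "age,city\n34,Oslo\n51,Lima\n"
names_on, data_on = read_csv_table(sample, first_row_as_names=True)
names_off, data_off = read_csv_table(sample, first_row_as_names=False)
```

With the flag on, the header row becomes the attribute names and only the two data rows remain. With it off, the header row stays in the data, which is the situation described in the opening post. The inconsistent case reported above (first row dropped, but the name still "att1") would correspond to mixing the two branches.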
