[solved] Import excel charset/codepage

greggreg Member Posts: 23 Contributor II
edited November 2018 in Help
Hello

I'm working from an xls file, codepage 1252. When I import it into RM, all accented characters appear as question marks ("?"). If I open it in OpenOffice, they appear as they should.

How can I fix this?

TIA

greg

Answers

  • MariusHelfMariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Did you set the "encoding" parameter of the Read Excel operator? (To see the parameter, you have to enter expert mode by clicking the guy with the hat at the top of the parameter list.)

    Best, Marius
  • greggreg Member Posts: 23 Contributor II
    Thanks for the quick answer :)

    I'm always running in expert mode; however, I didn't use the "Read Excel" operator, I used "file->import data->import excel sheet". Should I use the operator instead? I used the repository feature because I assumed it would make things smoother when I'm sent an updated Excel file.

    greg
  • MariusHelfMariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hm, the import wizard seems to be missing the encoding option. But especially if the Excel file is updated frequently, it would be easier to create a process with the Read Excel operator, configure it once, and use a Store operator to store the result in the repository. That way you only have to re-execute your process to push the updated data into the repo.

    Best, Marius
  • greggreg Member Posts: 23 Contributor II
    OK, thanks. I tried using the operator; I don't have any "encoding" option, but I do have "time zone" and "locale". I set them to "paris" and "french", with the same result: the accented characters are still shown as "?".
  • MariusHelfMariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Which version of RapidMiner are you using? If it is older than 5.2.6, please update to the latest version.

    Time zone and locale only affect date formatting.
  • greggreg Member Posts: 23 Contributor II
    Hum sorry, I was assuming the auto updater would keep me up to date, but it seems it didn't.

    I manually downloaded the latest version, set the encoding parameter, and now it's working fine :)

    Thanks a lot!!

    greg
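
For anyone who wants to see the symptom outside RapidMiner: the short Python sketch below shows how accented characters stored as codepage 1252 survive when the matching charset is used, and turn into "?" when the text is forced through a charset that has no accented letters. This is only an illustration of the general effect; it does not claim to show what the import wizard or the Read Excel operator does internally, and the sample string and variable names are made up for the example.

```python
# Illustration only: reproduces the "?" symptom in plain Python.
# The sample text and variable names are hypothetical.
text = "café déjà vu"

# Bytes as they would be stored under Windows codepage 1252.
raw = text.encode("cp1252")

# Decoding with the matching codepage recovers the accents.
good = raw.decode("cp1252")

# Pushing the text through a charset without accented letters (ASCII here)
# replaces every character it cannot represent with "?".
bad = text.encode("ascii", errors="replace").decode("ascii")

print(good)
print(bad)
```

The first line printed keeps the accents; in the second, each accented letter has been replaced by a question mark, which matches what was described at the top of this thread.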