Error reading Excel files - issue probably due to merged cells in first row

hervedelhalle Member Posts: 14 Contributor II
edited September 2020 in Help
Hello everybody,

I am simply trying to import data from Excel files.
Unfortunately, the Read Excel operator fails with an error (the screenshot of the error message is not preserved here).

I noticed that this is most probably because cell A1 is merged with cell B1.
Indeed, my process works with the "Unmerged.xlsx" file but fails with the "Merge.xlsx" file.

Could you please help me solve this issue?

The process and the example files are attached.

Best regards

Hervé

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    You can either skip that row entirely (and add your attribute names manually afterwards if needed) or unmerge the cells before reading the file in.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • hervedelhalle Member Posts: 14 Contributor II
    Dear Telcontar120,

    Thank you for your reply. I tried to skip this row using the "imported cell range" parameter (with a value of A2:B4, for example), but I still get the error message. Is there another way to skip rows in Excel files?

    Is there a way to unmerge cells within a RapidMiner process?

    Best regards

    Hervé
  • hervedelhalle Member Posts: 14 Contributor II
    Hello everybody,

    I managed to solve my problem using the Python Scripting extension.
    It is unfortunate that RapidMiner's Read Excel operator cannot handle merged cells in the first row.

    Best regards.

    Hervé
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    There is no way to have RapidMiner remove the merged-cell formatting, but in Read Excel you can uncheck the "first row as names" option (see screenshot), then open "Edit List" under annotations just below it and mark whichever rows you want to skip as comments.

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • hervedelhalle Member Posts: 14 Contributor II

    Thank you for your reply. Could you please let me know how to use the annotations parameter to skip rows?

    Best regards

    Hervé
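
For readers who want to try the Python Scripting route mentioned earlier in the thread, here is a minimal sketch. It is not Hervé's actual script, which was not posted. It assumes that pandas (with openpyxl) is installed in the Python environment used by the Execute Python operator, that "Merge.xlsx" is reachable from the working directory, and that the merged cell sits in the first row; the helper name `forward_fill_header` is hypothetical. The idea is to read the sheet without treating the first row as names, then give every column covered by a merged cell the merged cell's label.

```python
def _is_empty(value):
    """True for None, NaN (NaN is the only value not equal to itself) or blank text."""
    return value is None or value != value or str(value).strip() == ""


def forward_fill_header(header):
    """Build column names from a header row that contains merged cells.

    Excel readers report a merged range as one value in its top-left cell
    followed by empty cells, so each empty entry inherits the last non-empty
    label. A numeric suffix tells repeated names apart.
    """
    names = []
    seen = {}
    last = None
    for position, value in enumerate(header):
        if not _is_empty(value):
            last = str(value).strip()
        # Fall back to a positional name if the row starts with empty cells.
        base = last if last is not None else f"column_{position + 1}"
        seen[base] = seen.get(base, 0) + 1
        names.append(base if seen[base] == 1 else f"{base}_{seen[base]}")
    return names


def rm_main():
    """Entry point called by the Execute Python operator (sketch, assumed layout)."""
    import pandas as pd  # imported here so the helper above has no dependencies

    # header=None keeps the merged first row as ordinary data instead of names.
    raw = pd.read_excel("Merge.xlsx", header=None)
    names = forward_fill_header(raw.iloc[0].tolist())

    # Drop the header row from the data and apply the rebuilt names.
    data = raw.iloc[1:].reset_index(drop=True)
    data.columns = names
    return data
```

If the merged first row is only a title and the real attribute names are in the second row, passing `skiprows=1` to `pd.read_excel` and dropping the helper may be enough.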