Error writing hdf5 file

DocMusher Member Posts: 333 Unicorn
Hi, could someone please explain why I receive an error when writing an Excel file to an HDF5 file? Is there another way to create HDF5 files, either in RM or externally?
Thanks
Sven
The file is located here: https://github.com/animra/Breast-Cancer-Detection-Using-Data-mining-Techniques-RapidMiner-Tool/blob/main/Breast%20Cancer%20Dataset.xlsx

Best Answer

  • jwpfau Employee, Member Posts: 274 RM Engineering
    Solution Accepted
    Hi Sven,

    The Write HDF5 operator requires a full file path, e.g. "/Users/username/Desktop/breast cancer.hdf5".

    You can use the Folder symbol on the right side of the hdf5 file parameter to select the target folder.
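
If the path is typed or generated rather than picked with the folder button, it can be worth checking that it really is a full path first. The following is only a sketch in Python, outside RapidMiner; the helper name `full_hdf5_path` is hypothetical and not part of any RapidMiner API.

```python
from pathlib import Path


def full_hdf5_path(target: str) -> str:
    """Return `target` as an absolute path string.

    Hypothetical helper: expands a leading '~' and anchors relative
    paths at the current working directory.
    """
    path = Path(target).expanduser()
    if not path.is_absolute():
        path = Path.cwd() / path
    return str(path)


# A bare file name becomes a full path under the current directory.
print(full_hdf5_path("breast cancer.hdf5"))
```

The resulting string can then be pasted into the hdf5 file parameter.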


    Alternatively, since version 9.7 the regular Store operator also writes data in the HDF5 file format inside your RapidMiner repositories.

    Greetings,
    Jonas

Answers

  • jwpfau Employee, Member Posts: 274 RM Engineering
    edited February 2022
    Hi,

    You have to append ?raw=true to the URL to receive the actual binary file from GitHub.


    <?xml version="1.0" encoding="UTF-8"?><process version="9.10.001">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.10.001" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="read_excel" compatibility="9.10.001" expanded="true" height="68" name="Read Excel" width="90" x="45" y="34">
            <parameter key="excel_file" value="https://github.com/animra/Breast-Cancer-Detection-Using-Data-mining-Techniques-RapidMiner-Tool/blob/main/Breast Cancer Dataset.xlsx?raw=true"/>
            <parameter key="sheet_selection" value="sheet number"/>
            <parameter key="sheet_number" value="1"/>
            <parameter key="imported_cell_range" value="A1"/>
            <parameter key="encoding" value="SYSTEM"/>
            <parameter key="first_row_as_names" value="true"/>
            <list key="annotations"/>
            <parameter key="date_format" value=""/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="locale" value="English (United States)"/>
            <parameter key="read_all_values_as_polynominal" value="false"/>
            <list key="data_set_meta_data_information"/>
            <parameter key="read_not_matching_values_as_missings" value="true"/>
          </operator>
          <operator activated="true" class="store" compatibility="9.10.001" expanded="true" height="68" name="Store" width="90" x="179" y="34">
            <parameter key="repository_entry" value="//Local Repository/data/excel"/>
          </operator>
          <connect from_op="Read Excel" from_port="output" to_op="Store" to_port="input"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
        </process>
      </operator>
    </process>
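
For anyone preparing such links outside RapidMiner, the URL step can be sketched in Python. This is a minimal illustration and not part of the process above: the helper name `github_raw_url` is hypothetical, and percent-encoding the spaces is an assumption about what a strict HTTP client expects, not something Read Excel is known to require.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def github_raw_url(blob_url: str) -> str:
    """Return a GitHub blob URL with spaces encoded and raw=true set.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(blob_url)
    # Encode spaces as %20 but leave existing %xx escapes untouched.
    path = quote(parts.path, safe="/%")
    params = [p for p in parts.query.split("&") if p]
    if "raw=true" not in params:
        params.append("raw=true")
    return urlunsplit(
        (parts.scheme, parts.netloc, path, "&".join(params), parts.fragment)
    )


blob = (
    "https://github.com/animra/"
    "Breast-Cancer-Detection-Using-Data-mining-Techniques-RapidMiner-Tool/"
    "blob/main/Breast Cancer Dataset.xlsx"
)
print(github_raw_url(blob))
```

The helper leaves a URL that already ends in ?raw=true unchanged, so it is safe to apply twice.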



    Greetings,
    Jonas
  • DocMusher Member Posts: 333 Unicorn
    Hi Jonas, using your approach I got the following message (see attachment).
    In my original approach I downloaded the Excel file and imported it into RM; next I used the Write HDF5 operator, which generated an error message (see second attachment).

    Thanks for any feedback
    Sven
  • DocMusher Member Posts: 333 Unicorn
    Thanks!!