
[SOLVED] Numeric values in CSV files were not correctly imported using Read CSV

huaiyanggongzi Member Posts: 39 Contributor II
edited November 2018 in Help
I am using the "Read CSV" operator to import a CSV file and have run into several issues with it. The following process illustrates one of them: it reads a CSV file and writes it to another CSV file. In the original CSV file, the first column is 3E+11, but the output CSV file has an empty cell at the corresponding position.


Here is the input csv file
column1 column2
3.10E+11 F
3.10E+11 F
Here is the output csv file
column1 column2
                   F
                   F
The following is the process script
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.008">
 <context>
   <input/>
   <output/>
   <macros/>
 </context>
 <operator activated="true" class="process" compatibility="5.3.008" expanded="true" name="Process">
   <description>This getting started process shows the first step of learning and storing a model.
After a model is learned, you can load (Retrieve operator) the model and apply it to a test data set (see 2. Getting Started: Retrieve and Apply Model). The process is NOT concerned with evaluation of the model.

This process will not immediately run in RapidMiner because you have to adjust the repository path in the Retrieve operator.

Tags: Rapidminer, model, learn, learning, store, first step</description>
   <process expanded="true">
     <operator activated="true" class="read_csv" compatibility="5.3.008" expanded="true" height="60" name="Read CSV" width="90" x="45" y="75">
       <parameter key="csv_file" value="C:\Users\Desktop\test\test7.csv"/>
       <parameter key="column_separators" value=","/>
       <parameter key="first_row_as_names" value="false"/>
       <list key="annotations">
         <parameter key="0" value="Name"/>
       </list>
       <parameter key="encoding" value="GBK"/>
       <list key="data_set_meta_data_information">
         <parameter key="0" value="column1.true.real.attribute"/>
         <parameter key="1" value="column4.true.binominal.label"/>
       </list>
     </operator>
     <operator activated="true" class="write_csv" compatibility="5.3.008" expanded="true" height="76" name="Write CSV" width="90" x="514" y="165">
       <parameter key="csv_file" value="C:\Users\Desktop\test\test7-copy.csv"/>
       <parameter key="column_separator" value=","/>
     </operator>
     <connect from_op="Read CSV" from_port="output" to_op="Write CSV" to_port="input"/>
     <connect from_op="Write CSV" from_port="through" to_port="result 1"/>
     <portSpacing port="source_input 1" spacing="0"/>
     <portSpacing port="sink_result 1" spacing="0"/>
     <portSpacing port="sink_result 2" spacing="0"/>
   </process>
 </operator>
</process>
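
For comparison, here is a minimal Python sketch of the round trip the process above is expected to perform. It is only an illustration outside RapidMiner, not a description of how Read CSV works internally. The sample text assumes the file is comma-separated (as set by column_separators in the process), and the function name round_trip is hypothetical.

```python
import csv
import io

# Sample data mirroring the input file from the post.
# Assumption: the real file is comma-separated, matching column_separators=",".
SAMPLE = "column1,column2\n3.10E+11,F\n3.10E+11,F\n"


def round_trip(text):
    """Read CSV text, parse the first column as a real number, write it back."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)  # first row holds the column names

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)

    for row in reader:
        # Scientific notation such as 3.10E+11 is a valid real-number literal.
        value = float(row[0])
        writer.writerow([value, row[1]])

    return out.getvalue()


if __name__ == "__main__":
    print(round_trip(SAMPLE))
```

In this sketch the first column survives the round trip as a number instead of becoming an empty cell, which is the behavior one would expect from the Read CSV / Write CSV pair.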

Answers

  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,996 RM Engineering
    Hi,

    thank you for reporting this. Looks like you stumbled across a bug. I have created a ticket in our internal issue tracker for it.

    Regards,
    Marco