Export CSV creates scientific notation for very small reals

f_laperna Member Posts: 13 Contributor I
edited December 2018 in Help

Hi, I'm having this issue with RM when trying to export a dataset to a csv file. I have two columns with real numbers which can be very small and I noticed in the .csv file the scientific notation is used.

How can I turn it off and write the full number with all the decimal places? I checked the "Write CSV" node's documentation but didn't find anything about it. 

 

Thanks in advance.


Answers

  • SGolbert RapidMiner Certified Analyst, Member Posts: 225   Unicorn

    Hi,

     

    I don't think it is possible with the Write CSV Operator. I think it is a rare use case.

     

    If you absolutely need this format, you will have to use one of the scripting operators (R, Python, Groovy). As far as I know, it is not so simple in R either: you would have to change the default options for numerical conversion (with options(scipen = 15), which makes R prefer fixed-point notation unless it would be more than 15 digits wider than the scientific form).
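
For the Python route, the idea can be sketched with the standard library alone. This is a minimal sketch, not a RapidMiner feature: the helper names `to_fixed` and `rows_to_csv` are hypothetical, and the sample values are only illustrative. Each float is converted to its shortest round-trip text and then expanded to plain digits before it is written.

```python
import csv
import io
from decimal import Decimal


def to_fixed(value: float) -> str:
    """Return a float as plain decimal text, without an exponent."""
    # repr() gives the shortest string that round-trips the float;
    # Decimal's 'f' format then expands any exponent into plain digits.
    return format(Decimal(repr(value)), "f")


def rows_to_csv(header, rows) -> str:
    """Build CSV text, writing every float in fixed-point notation."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        # Only floats are reformatted; other cell values pass through unchanged.
        writer.writerow([to_fixed(v) if isinstance(v, float) else v for v in row])
    return buffer.getvalue()


print(rows_to_csv(["a", "b"], [[0.000000000015, 0.00000000000000000016]]))
```

Unlike a fixed number of decimal places, this approach does not pad values with trailing zeros, so each number is only as long as it needs to be.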

     

    Regards,

    Sebastian

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 672   Unicorn

    Hi @f_laperna,

     

    Here is an example process (to adapt to your data) that uses Python to write the full number, with all its decimal places, to a .csv file.

    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.000">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.1.000" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="generate_data_user_specification" compatibility="8.1.000" expanded="true" height="68" name="Generate Data by User Specification" width="90" x="112" y="85">
    <list key="attribute_values">
    <parameter key="a" value="0.000000000015"/>
    <parameter key="b" value="0.00000000000000000016"/>
    </list>
    <list key="set_additional_roles"/>
    </operator>
    <operator activated="true" class="python_scripting:execute_python" compatibility="7.4.000" expanded="true" height="82" name="Execute Python" width="90" x="313" y="85">
    <parameter key="script" value="import pandas as pd&#10;&#10;# rm_main is a mandatory function,&#10;# the number of arguments has to be the number of input ports (can be none)&#10;def rm_main(data):&#10;&#10;    # Output path: adjust to your own environment&#10;    file = 'C:/Users/Lionel/Documents/Formations_DataScience/Rapidminer/Tests_Rapidminer/Precision_Python/test_precision_Python.csv'&#10;    # Fixed-point format with 20 decimal places (no scientific notation)&#10;    precision = '%.20f'&#10;&#10;    data = pd.DataFrame(data)&#10;&#10;    # float_format controls how floats are written to the CSV file&#10;    data.to_csv(path_or_buf=file, float_format=precision)&#10;&#10;    # connect 2 output ports to see the results&#10;    return data"/>
    </operator>
    <connect from_op="Generate Data by User Specification" from_port="output" to_op="Execute Python" to_port="input 1"/>
    <connect from_op="Execute Python" from_port="output 1" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
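
To check the formatting outside RapidMiner, the core of the script can be tried on its own. This is a minimal sketch assuming pandas is installed; the function name `frame_to_fixed_csv` and the in-memory buffer are illustrative choices, not part of the process above.

```python
import io

import pandas as pd


def frame_to_fixed_csv(frame: pd.DataFrame, decimals: int = 20) -> str:
    """Return the frame as CSV text with floats in fixed-point notation."""
    buffer = io.StringIO()
    # float_format is applied to every float cell; index=False drops the row index column.
    frame.to_csv(buffer, float_format=f"%.{decimals}f", index=False)
    return buffer.getvalue()


# Same two sample values as in the Generate Data operator of the process.
example = pd.DataFrame({"a": [0.000000000015], "b": [0.00000000000000000016]})
print(frame_to_fixed_csv(example))
```

Note that a fixed number of decimals pads shorter values with trailing zeros and rounds away anything smaller than the last decimal place, so `decimals` should be chosen to cover the smallest values in the data.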

    I hope it helps,

     

    Best regards,

     

    Lionel

     
