
[SOLVED]Read 'Transaction' file problem

wujiang Member Posts: 12 Contributor II
edited November 2018 in Help
Hello,

I want to read a file like this:

a b c d e f g h i j k l m n o p q r s t u v w x y z aa bb cc dd
30 31 32
33 34 35
36 37 38 39 40 41 42 43 44 45 46
38 39 47 48
38 39 48 49 50 51 52 53 54 55 56 57 58
32 41 59 60 61 62
3 39 48
63 64 65 66 67 68
32 69
48 70 71 72

I use the 'Read CSV' operator with a single space (' ') as the column separator,

so I get a dataset like this:

30 31 32  ? ?  ? ? ?....
33 34 35  ? ?  ? ....
36 37 38 39 40 41 42 43 44 45 46
...

Once I click 'OK', RapidMiner shows me this error message:

Message: An attribute 2 was specified for column 2, but this column does not exist in input data.

What I want is an integer array for this data. How can I deal with it?

Thanks in advance.

Answers

  • fras Member Posts: 93 Contributor II
    Should do the job:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.0.003">
      <context>
        <input/>
        <output/>
        <macros>
          <macro>
            <key>input</key>
            <value>http://pastebin.com/raw.php?i=3guyR6H5</value>
          </macro>
        </macros>
      </context>
      <operator activated="true" class="process" compatibility="6.0.003" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="read_csv" compatibility="6.0.003" expanded="true" height="60" name="Read CSV" width="90" x="45" y="75">
            <parameter key="csv_file" value="C:\Users\fras\AppData\Local\Temp\rm_file_6786169822975474295.dump"/>
            <parameter key="column_separators" value="'\n&quot;"/>
            <parameter key="first_row_as_names" value="false"/>
            <list key="annotations"/>
            <parameter key="encoding" value="windows-1252"/>
            <list key="data_set_meta_data_information">
              <parameter key="0" value="att1.true.polynominal.attribute"/>
            </list>
          </operator>
          <operator activated="true" class="split" compatibility="6.0.003" expanded="true" height="76" name="Split" width="90" x="246" y="75">
            <parameter key="split_pattern" value="\s"/>
          </operator>
          <operator activated="true" class="rename_by_example_values" compatibility="6.0.003" expanded="true" height="76" name="Rename by Example Values" width="90" x="447" y="75"/>
          <connect from_op="Read CSV" from_port="output" to_op="Split" to_port="example set input"/>
          <connect from_op="Split" from_port="example set output" to_op="Rename by Example Values" to_port="example set input"/>
          <connect from_op="Rename by Example Values" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
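For readers new to RapidMiner process XML, the idea behind the process above can be sketched outside RapidMiner. This is only a rough Python illustration of the three steps (read each line as a single column, split it on whitespace, use the first row as attribute names), not the operators' actual implementation. The function names `split_rows` and `first_row_as_names` are hypothetical, and `None` stands in for the missing value that RapidMiner displays as '?'.

```python
from typing import List, Optional, Tuple

Row = List[Optional[str]]


def split_rows(lines: List[str]) -> List[Row]:
    """Split every line on whitespace and pad short rows with None.

    Mirrors the idea of reading each line as one column and then
    splitting it with the pattern \\s: rows of different lengths end up
    in one rectangular table with missing values at the end.
    """
    rows = [line.split() for line in lines if line.strip()]
    width = max((len(row) for row in rows), default=0)
    return [row + [None] * (width - len(row)) for row in rows]


def first_row_as_names(rows: List[Row]) -> Tuple[Row, List[Row]]:
    """Use the first row as column names and return the remaining rows."""
    header, *data = rows
    return header, data


# Shortened sample in the same shape as the file in the question.
sample = ["a b c d", "30 31 32", "36 37 38 39"]
names, data = first_row_as_names(split_rows(sample))
print(names)
print(data)
```

Because every line is first read whole, the reader never has to guess the number of columns from the first data row, which appears to be what the original 'Read CSV' setup with a space separator stumbled over.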

  • wujiang Member Posts: 12 Contributor II
    Thanks fras,

    It works, but the "Rename by Example Values" operator shows an error like this:
    The exampleset must contain at least 1 examples with parameter "row_number" set to "1".  Even though the error appears, your process works :)

    Actually, I have not been using RM for long, and this is the first time I have used RM's XML. Could you explain your answer a little?

    I have to work with the output example set (the output data is the input data for my own operator), and I can't use the '?' value. Do you have any idea how to solve this problem? I tried to replace '?' with '-1', but "Replace Missing Values" does not recognize the '?'.

    Now I use 'Nominal to Numerical' to transform "String" to "Integer". But what I get is "0 0 0 0 0 0 0 0" and "111....".

    Am I using the wrong operator to generate the Integer type?
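The 0/1 output described above looks like what a dummy (one-hot) encoding of string values produces, as opposed to parsing the digits as numbers. That reading is an assumption, since the thread does not confirm how the operator was configured. The Python sketch below only illustrates the difference between the two conversions. The helpers `to_int_rows` and `dummy_code` are hypothetical names and not RapidMiner API, and `-1` is the fill value mentioned in the post.

```python
from typing import Dict, List, Optional


def to_int_rows(rows: List[List[Optional[str]]], fill: int = -1) -> List[List[int]]:
    """Parse each cell as an integer; missing cells (None or '?') become `fill`."""
    return [
        [fill if cell is None or cell == "?" else int(cell) for cell in row]
        for row in rows
    ]


def dummy_code(column: List[Optional[str]]) -> Dict[str, List[int]]:
    """One-hot encode a string column: one 0/1 indicator list per distinct value."""
    categories = sorted({value for value in column if value is not None})
    return {c: [1 if value == c else 0 for value in column] for c in categories}


rows = [["30", "31", None], ["33", "?", "35"]]
print(to_int_rows(rows))               # integers, with -1 where a value is missing
print(dummy_code(["30", "33", "30"]))  # 0/1 indicators, not the numbers themselves
```

Parsing keeps the numeric meaning of each cell, while dummy coding only records which distinct string occurred in each row, so it yields columns of zeros and ones whatever the digits are.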