How to read...

pablucu5 Member Posts: 15 Contributor II
edited November 2018 in Help
Is it possible to read, with RapidMiner, the file format used for the datasets on this site? http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/

Here is an example: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/a1a

Answers

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    Sure. The format is called "sparse" format, so it can be read with the operator "Read Sparse". Please note that you have to either define an attribute definition file (known as .aml) or at least specify the number of dimensions; in your example this would be 123.

    Here is a little example:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.1.006">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.1.006" expanded="true" name="Process">
        <process expanded="true" height="100" width="145">
          <operator activated="true" class="read_sparse" compatibility="5.1.006" expanded="true" height="60" name="Read Sparse" width="90" x="45" y="30">
            <parameter key="format" value="yx"/>
            <parameter key="data_file" value="C:\Users\Ingo Mierswa\Desktop\a1a.dat"/>
            <parameter key="dimension" value="123"/>
            <list key="prefix_map"/>
          </operator>
          <connect from_op="Read Sparse" from_port="output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    Cheers,
    Ingo
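
To illustrate why the number of dimensions has to be given, here is a minimal Python sketch of how one line of such a sparse file expands into a dense row. It is not part of RapidMiner and not how "Read Sparse" is implemented. The function name `parse_sparse_line` and the sample line are hypothetical, and the sketch assumes that each line holds the label first, followed by `index:value` pairs whose indices start at 1.

```python
def parse_sparse_line(line, dimension):
    """Expand one 'label index:value ...' line into (label, dense_row).

    Assumption: indices are 1-based and every index not listed is zero.
    """
    tokens = line.split()
    label = tokens[0]
    row = [0.0] * dimension  # the file itself never says how long a row is
    for token in tokens[1:]:
        index, value = token.split(":", 1)
        position = int(index) - 1  # convert the 1-based index to a list position
        if not 0 <= position < dimension:
            raise ValueError(f"index {index} is outside 1..{dimension}")
        row[position] = float(value)
    return label, row


# Hypothetical line in the style of the a1a file (not copied from it).
sample = "-1 3:1 11:1 14:1"
label, row = parse_sparse_line(sample, 123)
print(label, len(row), [i + 1 for i, v in enumerate(row) if v != 0.0])
```

Only the non-zero entries are stored in the file, so a reader cannot tell the full row length from any single line. Under the assumptions above, this is the information that the "dimension" parameter (or an .aml file) supplies.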