Changing data type of label

ml_guy Member Posts: 11 Contributor II
edited November 2018 in Help
Hello everybody,

I am trying to change the data type of columns from a CSV data source from int to boolean. When I apply the Numerical2Binominal operator to the data, it converts the features but leaves the label unchanged, so I am unable to run Naive Bayes on it.

How can I change the data type of the label to boolean (for classification tasks)?

Thanks in advance.

Answers

  • earmijo Member Posts: 263 Unicorn
    It's very simple. You can use AttributeSubsetPreprocessing.

    Suppose your data are:

    label x1 x2
    0 1 0.5
    0 2 0.7
    1 3 0.8
    1 4 0.9


    You can use the following code. Note that in AttributeSubsetPreprocessing I've checked "process_special_attributes". Labels are special attributes, so the inner operator only touches them when this option is set.
    <operator name="Root" class="Process" expanded="yes">
        <!-- Read the CSV file and use the "label" column as the label -->
        <operator name="CSVExampleSource" class="CSVExampleSource">
            <parameter key="filename" value="c:\data.csv"/>
            <parameter key="label_name" value="label"/>
        </operator>
        <!-- Apply the inner operator only to attributes whose name matches "label" -->
        <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
            <parameter key="condition_class" value="attribute_name_filter"/>
            <parameter key="parameter_string" value="label"/>
            <parameter key="attribute_name_regex" value="label"/>
            <!-- Labels are special attributes, so include them explicitly -->
            <parameter key="process_special_attributes" value="true"/>
            <!-- Convert the selected numerical attribute to a binominal one -->
            <operator name="Numerical2Binominal" class="Numerical2Binominal">
            </operator>
        </operator>
    </operator>
  • ml_guy Member Posts: 11 Contributor II
    Thanks earmijo.

    It was really helpful.

    Nice  :)
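
For readers who want to check what the conversion in the answer above does to the label column, here is a minimal sketch outside RapidMiner, in plain Python. It is not RapidMiner code. The function name `label_to_binominal`, the comma-separated layout of the toy data, and the mapping rule (values inside a `[lo, hi]` range, here 0 to 0, become "false", everything else "true") are assumptions made for illustration; check the operator's documentation for the exact behaviour in your version.

```python
import csv
import io

# Same toy data as in the answer, written as comma-separated text (assumed layout).
CSV_TEXT = """label,x1,x2
0,1,0.5
0,2,0.7
1,3,0.8
1,4,0.9
"""


def label_to_binominal(rows, label_name="label", lo=0.0, hi=0.0):
    """Return copies of the rows with only the label column converted.

    Assumed rule: label values inside [lo, hi] map to "false",
    all other values map to "true". Feature columns are left untouched.
    """
    converted = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not modified
        value = float(row[label_name])
        new_row[label_name] = "false" if lo <= value <= hi else "true"
        converted.append(new_row)
    return converted


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
converted = label_to_binominal(rows)
labels = [row["label"] for row in converted]
print(labels)  # ['false', 'false', 'true', 'true']
```

Only the label is converted here, mirroring the attribute filter on "label" in the process; the features x1 and x2 stay numerical.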