on-the-fly attribute conversion (xxx2String)

rpanizza Member Posts: 5 Contributor II
edited November 2018 in Help
Hi.
There are several operators for converting an attribute after loading (Numeric2Polynominal, Numeric2Binominal and so on).
So, how can I convert some columns to string type?
I need this because I want to group data by two or more columns. To do that, I would create a new column that concatenates the group-by columns, separated by a pipe character.

Finally: where can I find a list of functions such as +, *, sin, conversion functions...?

Answers

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    Since you already found the operator Numeric2Polynominal, I assume you have something different in mind. Could you give a small example of what you want to achieve? I am not sure I fully understood your point.

    "Finally: where can I find a list of functions such as +, *, sin, conversion functions..."
    There is an operator "FeatureGeneration" in the "Preprocessing" -- "Attributes" -- "Generation" group. Most mathematical functions supported by Java are also supported by this operator:

    "+", "-", "*", "/", "1/", "sin", "cos", "tan", "atan", "exp", "log", "min", "max", "floor", "ceil", "round", "sqrt", "abs", "sgn", "pow"

    Constants are also written as functions: "const[5]()" represents the constant 5.

    The next release (or one of the next ones) will also provide an operator for string-based constructions like substrings etc.

    Cheers,
    Ingo

  • rpanizza Member Posts: 5 Contributor II
    mierswa wrote:

    Hi,

    Since you already found the operator Numeric2Polynominal, I assume you have something different in mind. Could you give a small example of what you want to achieve? I am not sure I fully understood your point.

    I want to perform an operation like this:
    sum one variable (say VEND) grouped by two other variables (BRAND and FAMI).
    The SQL equivalent is:
    select brand, fami, sum(vend) as totvend from tab1 group by brand, fami
    To do this, I want to create a column that is the concatenation of the values of the BRAND and FAMI columns. (I need the concatenated column because RapidMiner doesn't allow grouping by more than one column.)
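
The equivalence described above can be checked outside RapidMiner. The following is only a minimal sketch using Python's built-in sqlite3 module: the table name tab1 and the columns brand, fami and vend come from the post, while the sample rows and the pipe separator are assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample rows for tab1 (brand, fami, vend); not real data.
rows = [
    ("brand1", "fami1", 10.0),
    ("brand1", "fami2", 5.0),
    ("brand1", "fami1", 2.5),
    ("brand2", "fami1", 7.0),
    ("brand2", "fami1", 1.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table tab1 (brand text, fami text, vend real)")
conn.executemany("insert into tab1 values (?, ?, ?)", rows)

# The query from the post: group by both columns directly.
# The two key columns are joined with "|" only to make the results comparable.
by_columns = {
    f"{brand}|{fami}": total
    for brand, fami, total in conn.execute(
        "select brand, fami, sum(vend) as totvend "
        "from tab1 group by brand, fami"
    )
}

# The workaround: group by a single concatenated key.
by_key = dict(
    conn.execute(
        "select brand || '|' || fami as brand_fami, sum(vend) as totvend "
        "from tab1 group by brand || '|' || fami"
    )
)
conn.close()

print(by_columns == by_key)
```

One caveat with this approach: the separator must not occur inside the values themselves, otherwise two different (brand, fami) pairs can produce the same concatenated key.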

    Thanks.
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    Ah, thanks, now I see. This is possible with the AttributeMerge operator, as in the following example (the first 80% of the example just creates a data set similar to yours):

    <operator name="Root" class="Process" expanded="yes">
        <operator name="DataGeneration" class="OperatorChain" expanded="no">
            <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
                <parameter key="attributes_lower_bound" value="10.0"/>
                <parameter key="attributes_upper_bound" value="50.0"/>
                <parameter key="number_of_attributes" value="3"/>
                <parameter key="target_function" value="sum"/>
            </operator>
            <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
                <parameter key="attribute_name_regex" value="att2|att3"/>
                <operator name="BinDiscretization" class="BinDiscretization">
                    <parameter key="use_long_range_names" value="false"/>
                </operator>
            </operator>
            <operator name="ChangeAttributeName" class="ChangeAttributeName">
                <parameter key="new_name" value="brand"/>
                <parameter key="old_name" value="att2"/>
            </operator>
            <operator name="AttributeValueMapper" class="AttributeValueMapper">
                <parameter key="attributes" value="brand"/>
                <parameter key="replace_by" value="brand1"/>
                <parameter key="replace_what" value="range1"/>
            </operator>
            <operator name="AttributeValueMapper (2)" class="AttributeValueMapper">
                <parameter key="attributes" value="brand"/>
                <parameter key="replace_by" value="brand2"/>
                <parameter key="replace_what" value="range2"/>
            </operator>
            <operator name="ChangeAttributeName (2)" class="ChangeAttributeName">
                <parameter key="new_name" value="fami"/>
                <parameter key="old_name" value="att3"/>
            </operator>
            <operator name="AttributeValueMapper (3)" class="AttributeValueMapper">
                <parameter key="attributes" value="fami"/>
                <parameter key="replace_by" value="fami1"/>
                <parameter key="replace_what" value="range1"/>
            </operator>
            <operator name="AttributeValueMapper (4)" class="AttributeValueMapper">
                <parameter key="attributes" value="fami"/>
                <parameter key="replace_by" value="fami2"/>
                <parameter key="replace_what" value="range2"/>
            </operator>
            <operator name="ChangeAttributeName (3)" class="ChangeAttributeName">
                <parameter key="new_name" value="vend"/>
                <parameter key="old_name" value="att1"/>
            </operator>
            <operator name="FeatureNameFilter" class="FeatureNameFilter">
                <parameter key="filter_special_features" value="true"/>
                <parameter key="skip_features_with_name" value="label"/>
            </operator>
        </operator>
        <operator name="AttributeMerge" class="AttributeMerge">
            <parameter key="first_attribute" value="brand"/>
            <parameter key="second_attribute" value="fami"/>
        </operator>
        <operator name="Aggregation" class="Aggregation">
            <parameter key="aggregation_attribute" value="vend"/>
            <parameter key="aggregation_function" value="sum"/>
            <parameter key="group_by_attribute" value="brand_fami"/>
            <parameter key="keep_example_set" value="false"/>
        </operator>
    </operator>
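
For readers without RapidMiner at hand, the last two steps of the process (merge two attributes, then sum per merged value) can be sketched in plain Python. This is an illustration of the idea only, not RapidMiner's implementation: the helper names merge_attributes and aggregate_sum, the sample rows, and the "_" separator between values are assumptions. Only the attribute names brand, fami, vend and brand_fami are taken from the process above.

```python
from collections import defaultdict


def merge_attributes(rows, first, second, separator="_"):
    """Return copies of the rows with an extra column named '<first>_<second>'.

    The separator placed between the two values is an assumption.
    """
    merged_name = f"{first}_{second}"
    return [
        {**row, merged_name: f"{row[first]}{separator}{row[second]}"}
        for row in rows
    ]


def aggregate_sum(rows, value_column, group_by):
    """Sum value_column for each distinct value of the group_by column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_by]] += row[value_column]
    return dict(totals)


# Hypothetical example set; not produced by the generator in the process.
examples = [
    {"brand": "brand1", "fami": "fami1", "vend": 10.0},
    {"brand": "brand1", "fami": "fami2", "vend": 5.0},
    {"brand": "brand1", "fami": "fami1", "vend": 2.5},
    {"brand": "brand2", "fami": "fami1", "vend": 7.0},
]

merged = merge_attributes(examples, "brand", "fami")
totals = aggregate_sum(merged, "vend", "brand_fami")
print(totals)
```

Each distinct brand_fami value ends up with one total, which corresponds to one row per (brand, fami) group in the SQL query.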

    By the way: we already added multiple aggregations and groups to our todo list...

    Cheers,
    Ingo