AttributeCreation IF Syntax

MuehliManMuehliMan Member Posts: 85 Maven
edited November 2018 in Help
Hi,

I tried to build an operator that creates a new attribute (e.g. CLASS) that is "ACTIVE" for entries with a (numerical) activity higher than 50 and "INACTIVE" for entries with an activity below 50.
My first attempt was using the AttributeConstruction operator, but I couldn't figure out how to write a function that distinguishes between actives and inactives.

Can anybody help me with this?

Thanks in advance.
Markus

Answers

  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Markus,
    the AttributeConstruction Operator is the right tool to achieve this. Here's how it works:
    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="number_of_attributes" value="1"/>
            <parameter key="target_function" value="simple sinus"/>
        </operator>
        <operator name="AttributeConstruction" class="AttributeConstruction">
            <list key="function_descriptions">
              <parameter key="target" value="if(att1 &gt; 5, &quot;true&quot;, &quot;false&quot;)"/>
            </list>
        </operator>
    </operator>
    Greetings,
      Sebastian
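
    Applied to the original question, the same pattern could look like the minimal sketch below. The attribute names "activity" and "CLASS" are assumptions taken from the question, not from a real data set. Note that with a strict "greater than" test, an activity of exactly 50 ends up as "INACTIVE".

    ```xml
    <operator name="AttributeConstruction" class="AttributeConstruction">
        <list key="function_descriptions">
          <!-- key = name of the new attribute (assumed: CLASS) -->
          <!-- "activity" is an assumed name for the numerical input attribute -->
          <parameter key="CLASS" value="if(activity &gt; 50, &quot;ACTIVE&quot;, &quot;INACTIVE&quot;)"/>
        </list>
    </operator>
    ```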
  • MuehliManMuehliMan Member Posts: 85 Maven
    Hello again!

    That works great for an active/inactive switch. But what if I want more than two categories? For example, I want to distinguish between low, medium, heavy and very heavy depending on the activity, so I have boundaries for each category. Do I have to nest some if clauses, is there a way to program multiple switches, or is there a special operator that does the classification for me?

    As always thx in advance.
    Markus
  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Markus,
    take a look at the UserBasedDiscretization. This should solve your problems...

    Greetings,
      Sebastian
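
    For comparison, the nested-if route asked about above might be sketched as follows. This assumes the expression parser of AttributeConstruction accepts nested if calls; the attribute name "activity" and the boundaries 25, 50 and 75 are purely hypothetical. With more than a few categories this gets hard to read, which is one reason to prefer a discretization operator.

    ```xml
    <operator name="AttributeConstruction" class="AttributeConstruction">
        <list key="function_descriptions">
          <!-- Conditions are checked from the highest boundary down; -->
          <!-- the last branch catches everything at or below 25. -->
          <!-- Attribute name and boundaries are hypothetical. -->
          <parameter key="CLASS" value="if(activity &gt; 75, &quot;very heavy&quot;, if(activity &gt; 50, &quot;heavy&quot;, if(activity &gt; 25, &quot;medium&quot;, &quot;low&quot;)))"/>
        </list>
    </operator>
    ```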
  • MuehliManMuehliMan Member Posts: 85 Maven
    Hi Sebastian,

    I figured out that this code will do the classification I want:
        <operator name="Classify data" class="OperatorChain" expanded="yes">
            <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
                <parameter key="attribute_name_regex" value="AA"/>
                <parameter key="condition_class" value="attribute_name_filter"/>
                <parameter key="deliver_inner_results" value="true"/>
                <parameter key="parameter_string" value="CSA"/>
                <operator name="UserBasedDiscretization" class="UserBasedDiscretization" breakpoints="after">
                    <list key="classes">
                      <parameter key="low" value="4.5"/>
                      <parameter key="medium" value="7.5"/>
                    </list>
                    <parameter key="return_preprocessing_model" value="true"/>
                </operator>
            </operator>
        </operator>
    I put an AttributeSubsetPreprocessing operator in front because I don't want to discretize all (numeric) columns. This works fine, but it actually removes all other columns, although the "keep subset only" option is not checked.
    So how do I classify according to one column and change just this one column without removing all the others?

    Best wishes,
    Markus
  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Markus,
    it doesn't do this. Since there was a breakpoint within the AttributeSubsetPreprocessing, I assume that you looked at the data before the AttributeSubsetPreprocessing had finished and merged the example sets again.

    Greetings,
      Sebastian
  • MuehliManMuehliMan Member Posts: 85 Maven
    Thanks, I could have figured that out by myself!

    Markus
  • MuehliManMuehliMan Member Posts: 85 Maven
    Now I am almost where I want to be. I am now grouping my data into different clusters by a user-defined criterion. What I finally want is to run a decision tree or rule learner and cluster according to these rules.
    Again I searched for an appropriate Operator but I couldn't find it.

    Best wishes,
    Markus
  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Markus,
    each learner needs a labeled example set. If your new attribute should be the label, use the ChangeAttributeRole operator to define this attribute as the label. It is then the target of the subsequent analysis.

    Greetings,
      Sebastian
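
    A minimal sketch of that step, assuming the discretized attribute is called "AA" as in the earlier process. The parameter keys "name" and "target_role" are written from memory and should be checked against the operator's parameter list in your RapidMiner version.

    ```xml
    <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
        <!-- assumed parameter keys; "AA" is the attribute discretized earlier -->
        <parameter key="name" value="AA"/>
        <parameter key="target_role" value="label"/>
    </operator>
    ```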
  • MuehliManMuehliMan Member Posts: 85 Maven
    But a decision tree does not create an output column the way a clustering algorithm does, does it? I want the decision tree to assign an ID to each group of examples according to the rules it found.
    Cubist (by RuleQuest) does something like this (at least as far as I understand), and I want to implement this with RM.

    Best wishes,
    Markus
  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    each learner produces a model. You can then apply this model to an example set by using, ta-da, the "ModelApplier".
    Since it doesn't make much sense to apply a model to the training data, this isn't done automatically.

    Greetings,
      Sebastian
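
    Put together, learning and applying might be sketched like this. It assumes the example set already has a label, that the learner is the "DecisionTree" operator, and that it offers a "keep_example_set" parameter so the data is still available for the ModelApplier; all three are assumptions to verify in your installation.

    ```xml
    <operator name="Learn and apply" class="OperatorChain" expanded="yes">
        <operator name="DecisionTree" class="DecisionTree">
            <!-- assumed parameter: pass the example set on together with the model -->
            <parameter key="keep_example_set" value="true"/>
        </operator>
        <!-- applies the learned model and adds a prediction column to the example set -->
        <operator name="ModelApplier" class="ModelApplier"/>
    </operator>
    ```

    The prediction attribute added by the ModelApplier then plays the role of the per-example group assignment asked for above.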
  • MuehliManMuehliMan Member Posts: 85 Maven
    Works fine, thanks for all the highly appreciated help!

    Best wishes,
    Markus