How to construct the cross-product/Cartesian-product attribute in RapidMiner

yuyu Member Posts: 3 Contributor I
edited November 2018 in Help
I am trying to construct a cross-product attribute in RapidMiner, but haven't found a way yet.

Suppose attributes A1 (level = 1, 2 or 3) and A2 (contribution = 1, 2, 3 or 4) are known. I now want to construct the new attribute A3 = A1 X A2, e.g. one value of the new attribute would be (level=2, contribution=3).

Which operator should I use: AttributeConstruction, AttributeMerge, or another one? I don't see a cross-product formula in AttributeConstruction, and since, as I understand it, a cross product takes the vector or spatial position into account, simply merging the two attributes may not be appropriate either.

So how can I build the new attribute? I'd appreciate your suggestions.

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    it's currently not clear to me what you are trying to do. Do you wish to concatenate nominal attributes? Then use the attribute construction.
    Do you wish to multiply a fixed set of numerical attributes? Then use the attribute construction as well, and build the cross product there.
    There's no cross-product operator available in RapidMiner, because either way you would have to specify which attributes belong to the first vector and which to the second. You can simply type the formula into the AttributeConstruction operator.

    Greetings,
      Sebastian
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    well, it is rather likely that you are searching for the operator "Cartesian" (surprise!  ;) ). Here is a sample process (for RM 5):

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input>
          <location/>
        </input>
        <output>
          <location/>
          <location/>
        </output>
        <macros/>
      </context>
      <operator activated="true" class="process" expanded="true" name="Process">
        <process expanded="true" height="476" width="681">
          <operator activated="true" class="generate_data" expanded="true" height="60" name="Generate Data" width="90" x="45" y="120">
            <parameter key="number_examples" value="5"/>
            <parameter key="number_of_attributes" value="1"/>
          </operator>
          <operator activated="true" class="rename" expanded="true" height="76" name="Rename (2)" width="90" x="179" y="120">
            <parameter key="old_name" value="att1"/>
            <parameter key="new_name" value="A1"/>
          </operator>
          <operator activated="true" class="generate_id" expanded="true" height="76" name="Generate ID" width="90" x="313" y="120"/>
          <operator activated="true" class="select_attributes" expanded="true" height="76" name="Select Attributes (2)" width="90" x="447" y="120">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="label"/>
            <parameter key="invert_selection" value="true"/>
            <parameter key="include_special_attributes" value="true"/>
          </operator>
          <operator activated="true" class="generate_data" expanded="true" height="60" name="Generate Data (2)" width="90" x="45" y="300">
            <parameter key="number_examples" value="3"/>
            <parameter key="number_of_attributes" value="1"/>
          </operator>
          <operator activated="true" class="rename" expanded="true" height="76" name="Rename" width="90" x="179" y="300">
            <parameter key="old_name" value="att1"/>
            <parameter key="new_name" value="A2"/>
          </operator>
          <operator activated="true" class="generate_id" expanded="true" height="76" name="Generate ID (2)" width="90" x="313" y="300"/>
          <operator activated="true" class="select_attributes" expanded="true" height="76" name="Select Attributes" width="90" x="447" y="300">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="label"/>
            <parameter key="invert_selection" value="true"/>
            <parameter key="include_special_attributes" value="true"/>
          </operator>
          <operator activated="true" class="cartesian_product" expanded="true" height="76" name="Cartesian" width="90" x="581" y="210"/>
          <connect from_op="Generate Data" from_port="output" to_op="Rename (2)" to_port="example set input"/>
          <connect from_op="Rename (2)" from_port="example set output" to_op="Generate ID" to_port="example set input"/>
          <connect from_op="Generate ID" from_port="example set output" to_op="Select Attributes (2)" to_port="example set input"/>
          <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Cartesian" to_port="left"/>
          <connect from_op="Generate Data (2)" from_port="output" to_op="Rename" to_port="example set input"/>
          <connect from_op="Rename" from_port="example set output" to_op="Generate ID (2)" to_port="example set input"/>
          <connect from_op="Generate ID (2)" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="Cartesian" to_port="right"/>
          <connect from_op="Cartesian" from_port="join" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="180"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    Hope that gives you the idea.

    Cheers,
    Ingo
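
For readers without RapidMiner at hand, the effect of the sample process can be sketched in plain Python. This is only an illustration under stated assumptions, not the operator's implementation: the row values and the helper name `cartesian_join` are made up, the id columns are left out for brevity, and the 5-row and 3-row sizes simply mirror the two Generate Data operators in the process above.

```python
from itertools import product


def cartesian_join(left, right):
    """Pair every row of `left` with every row of `right` (hypothetical helper)."""
    return [{**l, **r} for l, r in product(left, right)]


# Made-up stand-ins for the two generated example sets (5 and 3 examples).
left = [{"A1": v} for v in (0.1, 0.2, 0.3, 0.4, 0.5)]
right = [{"A2": v} for v in (10.0, 20.0, 30.0)]

joined = cartesian_join(left, right)
print(len(joined))  # 5 * 3 = 15 rows
print(joined[0])    # {'A1': 0.1, 'A2': 10.0}
```

Note that this combines the rows of two example sets, which is not the same thing as combining two attributes of one example set into a single new attribute.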
  • yuyu Member Posts: 3 Contributor I
    Hi Ingo and Sebastian, thanks for replying and for showing the Cartesian sample process!

    So, can I simply use a nominal attribute in place of the Cartesian product?

    E.g. can I simply use the nominal attribute A3 {11,12,13,14,21,22,23,24,31,32,33,34} to represent the Cartesian product A3 = A1 X A2, where A1 (level = 1, 2 or 3) and A2 (contribution = 1, 2, 3 or 4)?

    Or can I not, because the two approaches are not equivalent?
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    That depends on what you are going to do, I guess. You can do this, but you end up with one single attribute with a huge number of nominal values. Whether that's sensible depends on what you need the data for.

    Greetings,
      Sebastian
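
The combined nominal attribute discussed here can be sketched in a few lines of Python. This is an assumption-laden illustration rather than a RapidMiner recipe: the helper `combine` is hypothetical, and the value lists are taken from the A1/A2 example in the question.

```python
from itertools import product

levels = ["1", "2", "3"]               # A1: level
contributions = ["1", "2", "3", "4"]   # A2: contribution


def combine(a1, a2):
    """Concatenate two nominal values into one combined nominal value."""
    return f"{a1}{a2}"


a3_values = [combine(a1, a2) for a1, a2 in product(levels, contributions)]
print(a3_values)
# ['11', '12', '13', '14', '21', '22', '23', '24', '31', '32', '33', '34']
print(len(a3_values))  # 3 * 4 = 12
```

The number of combined values is the product of the two cardinalities, which is why it grows quickly for attributes with many values. With multi-digit values a separator (for example `"1_12"`) would be needed to keep `"1" + "12"` distinct from `"11" + "2"`.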
  • yuyu Member Posts: 3 Contributor I
    Hello, Sebastian. I am working on a classification test with a nominal label, using the NaiveBayes classifier to predict whether the final attribute is 0 or 1. The cross product A3 mentioned above is one important attribute for this prediction.

    Do you think using the nominal attribute A3 {11,12,13,14,21,22,23,24,31,32,33,34} is suitable?

    Will this set take a "space density feature" into account or not? I mean: on a 2D graph with A1 on the x axis and A2 on the y axis, if the final attribute turns out to be "1" more often at the point A3 {11}, will the algorithm consider that the neighboring points {12}, {22}, {21} are then also more likely to give a final attribute of "1"?

    14 24 34
    13 23 33
    12 22 32
    11 21 31

    If no such "space density feature" is considered when the cross product is constructed, then I think we could simply use A3 {11,12,13,14,21,22,23,24,31,32,33,34}.

    I hope this explains my question a bit more clearly.
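
On the "space density" question: a count-based Naive Bayes treats the values of a nominal attribute as unordered categories, so it has no notion of {12} being "next to" {11}. The sketch below illustrates this with a generic frequency estimate. It is not RapidMiner's implementation; the training examples are made up, and the function name `conditional_probs` and the smoothing parameter `alpha` are assumptions chosen for the illustration.

```python
from collections import Counter

# Made-up training examples: (A3 value, label).
examples = [("11", 1), ("11", 1), ("11", 1), ("11", 0),
            ("12", 0), ("21", 0), ("22", 0), ("22", 1)]


def conditional_probs(examples, label, values, alpha=1.0):
    """Estimate P(A3 = v | label) from counts, with Laplace smoothing `alpha`."""
    counts = Counter(v for v, y in examples if y == label)
    total = sum(counts.values())
    return {v: (counts[v] + alpha) / (total + alpha * len(values)) for v in values}


values = ["11", "12", "21", "22"]
p1 = conditional_probs(examples, 1, values)
print(p1)  # {'11': 0.5, '12': 0.125, '21': 0.125, '22': 0.25}
```

Although "11" is strongly associated with label 1 in this toy data, its neighbors "12" and "21" receive only the smoothing mass, exactly as any other unseen value would. If neighborhood is supposed to matter, it would have to be encoded explicitly, for example by keeping A1 and A2 as numerical attributes.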