Use attribute values as attribute names

achint_kumarachint_kumar Member Posts: 16 Contributor II
Hi All,

I am trying to automate one of the processes. Just like we can use macros as attribute names, is it possible to use attribute values as attribute names? Any expression for that?
For example: I am creating few Quarter & year concatenations dynamically based on dates for each row.
These are like:                  Q1_Y1, Q2_Y2,...................Qn_Yn
The values of these are:  Q3_19, Q4_19.......................etc

Now, these values Q3_19, Q4_19 stored in Q1_Y1 Q2_Y2 are actually column names from the example dataset which i need to use for doing some filters dynamically. 

Is it possible to use Q1_Y1 value as an attribute name using some expression? I tried using attribute(Q1_Y1) but it doesnt workout.
Please help.

Answers

  • lionelderkrikorlionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Hi @achint_kumar,

    I think you could use Rename by Example Values operator...

    Hope this helps,

    Regards,

    Lionel
  • achint_kumarachint_kumar Member Posts: 16 Contributor II
    edited May 2020
    Hi, Lionel,

    Thanks for your response. :)

    No, the above doesn't solve my purpose. The above operator we use for setting any row as column, but i need to pick the values from Qn_Yn variables and use that as Variable name for multiple different patientID (different rows).

    The way we use macros as attributes, i neeed to use the attribute (Qn_Yn) values as attribute name since i already have Q1_19 etc columns in the datasets which i need to use on the basis of these values in variable (Qn_Yn). 

    I hope i'm not vague in my explanation.


  • MartinLiebigMartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist
    Hi,
    can you maybe provide 2 excel files with dummy examples? I think I am lost here.

    Best,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • achint_kumarachint_kumar Member Posts: 16 Contributor II
    edited May 2020
    @mschmitz : Hi Thanks for your reply, sure please find below information:

    Table 1: Contains 1, 0 as values for Quarter_Year variables. This i am fetching from the DB



    In the same dataset, i have created new attributes which contains the actual Quarter_Year i need to use for my analysis. These i have created dynamically based on a date:


    Based on the above values of variable , i need to use these values as variables so that i can fetch the 0.1 related to the respective Quarter_Year.

    Hope this clarifies.

    Thank you.




  • sgenzersgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager
    hi @achint_kumar hmm I have spent a few minutes trying to recreate what you describe here...I am still puzzled. The part I'm missing is this variable. I think (?) you have an additional column in Table 1 that allows you to reference a row? Otherwise if you want to pull values from Table 1, I don't know which one you want.

    If you want to grab attribute names as macros, you can do this in a couple of ways:

    1. You can use Loop Attributes. There is a parameter that sets a macro to the name of the attribute called "loop_attribute" for each loop iteration. You can do whatever you want (like filtering) inside.

    2. You can transpose the ExampleSet so the attribute names are now a column, and use Extract Macro for whichever ones you want.

    Does this help? I rebuilt your examples but could not get any further...

    <?xml version="1.0" encoding="UTF-8"?><process version="9.6.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.6.000" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="utility:create_exampleset" compatibility="9.6.000" expanded="true" height="68" name="Create ExampleSet" width="90" x="648" y="34">
            <parameter key="generator_type" value="comma separated text"/>
            <parameter key="number_of_examples" value="100"/>
            <parameter key="use_stepsize" value="false"/>
            <list key="function_descriptions"/>
            <parameter key="add_id_attribute" value="false"/>
            <list key="numeric_series_configuration"/>
            <list key="date_series_configuration"/>
            <list key="date_series_configuration (interval)"/>
            <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="input_csv_text" value="Q1_19,Q2_19,Q3_19,Q4_19,Q1_20,Q2_20,Q3_20,variable&#10;1,1,1,1,1,0,0,a&#10;1,1,1,1,1,0,0,b&#10;1,1,1,1,1,0,0,c&#10;1,1,1,1,1,0,0,d&#10;1,1,1,1,1,0,0,e&#10;1,1,1,1,1,0,0,f&#10;1,0,1,1,0,0,0,g&#10;1,1,1,1,1,0,0,h&#10;1,1,1,1,1,0,0,i&#10;1,1,1,1,1,0,0,j&#10;1,1,1,1,1,0,0,k"/>
            <parameter key="column_separator" value=","/>
            <parameter key="parse_all_as_nominal" value="false"/>
            <parameter key="decimal_point_character" value="."/>
            <parameter key="trim_attribute_names" value="true"/>
          </operator>
          <operator activated="true" class="utility:create_exampleset" compatibility="9.6.000" expanded="true" height="68" name="Create ExampleSet (2)" width="90" x="648" y="136">
            <parameter key="generator_type" value="comma separated text"/>
            <parameter key="number_of_examples" value="100"/>
            <parameter key="use_stepsize" value="false"/>
            <list key="function_descriptions"/>
            <parameter key="add_id_attribute" value="false"/>
            <list key="numeric_series_configuration"/>
            <list key="date_series_configuration"/>
            <list key="date_series_configuration (interval)"/>
            <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="input_csv_text" value="Q1S_Y1S,Q2S_Y2S,Q3S_Y3S,Q4S_Y4S&#10;Q4_17,Q3_17,Q2_17,Q1_17&#10;Q4_17,Q3_17,Q2_17,Q1_17&#10;Q4_17,Q3_17,Q2_17,Q1_17&#10;Q4_17,Q3_17,Q2_17,Q1_17&#10;Q1_18,Q4_17,Q3_17,Q2_17&#10;Q4_17,Q3_17,Q2_17,Q1_17&#10;Q1_18,Q4_17,Q3_17,Q2_17&#10;Q4_17,Q3_17,Q2_17,Q1_17&#10;Q4_17,Q3_17,Q2_17,Q1_17"/>
            <parameter key="column_separator" value=","/>
            <parameter key="parse_all_as_nominal" value="false"/>
            <parameter key="decimal_point_character" value="."/>
            <parameter key="trim_attribute_names" value="true"/>
          </operator>
          <connect from_op="Create ExampleSet" from_port="output" to_port="result 1"/>
          <connect from_op="Create ExampleSet (2)" from_port="output" to_port="result 2"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
        </process>
      </operator>
    </process>
    




    Scott
  • achint_kumarachint_kumar Member Posts: 16 Contributor II
    edited May 2020
    Hi @sgenzer Thanks for taking out time on this.

    The same data set has these columns from the two tables screenshots as attached. Individual rows are identified using patient ID as a column. The columns having values as 0,1 are used to filter-out few rows(records) as a business requirement. 



    We have a date using which we are creating Lookback quarters_Year and lookforward Quarter year.
    Suppose we have:
    Date: 01/04/2020 

    We create:
    Lookback 4 quarters: Q1_20, Q4_19,Q3_19,Q2_19.
    Lookforward 4 quarters: Q3_20,Q4_20,Q1_21,Q2_21

    The above 4 quarters each are stored in Generated attributes Q1_YY1, Q2_YY2, Q3_YY3, Q4_YY4, etc using a script for creating these timely quarters with Year(Q1_19). 



    The intention is to do the business requirement calculation using Q1_YY1, Q2_YY2, Q3_YY3, Q4_YY4 variables as static. The values of these variables will be different for each patientsIDs per Date and we don't require to do any changes.



    Now we are filtering out few records based on number of 1's in the columns already there i, dataset with 1,0 values (as below).
    I need to pick the value from Q1_YY1, Q2_YY2, Q3_YY3, Q4_YY4 as Q1_20, Q4_19,Q3_19,Q2_19 and read these values as attribute names using a script/expressionGenerate attribute so that i can do the mathematical filter using 1,0 values from the required quarter.

    If this is not possible, is there another way to automate all this just using Date attribute? Looking forward to hearing from you. Thanks again for all your time. :)
      
  • achint_kumarachint_kumar Member Posts: 16 Contributor II
    Do we have a solution for the above case? Thank you.
Sign In or Register to comment.