How can I extract only the "year" from a Date column to a new column called "YEAR" ?

Gonfiaf_ZuraikGonfiaf_Zuraik Member Posts: 9 Contributor II

Hello dear people,

 

I am dealing with a data set that has a "Start date" and "End date" columns in it that has the format (Date/Time e.g. Jan 1, 2016 11:00 AM). I am interested in analyzing Yearly data, so I am interested in generating a new attribute called "Year", and to include in this attribute only the Year (e.g. 2016 only)

an example of the data:example.JPG

 

@sgenzer @mschmitz @stevefarr @Pavithra_Rao  guys I tagged you according to the instructions I passed through before posting my question, where it was recommending me to tag people asking for help. Thank you :) 

How can I do this? any help or advice will help me

 

Much obliged,

Jana

Best Answers

  • jczogallajczogalla Employee, Member Posts: 144 RM Engineering
    Solution Accepted

    Hi @Gonfiaf_Zuraik!
    You can use the Generate Attributes operator with your new attribute name and the expression

    date_get([Start date], DATE_UNIT_YEAR)

    as generation function. That should do the trick.

    Notice that "Start date" is in brackets, because the attribute name has a space in it.

     

    Cheers

    Jan

  • David_ADavid_A Administrator, Moderator, Employee, RMResearcher, Member Posts: 297 RM Research
    Solution Accepted

    Hi,

     

    you can use the date_get() method from the expression builder, in the  Date Calculation group. It can extract various units out of a date attribute (the units can be found below in Constants section). This can be done either within TurboPrep or with the Generate Attributes Operator in a process. In your case the expression would look like that:

     

    • date_get([ChurnDate],DATE_UNIT_YEAR)

    Best,

    David

  • IngoRMIngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Solution Accepted

    Hi,

     

    It looks like you are using Turbo Prep to work on the data as well.  There, you can also simply go to "Transform", "Change Type", "to number", and select "year" in the parameters.  Press "Apply" and the column will be changed to only contain the year.  If you need the original data column as well, just make a "copy" of the column (also in "Transform") before you change the type.

     

    Hope this helps,

    Ingo

Answers

  • Gonfiaf_ZuraikGonfiaf_Zuraik Member Posts: 9 Contributor II

    Oh thank you my dear @jczogalla

    it has just worked out ! :D , I just wanted to add a small thing (just in case in the future someone passes through this question)

    that just need to add a right parenthesis after DATE_UNIT_YEAR


    date_get([Start date], DATE_UNIT_YEAR)

     

    Thank you so much for your quick response Im grateful

     

    Cheers :)

    Jana

  • jczogallajczogalla Employee, Member Posts: 144 RM Engineering

    Glad I could help. And thanks for pointing out the missing parenthesis! Edited my answer. :)

  • Gonfiaf_ZuraikGonfiaf_Zuraik Member Posts: 9 Contributor II

    Dearest @David_A

    Thank you so much! 

     

    Thank you all guys, im flooded with you kindness and helpfulness ! @David_A @jczogalla

     

    Please excuse my new experience and thank you for understanding 

     

    Bests always!

    Jana

Sign In or Register to comment.