Group/Rename Examples

t_liebe Member Posts: 14 Contributor I
edited December 1 in Help

Hello,

 

I am still preparing my data and would like to rename examples that belong to the same group within one column:

1  Audi, A6   677
2  March      140
3  2018        70
4  Dezember    51
5  Audi, A2     9
6  2016         9
7  BMW, 3er     7
8  BMW, X5      1

 

Later:

1  Audi    677
2  Month   140
3  Year     70
4  Month    51
5  Audi      9
6  Year      9
7  BMW       7
8  BMW       1

 

I know that this can probably be done with regular expressions, and in a different way for date types, but I couldn't figure it out.

 

Thanks for your help!

Best Answer

  • Telcontar120 Posts: 883   Unicorn
    Solution Accepted

    If you can write a set of logical rules (IF/THEN/AND/OR/NOR) to express the conditions under which these substitutions should occur, then you can accomplish it with "Generate Attributes".  For example, you could use the "contains" function for string searching and then supply supplemental conditions.

    However, if you can't express the renaming substitutions as a set of rules, and the format of the cell content is not consistent enough to use Replace with a regex as already suggested, then I am not sure how you would expect a computer program to carry out what you want. You might need to use Map and do it "manually" instead.
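
    To make the rule-based idea concrete, here is a minimal sketch in Python rather than in RapidMiner's expression language. It mirrors the "contains" checks described above; the function name, the brand list, and the month list are assumptions made for illustration, based only on the sample values in the question.

```python
import re

# Hypothetical rule set, derived from the sample values in the question.
BRANDS = ("Audi", "BMW")
MONTHS = {
    "january", "february", "march", "april", "may", "june", "july",
    "august", "september", "october", "november", "december",
    # German spellings, since the sample data contains "Dezember"
    "januar", "februar", "märz", "mai", "juni", "juli", "oktober", "dezember",
}

def rename(value: str) -> str:
    """Map a raw cell value to its group name using IF/THEN style rules."""
    # Rule 1: the value contains a known brand anywhere in the text.
    for brand in BRANDS:
        if brand in value:
            return brand
    stripped = value.strip()
    # Rule 2: the value is exactly four digits, so treat it as a year.
    if re.fullmatch(r"\d{4}", stripped):
        return "Year"
    # Rule 3: the value is a month name (case-insensitive).
    if stripped.lower() in MONTHS:
        return "Month"
    # No rule matched: leave the value unchanged.
    return value

for raw in ["Audi, A6", "xxxx xxxx Audi xxx", "2018", "Dezember", "BMW, X5"]:
    print(raw, "->", rename(raw))
```

    Values that match none of the rules pass through unchanged, which makes it easy to spot the cases that still need a "manual" mapping.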

     

     

     

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts

Answers

  • rfuentealba Moderator, RapidMiner Certified Analyst, Member Posts: 215   Unicorn

    Hi, @t_liebe

     

    This is easy:

     

    Use the Replace operator and put the following on the Parameters panel:

     

    • attribute_filter_type: single.
    • attribute: the column name you want to consider.
    • replace what: (.*),(.*)
    • replace by: $1

    Basically, the pattern captures two groups, (.*) and (.*), separated by a comma, and keeps only the first one ($1) as the new value.
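
    The effect of this pattern can be tried outside RapidMiner. The sketch below uses Python's re module, which is an assumption made for illustration: RapidMiner uses Java-style regular expressions, where the first group is written $1, while Python writes it as \1. The function name is hypothetical.

```python
import re

# Same pattern as in the Replace operator parameters above.
PATTERN = re.compile(r"(.*),(.*)")

def keep_before_comma(value: str) -> str:
    """Replace 'left,right' with 'left'; values without a comma stay unchanged."""
    return PATTERN.sub(r"\1", value)

for raw in ["Audi, A6", "BMW, X5", "March", "Audixxx, xxxx, xxxx"]:
    print(repr(raw), "->", repr(keep_before_comma(raw)))
```

    Note that the first (.*) is greedy, so a value with several commas keeps everything before the last comma. A pattern such as ([^,]*),.* would keep only the text before the first comma.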

     

    All the best,

     

    Rodrigo.

  • t_liebe Member Posts: 14 Contributor I

    Yes, I tried that as well, but it didn't fit my problem. I think I didn't give you enough information, sorry for that.

    This is what it looks like:

    xxxx xxxx Audi xxx

    Audixxx, xxxx, xxxx

    Audixxx

    xxxAudi

     

    So "Audi" is just one part of the example, and I am not interested in the other parts.

     

    I hope this explains a little bit more.
