"Split string with n characters into n columns with each cell with only one character"
komal_chenthama
Member Posts: 3 Learner III
I have around 700 rows with strings of varying length from 50 to 2000. They look like this:
MRILTWAITLLSLACFSLTEKYCYYPNGQIAVSDSPCNPNADDSACCDGDKGMMCMSNNLCRGPGGTTVRSSCTDKSWDSTACAALCMTENTVPADLTSCANVTGSDTTYCCDNHRVPCCDASIARFDVLPSKPQIFAIWDDSASAYLSINLPGTATTTATTTTSSSPAYPTDPPPSNTQPSSTPSNPPSPDAASAAALSLAVQAGIGVGAAVLALALAVVVYLVVKLRRNKNAVLAAGQRGQAGAVHGQYQGGVGVGGYDGWENKHMDKNGGGVGNGGGGAAAWYHPPAYGEPYHGGSGFGVVPRQELDAWPSVGYGQPRRQRHRQSHGQGYVQRFELPATPLGAPRRAF |
MKTPLIFLLHLGLLQTCLGKKCYYPGGEEAPGDLPCDTEAEHSPCCAGGKIAGACLANKLCLAKGNPDWYARGSCTDPTFEAPECPKFCLSHEGRGWNLDYCFSQTGSETAFCCEGDANCCAAGRLEIQPAPTHVWALWNGAVSRYDVVTPLGTAKETSAPTSSATSSGTTSDAVEHSSTETTSASTTGTAAGGDRSDATGSANSNSNANSNESTGLSTGAQAGIGVGAAAGALLLAAVAFLWWRMNRMQKAMLVAQQQAAAAYPPPETPAYYSRTPAEKHELMAERPTHELAGQHYYVQGDTRSAELSSQPAYTPVESPAAGRNYGP |
MRSVYIALAAALCWTGTLSASPAGAKDDVEVAMMAGRRRLTRTSGRYRSEFAALGARQGDQQCGAQFGRCPGDLCCSSYGFCGDSVDHCHPLFDCQTQYGTCGWPRAVPTTSARPTTSSTPAPPTTTTPSSTSVRPPTTSTSVTIPVPSGGLEVTQNGMCGNNTMCIGNPNYGPCCSQFFWCGSSIEFCGAGCQSDFGACLGIPGQPGNPITNGTTTSGGGSGPTSSPPTTRPTSTRVSTTTTTTTSSRTTSSSPSVTLPAGQTSSTDGRCGNNVNCLGSRFGRCCSQFGYCGDGDQYCPYIVGCQPQFGYCDPQ |
I would like to split each string into n columns (where n is the length of the string in that cell), so that each cell contains exactly one character. This should be done for all rows. Each letter is then to be replaced by a specific decimal score. How can this be achieved? Please help.
Answers
Hi @komal_chenthama
It would not be too complicated using a Loop operator with Generate Macro and Generate Attributes inside it: the macro would simply count the loop iterations, and each new attribute would take a substring of length 1 at the position equal to the current iteration number.
But the question is whether all strings could be padded to equal length beforehand, using a dummy or special character. Otherwise each example would generate a different number of attributes (equal to its string length), and you may end up with an error. Honestly, I am afraid I cannot come up with a very quick way to do that padding in RapidMiner, at least at the moment.
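To make the idea concrete outside RapidMiner, here is a minimal Python sketch of the pad-then-loop approach described above. It is only an illustration, not a RapidMiner process: the filler character `PAD` and the function name `split_padded` are assumptions made up for this example, and the filler is assumed not to occur in the real data.

```python
PAD = "-"  # hypothetical filler character, assumed absent from the real strings


def split_padded(strings, pad=PAD):
    """Pad every string to the longest length, then emit one column per position."""
    width = max((len(s) for s in strings), default=0)
    rows = []
    for s in strings:
        padded = s.ljust(width, pad)  # right-pad so every row has the same length
        # The position index i plays the role of the loop-counter macro:
        # column i holds the substring of length 1 at position i.
        rows.append([padded[i] for i in range(width)])
    return rows


rows = split_padded(["MRIL", "MK", "MRSVYI"])
for row in rows:
    print(row)
```

Because every row is padded to the same width, each row yields the same number of columns, which avoids the mismatch in attribute counts mentioned above.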
Vladimir
http://whatthefraud.wtf
Hi @komal_chenthama,
I like this kind of problem!
The trick here is to insert a "-" between every pair of adjacent letters (in other words, replace each empty position between characters with "-") and
then use the Split operator with "-" as the split pattern:
Does this process meet your need?
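For readers without the process at hand, the same trick can be sketched in Python. This is a rough illustration under stated assumptions, not the RapidMiner process itself: the function names are invented for the example, and the values in `SCORES` are placeholders, since the thread does not say which score belongs to which letter.

```python
import re

# Hypothetical placeholder scores; the real letter-to-score table is not given.
SCORES = {"M": 0.1, "R": 0.2, "K": 0.3}


def split_by_delimiter(s, delim="-"):
    """Insert delim between adjacent characters, then split on it."""
    if not s:
        return []
    # Zero-width match: any position with a character before and after it.
    delimited = re.sub(r"(?<=.)(?=.)", delim, s)
    return delimited.split(delim)


def to_scores(chars, scores=SCORES):
    """Replace each letter by its score; letters without a score become None."""
    return [scores.get(c) for c in chars]


chars = split_by_delimiter("MRK")
print(chars)
print(to_scores(chars))
```

The delimiter must be a character that never appears in the strings themselves, otherwise the split would produce extra empty cells.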
Regards,
Lionel