My goal is to compare the similarity of 100 DNA strings using string-matching algorithms such as the Smith-Waterman algorithm. RapidMiner only offers the Levenshtein distance. Which operators can I use to compare all 100 strings with each other and then put the results into a matrix, laid out like a confusion matrix?
I think so far we only have Levenshtein for string distance. If your strings are of equal length, you could use Split + Cross Distance to compute some other metrics.
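To make the equal-length idea concrete: splitting each string into one column per character and then comparing rows amounts to a position-by-position comparison, i.e. a Hamming-style distance. Here is a minimal Python sketch of that outside RapidMiner. The function names `hamming` and `pairwise_matrix` are made up for illustration and are not RapidMiner operators.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def pairwise_matrix(seqs):
    """Build a symmetric n x n matrix of Hamming distances."""
    n = len(seqs)
    matrix = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        # Distance is symmetric, so fill both halves at once.
        matrix[i][j] = matrix[j][i] = hamming(seqs[i], seqs[j])
    return matrix


if __name__ == "__main__":
    for row in pairwise_matrix(["ACGT", "ACGA", "TCGA"]):
        print(row)
```

This only works when all strings have the same length; it does not handle insertions or deletions the way Levenshtein or Smith-Waterman do.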
Which string distances would you like to see added?
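In the meantime, if a Smith-Waterman score is what you need, here is a rough Python sketch that computes the local alignment score for every pair of strings and collects the results in a square matrix. The scoring values (match 2, mismatch -1, linear gap -2) and the function names are assumptions chosen for illustration, not anything built into RapidMiner. If you have the Python Scripting extension installed, something along these lines could probably be adapted for an Execute Python operator, but I have not tested that.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score (linear gap penalty).

    Only the score is computed, not the alignment itself, so two
    rows of the dynamic-programming table are enough.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def score_matrix(seqs):
    """Symmetric n x n matrix of pairwise Smith-Waterman scores."""
    n = len(seqs)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            matrix[i][j] = matrix[j][i] = smith_waterman(seqs[i], seqs[j])
    return matrix


if __name__ == "__main__":
    dna = ["ACGT", "ACGT", "TTTT"]  # toy data, not real sequences
    for row in score_matrix(dna):
        print(row)
```

Note that this is a similarity score (higher means more similar), not a distance, and the diagonal holds each string's score against itself. For 100 strings that is 5,050 pairs, which plain Python should handle fine for short sequences.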