Levenshtein Distance

tmyers Member Posts: 21 Contributor I
edited December 2018 in Help

So I have this file in my old version of RapidMiner (5.3):

 

C:\Program Files (x86)\Rapid-I\RapidMiner5\src\com\rapidminer\tools\math\similarity\nominal\LevenshteinDistance.java

 

When I open RapidMiner, there is no such operator or extension available, even though it exists in the RapidMiner5\ directory.

 

How do I access this package and/or operator to use Levenshtein Distance?

 

Thanks in advance for any help,

 

Tim

 

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    I'd ping @mschmitz here because he had a hand in creating that Java class. I know it was developed for v7+ and it's probably not compatible with v5.3.

  • tmyers Member Posts: 21 Contributor I

    Thomas, no, no that's the thing, IT IS THERE in my 5.3 install by default (i.e. I didn't add it). I saw the new one you're referencing and that's not the one I'm asking about.

     

    Here is an old github link to it if it is helpful:

    https://github.com/rapidminer/rapidminer-5/blob/master/src/com/rapidminer/tools/math/similarity/nominal/LevenshteinDistance.java

     

    Since it is there, I'm simply trying to figure out how to use it.

     

     

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,107 RM Data Scientist

    Dear tmyers,

     

    As far as I know, this can only be used in an Execute Script operator.

     

    Best,

    Martin

    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • Edin_Klapic Moderator, Employee, RMResearcher, Member Posts: 266 RM Data Scientist

    Hi tmyers,

     

    I don't know if this really helps with your problem, but there is now a RapidMiner extension available called "Operator Toolbox". It contains an operator named "Generate Levenshtein Distance". Perhaps this helps in your use case...

     

    Best,

    Edin

  • tmyers Member Posts: 21 Contributor I

    Thanks Martin. I've never used the Execute Script operator so I'm unsure how to do this. I just tried it by copying this text from the LevenshteinDistance.java file into the Edit Text parameter of the operator:

    package com.rapidminer.tools.math.similarity.nominal;

    /**
     * This calculates the levenshtein distance of two strings. This is not a valid distance measure.
     *
     * TODO: Extend this to become a valid distance measure
     *
     * @author Sebastian Land
     */
    public class LevenshteinDistance {

        public static int getDistance(String value1, String value2, int substitutionCost) {
            byte[] s = value1.getBytes();
            byte[] t = value2.getBytes();
            int n = s.length + 1;
            int m = t.length + 1;
            int[][] d = new int[n][m];

            for (int i = 0; i < n; i++)
                d[i][0] = i;
            for (int j = 0; j < m; j++)
                d[0][j] = j;
            for (int i = 1; i < n; i++) {
                for (int j = 1; j < m; j++) {
                    int cost = (s[i - 1] == t[j - 1]) ? 0 : substitutionCost;
                    d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1), d[i - 1][j - 1] + cost);
                }
            }
            return d[n - 1][m - 1];
        }
    }

    When I tried running it I got the following error message:

     

    [Image: screenshot of the error message (image.png)]

     

    Am I trying this the right way? When I look at the code I presume I have to define attributes to use for "String value1" & "String value2".......?
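    For context on the question above: a class declaration by itself does not run anything, so some caller has to pass two strings and a substitution cost to getDistance. Below is a minimal sketch in plain Java, outside RapidMiner, with the method body copied from the file. The class name LevenshteinSketch, the main method, and the sample strings are assumptions for illustration only; how Execute Script would supply attribute values for value1 and value2 is not shown here.

```java
// Standalone sketch: getDistance is copied from the pasted LevenshteinDistance class.
// LevenshteinSketch and main are hypothetical names used only for this illustration.
class LevenshteinSketch {

    static int getDistance(String value1, String value2, int substitutionCost) {
        // Note: the original compares bytes (platform default charset), not chars.
        byte[] s = value1.getBytes();
        byte[] t = value2.getBytes();
        int n = s.length + 1;
        int m = t.length + 1;
        int[][] d = new int[n][m];

        // First column and first row: cost of deleting or inserting every byte.
        for (int i = 0; i < n; i++) {
            d[i][0] = i;
        }
        for (int j = 0; j < m; j++) {
            d[0][j] = j;
        }
        // Each cell takes the cheapest of delete, insert, or substitute/match.
        for (int i = 1; i < n; i++) {
            for (int j = 1; j < m; j++) {
                int cost = (s[i - 1] == t[j - 1]) ? 0 : substitutionCost;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1), d[i - 1][j - 1] + cost);
            }
        }
        return d[n - 1][m - 1];
    }

    public static void main(String[] args) {
        // The caller supplies both strings; here they are hard-coded samples.
        System.out.println(getDistance("kitten", "sitting", 1)); // prints 3
        System.out.println(getDistance("kitten", "sitting", 2)); // prints 5
    }
}
```

    With a substitution cost of 1 this is the usual Levenshtein distance. With a cost of 2, a substitution costs the same as a delete plus an insert, which is why the second call returns a larger value.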

     

    Any advice you can offer on this Execute Script operator would be much appreciated!

     

    Thanks,

     

    Tim

  • tmyers Member Posts: 21 Contributor I

    Hi Edin. Yes, I am aware of the Operator Toolbox. Do you know if it works with RapidMiner version 7.1? I believe the marketplace indicates 7.3+ in terms of its compatibility.

     

    Thanks,

     

    Tim

     

     

  • tftemme Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, RMResearcher, Member Posts: 103 RM Research

    Hi Tim,

     

    Unfortunately, the Operator Toolbox extension is not compatible with 7.1. It relies on API changes that were introduced in 7.3.

     

    Would it be possible for you to update to 7.3?

     

    Best regards,

    Fabian
