"How to calculate the distance between two text documents?"

gfyang Member Posts: 29 Maven
edited June 2019 in Help
Hi,

Suppose there are two text documents, d1 and d2. I can build two vectors and read them through an Iterator<Example>. How can I then calculate the distance or similarity between them, for example the cosine distance? Does RM provide an operator or function for this?

Thank you very much.

Sincerely yours,
gfyang
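
For reference, the cosine distance mentioned in the question can be sketched in plain Java as below. This is only an illustration under stated assumptions, not RapidMiner code: the class CosineSketch and its methods are hypothetical names, documents are tokenized by whitespace, and the vectors are raw term counts with no weighting.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch; CosineSketch is a hypothetical name, not a RapidMiner class.
class CosineSketch {

    // Builds a term-count vector from lower-cased, whitespace-separated tokens.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Euclidean length of a term-count vector.
    static double norm(Map<String, Integer> v) {
        double sum = 0.0;
        for (int count : v.values()) {
            sum += (double) count * count;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity: dot(a, b) / (|a| * |b|); defined here as 0 if either vector is empty.
    static double cosineSimilarity(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += (double) e.getValue() * b.getOrDefault(e.getKey(), 0);
        }
        double normA = norm(a);
        double normB = norm(b);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (normA * normB);
    }

    // Cosine distance taken as 1 - cosine similarity.
    static double cosineDistance(Map<String, Integer> a, Map<String, Integer> b) {
        return 1.0 - cosineSimilarity(a, b);
    }
}
```

With this sketch, two documents that share no terms get a distance of 1, and two documents with identical term counts get a distance of (approximately) 0.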

Answers

  • fischer Member Posts: 439 Maven
    Hi,

    Yes. Please look into com.rapidminer.tools.math.similarity.DistanceMeasure.

    Cheers,
    Simon
  • gfyang Member Posts: 29 Maven
    Hi,

    Thanks a lot for the reply. However, it is still not clear enough to me. Could you please give some Java code?
    I tried the following, but it failed:

    ExampleSet ex=...
    Example ex1 = ex.getExample(1);
    Example ex2 = ex.getExample(2);
    DistanceMeasure myDis = new DistanceMeasure();
    double dis = myDis.calculateDistance(ex1, ex2);
    It reported that DistanceMeasure could not be instantiated.

    Thank you.

    Sincerely yours,
    gfyang
  • fischer Member Posts: 439 Maven
    Hi,

    DistanceMeasure is abstract. You can only instantiate its subclasses.

    Also, if you are using a distance measure in an operator, try installing a DistanceMeasureHelper.

    Cheers,
    Simon
  • gfyang Member Posts: 29 Maven
    Hi,

    I see. The subclasses work well. Thank you.

    Sincerely yours,
    gfyang
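
The point resolved in this thread, that an abstract class cannot be instantiated but its concrete subclasses can, can be illustrated with a small self-contained sketch. All names below (SketchMeasure, SketchEuclidean, SketchManhattan, SubclassDemo) are hypothetical stand-ins working on plain double arrays; they are not the RapidMiner classes, whose available subclasses and method signatures should be checked in the Javadoc of the installed version.

```java
// Hypothetical stand-in for an abstract distance measure; not the RapidMiner class.
abstract class SketchMeasure {
    // Each concrete subclass supplies its own distance; vectors are assumed to have equal length.
    abstract double calculateDistance(double[] a, double[] b);
}

// Concrete subclass: Euclidean distance.
class SketchEuclidean extends SketchMeasure {
    @Override
    double calculateDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}

// Concrete subclass: Manhattan (city-block) distance.
class SketchManhattan extends SketchMeasure {
    @Override
    double calculateDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }
}

class SubclassDemo {
    // Callers depend only on the abstract type; the concrete subclass is chosen at construction.
    static double distance(SketchMeasure measure, double[] a, double[] b) {
        return measure.calculateDistance(a, b);
    }

    public static void main(String[] args) {
        double[] x = {0.0, 0.0};
        double[] y = {3.0, 4.0};
        // new SketchMeasure() would not compile, because the class is abstract.
        System.out.println(distance(new SketchEuclidean(), x, y));
        System.out.println(distance(new SketchManhattan(), x, y));
    }
}
```

Writing `new SketchMeasure()` in this sketch produces the same kind of compile-time error as the one reported above for DistanceMeasure.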