"For each XLS row, calculate similarity among the 3 text cells in that row"

Hi everyone,

I would appreciate it if you could share any thoughts on how I could solve the problem below:

INPUT: An Excel file with multiple rows and 3 columns (say, columns A, B and C). All Excel content is text.

PROBLEM: For each row, calculate similarity among the 3 text cells in that row. Then save the calculated similarities


If Sim(x,y) is the text similarity between any two cells 'x' and 'y' in the Excel file, an ideal output would be another Excel file that follows the format below:

Sim(A1,B1) Sim(A1,C1) Sim(B1,C1)
Sim(A2,B2) Sim(A2,C2) Sim(B2,C2)
Sim(A3,B3) Sim(A3,C3) Sim(B3,C3)
Sim(A4,B4) Sim(A4,C4) Sim(B4,C4)
Sim(A5,B5) Sim(A5,C5) Sim(B5,C5)
Sim(An,Bn) Sim(An,Cn) Sim(Bn,Cn)
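
To pin down the output format above, here is a minimal Python sketch (outside RapidMiner). It is only an illustration: the post does not say which similarity measure is wanted, so the standard library's difflib ratio is used as a stand-in, and the names sim, row_similarities, similarities_csv and the column headers Sim_AB, Sim_AC, Sim_BC are all hypothetical. The rows are hard-coded here; reading them from the actual Excel file is left out.

```python
import csv
import io
from difflib import SequenceMatcher
from itertools import combinations


def sim(x: str, y: str) -> float:
    """Stand-in similarity in [0, 1] (assumption: any other measure could be swapped in)."""
    return SequenceMatcher(None, x, y).ratio()


def row_similarities(row):
    """Return [Sim(A,B), Sim(A,C), Sim(B,C)] for one row of three text cells."""
    # combinations() yields the pairs in exactly this order: (A,B), (A,C), (B,C)
    return [sim(x, y) for x, y in combinations(row, 2)]


def similarities_csv(rows):
    """Build the output table as CSV text, one line per input row."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["Sim_AB", "Sim_AC", "Sim_BC"])
    for row in rows:
        writer.writerow([f"{v:.3f}" for v in row_similarities(row)])
    return buf.getvalue()


# Made-up sample rows standing in for columns A, B and C
rows = [
    ("red apple", "red apple", "green pear"),
    ("data mining", "text mining", "data mining"),
]
print(similarities_csv(rows))
```

The CSV text can be written to a file and opened in Excel, which gives one row of three similarities per input row.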

I've watched a number of RapidMiner videos to learn how to do this but haven't succeeded yet.

Any ideas? Since I am still learning the basics, I would appreciate it if you could tell me what the entire process looks like.

Thank you in advance


    The operator you might need is Cross Distances. It calculates similarity, but usually between documents that are given as examples. So I think you need to use a Loop and a Transpose (or De-Pivot?) operator to get a vertical example set for each round.

    If you could post an example set, I or another helper might find time to build an example process.
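
For anyone who wants to see the logic of that process before building it in RapidMiner, here is a rough Python sketch of the same idea: loop over the rows, treat the three cells of each row as three small documents, and compare every pair. It assumes cosine similarity over plain word counts as the measure, which may not match whatever measure is configured in the actual process, and the function names tokens, cosine and process are made up for this example.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lower-case word counts for one cell (a very simple bag-of-words)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(x, y):
    """Cosine similarity between the word-count vectors of two texts."""
    a, b = tokens(x), tokens(y)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    # Empty cells have no words, so define their similarity as 0.0
    return dot / norm if norm else 0.0


def process(rows):
    """One loop round per row; each round compares that row's three cells."""
    out = []
    for a, b, c in rows:
        docs = [a, b, c]  # the row turned "vertical": one document per cell
        out.append((cosine(docs[0], docs[1]),
                    cosine(docs[0], docs[2]),
                    cosine(docs[1], docs[2])))
    return out


# Made-up sample rows standing in for columns A, B and C
rows = [
    ("red apple", "red pear", "green pear"),
    ("text mining", "text mining", ""),
]
table = process(rows)
```

Each tuple in table holds Sim(A,B), Sim(A,C) and Sim(B,C) for one row, in that order.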

