SimilarityMeasure initialization problem

siamak_want Member Posts: 98 Contributor II
edited November 2018 in Help
Hi all,

I have a dataset with an id column of type "text" named "att24754".
I have a problem initializing the cosine similarity measure. I call the following code:

clusteredSet = clusteredSetInput.getData(ExampleSet.class);
similarityMeasure = new CosineSimilarity();
similarityMeasure.init(clusteredSet);
The problem comes up when calling the init method:

"The example set contains non-numerical attribute att24754 which is not allowed for value based similarities."

The strange thing is that "att24754" has been set as the "id" of my example set, so I expect it to be left out of the similarity calculation as a special attribute. Instead it is included in the similarity computation, and so RM generates the error. I know all of the attributes should be numerical except my id. Does anyone know what's going wrong here?

Any idea would be greatly appreciated.

Thanks.

Answers

  • Skirzynski Member Posts: 164 Maven
    Indeed, this should not happen. Please make sure that "att24754" really is a special attribute. If your example set is displayed in the result view, take a look at the "Meta Data View". Any special attribute is displayed there with a red background.

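    You can also check the attributes in your source code. The init method iterates over the Attributes object of the example set, and that iteration returns only regular attributes. If the following loop prints "att24754", the attribute is still regular and the id role has not been applied to this example set: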

    for (Attribute attribute : exampleSet.getAttributes()) {
        System.out.println(attribute.getName());
    }
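To make the reasoning above concrete, here is a minimal, self-contained sketch of the kind of check that produces the error. It does not use the RapidMiner classes: `SketchAttribute` and `RegularAttributeCheck` are hypothetical stand-ins invented for illustration, and the real `init` method may differ in detail. The sketch only assumes what the answer states, namely that the check walks the regular attributes and rejects any that are not numerical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an attribute; NOT a RapidMiner class.
final class SketchAttribute {
    final String name;
    final boolean numerical;
    final String specialRole; // null means "regular"; e.g. "id" for an ID attribute

    SketchAttribute(String name, boolean numerical, String specialRole) {
        this.name = name;
        this.numerical = numerical;
        this.specialRole = specialRole;
    }

    boolean isSpecial() {
        return specialRole != null;
    }
}

// Hypothetical model of the precondition described in the answer.
final class RegularAttributeCheck {
    // Returns the names of all regular (non-special) attributes that are not numerical.
    // An empty result means a value-based similarity could be initialized.
    static List<String> nonNumericalRegular(List<SketchAttribute> attributes) {
        List<String> offenders = new ArrayList<>();
        for (SketchAttribute attribute : attributes) {
            if (attribute.isSpecial()) {
                continue; // special attributes (id, label, ...) are skipped
            }
            if (!attribute.numerical) {
                offenders.add(attribute.name);
            }
        }
        return offenders;
    }

    public static void main(String[] args) {
        // Case 1: the text column really carries the id role.
        List<SketchAttribute> withRole = Arrays.asList(
                new SketchAttribute("att24754", false, "id"),
                new SketchAttribute("att1", true, null));
        // Case 2: the role was never applied, so the text column is regular.
        List<SketchAttribute> withoutRole = Arrays.asList(
                new SketchAttribute("att24754", false, null),
                new SketchAttribute("att1", true, null));

        System.out.println(nonNumericalRegular(withRole));    // prints []
        System.out.println(nonNumericalRegular(withoutRole)); // prints [att24754]
    }
}
```

If the second case matches what the loop over `exampleSet.getAttributes()` prints, the likely cause is that the id role is missing on the example set that actually reaches `init`, not a problem in the similarity measure itself.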