"declared id attributes used in clustering?"

peppep Member Posts: 7 Contributor II
edited May 2019 in Help
Hi, can anyone help with the following questions pls?
Is a (numeric) attribute whose role is declared to be id used by default by the software when building clusters (i.e., does the attribute participate in the computation of distances, etc.)? What about building a supervised learning model such as a:
- decision tree - does the implemented algorithm compute by default the gain ratio for an id attribute?
- naive Bayes classifier - does the algorithm compute conditional probabilities (and implicitly sample means and standard deviations) for the declared id attribute?

cheers
Answers

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    Is a (numeric) attribute whose role is declared to be id used by default by the software when building clusters (i.e., does the attribute participate in the computation of distances, etc.)?
    No. In general, attributes with the role "id" are used only for identification purposes, for example in the plotters, and never by the data mining schemes. Modeling usually uses only the regular attributes (i.e., those with no special role), the label, and sometimes the weight.
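
To illustrate the idea outside RapidMiner, here is a minimal Python sketch of a distance computation that only looks at regular attributes. It is an analogy, not RapidMiner's implementation: the role map `ROLES`, the attribute names, and both helper functions are hypothetical and exist only for this example.

```python
import math

# Hypothetical role map: attribute name -> role.
# "regular" stands for an attribute with no special role.
ROLES = {
    "customer_id": "id",
    "age": "regular",
    "income": "regular",
    "churn": "label",
}

def regular_attributes(roles):
    """Return the names of attributes that have no special role."""
    return [name for name, role in roles.items() if role == "regular"]

def euclidean_distance(row_a, row_b, roles):
    """Euclidean distance over regular attributes only (id and label are skipped)."""
    names = regular_attributes(roles)
    return math.sqrt(sum((row_a[n] - row_b[n]) ** 2 for n in names))

a = {"customer_id": 1, "age": 30, "income": 50.0, "churn": "no"}
b = {"customer_id": 9999, "age": 33, "income": 54.0, "churn": "yes"}

# The very different customer_id values do not affect the result.
print(euclidean_distance(a, b, ROLES))  # prints 5.0
```

In this sketch, changing the role of `customer_id` to "regular" would let it dominate the distance, which is why the role matters for clustering.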

    What about building a supervised learning model such as a:
    The same applies here as for clustering.

    - decision tree - does the implemented algorithm compute by default the gain ratio for an id attribute?
    No, this will not happen.

    - naive Bayes classifier - does the algorithm compute conditional probabilities (and implicitly sample means and standard deviations) for the declared id attribute?
    Ditto.

    Cheers,
    Ingo