Question re probability of class membership

2wheels_good · September 2023

Thanks again, rjones13, for your previous answer.

I'm setting up my text classification process and I have a question. I want to classify a set of online comments (examples) into two classes (call them online behaviour A and online behaviour b). My plan is to put the classified comments into a growth curve model to see how the frequencies of behaviour A and behaviour b change over time. It occurs to me that if behaviour A diminishes and behaviour b increases over time (as hypothesized), there will be many comments that exhibit aspects of both behaviours. My thinking is that if I calculate the probability of class membership (in A and/or b) for each comment, I will capture the examples that fall into both classes, and I can then use a cutoff to select comments for the growth curve model.

My question is this: when developing the training set, do I label only comments that I consider either A or b (and let RapidMiner assign class membership percentages for all comments on that basis), OR should I also label training comments that I consider as belonging to both A and b? I am assuming a binomial classification (A or b), but I wonder whether I need multi-label classification with a third class representing a blend of A and b, with blended comments identified in the training set. I'd appreciate any insight you can provide.
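To make the cutoff idea concrete, here is a minimal sketch in plain Python (not a RapidMiner process) of how the binomial setup might work. It assumes each comment already has a confidence score P(A) from a two-class model, so P(b) = 1 - P(A). The function name `assign_by_cutoff`, the 0.7 cutoff, the "mixed" label, and the scored comments are all hypothetical and only illustrate the logic.

```python
from collections import Counter


def assign_by_cutoff(p_a, cutoff=0.7):
    """Map a two-class confidence P(A) to 'A', 'b', or 'mixed'.

    Assumption: the model is binomial, so P(b) = 1 - P(A).
    A comment is 'mixed' when neither class reaches the cutoff.
    """
    if not 0.0 <= p_a <= 1.0:
        raise ValueError("p_a must be in [0, 1]")
    if not 0.5 < cutoff <= 1.0:
        raise ValueError("cutoff must be in (0.5, 1]")
    if p_a >= cutoff:
        return "A"
    if 1.0 - p_a >= cutoff:
        return "b"
    return "mixed"


# Hypothetical (time period, P(A)) pairs; not real data.
scored = [(1, 0.92), (1, 0.81), (1, 0.55), (2, 0.64), (2, 0.25), (2, 0.12)]

# Count labels per period, e.g. as input to a growth curve model.
counts = {}
for period, p_a in scored:
    counts.setdefault(period, Counter())[assign_by_cutoff(p_a)] += 1

for period in sorted(counts):
    print(period, dict(counts[period]))
```

Note that under this binomial assumption a "mixed" comment is one the model is unsure about, which is not necessarily the same as a comment that genuinely shows both behaviours. That distinction is the core of the question above.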
