How to combine Logistic regression with SOM as a hybrid model?
komeil_shaeri
Hi,
I need to combine Logistic regression with SOM or DBSCAN as a hybrid model. This will be a hybrid "Classification + Clustering" model in which a classifier can be trained first, and its output is used as the input for the cluster to improve the clustering results.
Thanks,
Answers
Just take your pre-processed (ETL'd) data, feed it into an X-Validation with your Logistic Regression, then use an Apply Model on the outside to score your training set, and put the scored set into the clustering algo. Of course I'm simplifying it, but it should be quite easy to do.
Update: Something like this?
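Outside of the RapidMiner process itself, the same flow can be sketched in Python with scikit-learn; the synthetic data, the DBSCAN parameters, and the way the scores are appended below are placeholders for illustration, not part of the original process:

```python
# Sketch of the flow above in scikit-learn (not a RapidMiner process).
# The synthetic data and DBSCAN parameters are placeholders.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.cluster import DBSCAN

X, y = make_classification(n_samples=500, n_features=8, random_state=42)

# 1) Cross-validate the Logistic Regression (the X-Validation step).
log_reg = LogisticRegression(max_iter=1000)
print("CV accuracy:", cross_val_score(log_reg, X, y, cv=10).mean())

# 2) Fit on the full training set and "apply the model" to score it.
log_reg.fit(X, y)
scores = log_reg.predict_proba(X)  # confidence columns, like Apply Model output

# 3) Feed the attributes plus the scores into the clustering algorithm.
#    (In practice you would normalize before mixing raw features and probabilities.)
X_augmented = np.hstack([X, scores])
clusters = DBSCAN(eps=2.0, min_samples=5).fit_predict(X_augmented)
print("Cluster labels found:", np.unique(clusters))
```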
Thanks for your response ...
The problem is that when I hybridize the algorithms, the performance measures (accuracy, precision, recall) don't change, even if I disable the X-Validation operator that contains the logistic regression. I don't know why the logistic regression cannot affect the overall performance...
Please see the attached file.
Thanks
Hi,
In this example, I first applied a decision tree (DT) to the Titanic data. The resulting accuracy is 80.29%.
When the DT is hybridized with Fuzzy C-means (FCM), the accuracy is still 80.29%, which means the process is not taking the FCM into account. Is there another way to integrate the classification and clustering models? Can you help me with this issue?
DT process:
DT-FCM process:
Many thanks,
Komeil
I'm a bit confused as to why you want to first classify the data and then segment it. These are two different methods of learning (supervised and unsupervised). In the supervised method you start with knowing the truth: you know who did and did not die in the Titanic disaster. In the unsupervised method, you typically don't have a class label and instead look for statistical characteristics that 'segment' like groups together. What you are trying to do here is build a model on the Titanic data set with a label, then throw out that label and segment on the regular attributes. You will get different performance measures for sure, one for a classification problem and the other for a segmentation problem.
If you're looking to combine multiple algorithms, have you tried our stacking (ensembling) operator?
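For reference, here is a minimal sketch of what stacking does, using scikit-learn's StackingClassifier rather than the RapidMiner operator; the base learners and meta-learner are illustrative choices, not a recommendation:

```python
# Minimal stacking sketch with scikit-learn; learners chosen only for illustration.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=8, random_state=42)

# Base classifiers whose out-of-fold predictions become inputs to the meta-learner.
base_learners = [
    ("tree", DecisionTreeClassifier(max_depth=5)),
    ("nb", GaussianNB()),
]
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(max_iter=1000))
print("Stacked CV accuracy:", cross_val_score(stack, X, y, cv=10).mean())
```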
Stacked Generalization is good for combining multiple classifiers. I'm wondering if there is any way to combine clustering techniques with each other. I heard about "Consensus Clustering", which is similar to stacking but for clustering methods.
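As an aside, one common way consensus clustering is implemented is via a co-association matrix: run several base clusterings, count how often each pair of points ends up in the same cluster, then cluster that matrix. A rough, generic sketch (not a RapidMiner operator; the numbers of clusters and seeds are arbitrary):

```python
# Rough consensus-clustering sketch via a co-association matrix (generic, illustrative).
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans, SpectralClustering

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
n = len(X)

# 1) Run several base clusterings (here: k-means with different k and seeds).
runs = [KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        for k, seed in [(2, 0), (3, 1), (4, 2), (3, 3)]]

# 2) Count how often each pair of points lands in the same cluster.
co_assoc = np.zeros((n, n))
for labels in runs:
    co_assoc += (labels[:, None] == labels[None, :]).astype(float)
co_assoc /= len(runs)

# 3) Cluster the co-association matrix itself, treating it as a similarity matrix.
final = SpectralClustering(n_clusters=3, affinity="precomputed",
                           random_state=0).fit_predict(co_assoc)
print("Consensus cluster sizes:", np.bincount(final))
```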
Maybe what you can do is select one class from the Logistic Regression result and then pass that to the clustering process. This way you can segment out those attributes for the single class.
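A sketch of that suggestion in scikit-learn terms (again with placeholder data; picking the positive class and k=3 is just for illustration):

```python
# Predict with Logistic Regression, keep only the rows assigned to one class,
# then cluster just those rows. Data and parameters are placeholders.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X, y = make_classification(n_samples=500, n_features=8, random_state=42)

log_reg = LogisticRegression(max_iter=1000).fit(X, y)
predicted = log_reg.predict(X)

# Keep only the examples predicted as the positive class, then segment them.
X_positive = X[predicted == 1]
segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_positive)
print("Segment sizes within the predicted class:", np.bincount(segments))
```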