Loop Cluster - exclude examples
Hi,
I have used RapidMiner to create many files with clusters. Each data file therefore has many examples with a set of numeric attributes, a label, and a special cluster attribute. Next, I am trying to use the Loop Clusters operator to run classification models inside a validation process, which works for most of my data files. However, some data files have clusters with only one or two examples, so an error is generated whenever the loop reaches one of these small clusters. My question is: how can I exclude clusters with a low number of examples in the Loop Clusters process?
Thanks
Answers
You can extract the size of each cluster subset with the Extract Macro operator and its "number of examples" option. Next, use a Branch operator, set the condition that the number of examples has to be greater than two, and perform your model building in the Then-branch. The Else-branch can return an empty example set or some dummy data, whatever fits best.
Best,
David
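For readers who want to see the logic of the answer above outside the RapidMiner GUI, here is a minimal Python sketch of the same idea. It is not RapidMiner code. The names `loop_clusters`, `build_model`, `cluster_key` and `min_size` are made up for illustration, and the data is assumed to be a list of dictionaries with one cluster value per row.

```python
from collections import defaultdict


def loop_clusters(rows, build_model, cluster_key="cluster", min_size=2):
    """Run build_model on every cluster subset with more than min_size rows.

    Hypothetical helper: mirrors Loop Clusters + Extract Macro + Branch.
    """
    # Split the rows into one subset per cluster value (what the loop iterates over).
    subsets = defaultdict(list)
    for row in rows:
        subsets[row[cluster_key]].append(row)

    results = {}
    for cluster, subset in subsets.items():
        n_examples = len(subset)  # the value Extract Macro would store
        if n_examples > min_size:
            # Then-branch: the subset is large enough, so build the model.
            results[cluster] = build_model(subset)
        else:
            # Else-branch: return a placeholder instead of failing.
            results[cluster] = None
    return results


# Toy data: cluster_0 has three examples, cluster_1 has only one.
rows = [
    {"cluster": "cluster_0", "x": 1.0},
    {"cluster": "cluster_0", "x": 1.2},
    {"cluster": "cluster_0", "x": 0.9},
    {"cluster": "cluster_1", "x": 7.5},
]

# len stands in for the real model-building step.
results = loop_clusters(rows, build_model=len)
```

With this toy data, `cluster_0` is passed to `build_model` and `cluster_1` is skipped and mapped to `None`, which plays the role of the empty example set returned by the Else-branch.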