
[Solved] R script loses data (kMed -

Smerg Member Posts: 4 Contributor I
edited November 2019 in Help
Hello people,

I'm desperate. I have been working on this process for weeks, and after several redesigns and many attempts it still does not work.  :'( The standalone version works fine. In the full process with loops, however, I get the error "The attribute "hungarian cluster" does not exist.... ", because my R script loses the data. So it is clear why the operator that follows it cannot find the attribute.

The if statement in my R script kicks in. The error occurs in the fourth pass of the inner loop. The log says: "PM INFO: Hungarian Algorithm: Fehler in minWeightBipartiteMatching(clusterA, clusterB) : number of cluster or number of instances do not match" ("Fehler in" is German for "Error in").
Why does it work for the first three repetitions?! I hope you can help me!
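
Going by the wording of the log message, the matching step seems to expect two label vectors with the same number of instances and the same number of distinct clusters. A minimal sketch of such a check follows, written in Python rather than R so it can be run on its own. The function name `check_matching_inputs` and the sample data are hypothetical, and this is not the code of `minWeightBipartiteMatching` itself.

```python
def check_matching_inputs(cluster_a, cluster_b):
    """Return a list of problems; an empty list means both inputs look consistent."""
    problems = []
    # Both label vectors must describe the same examples, so lengths must agree.
    if len(cluster_a) != len(cluster_b):
        problems.append(f"instances differ: {len(cluster_a)} vs {len(cluster_b)}")
    # A one-to-one mapping needs the same number of distinct cluster ids on both sides.
    ids_a, ids_b = set(cluster_a), set(cluster_b)
    if len(ids_a) != len(ids_b):
        problems.append(f"clusters differ: {len(ids_a)} vs {len(ids_b)}")
    return problems


# Hypothetical data: three predefined clusters, two examples each.
predefined = [1, 1, 2, 2, 3, 3]
computed_ok = ["c0", "c0", "c1", "c1", "c2", "c2"]
computed_bad = ["c0", "c0", "c0", "c1", "c1", "c1"]  # only two cluster ids occur

print(check_matching_inputs(predefined, computed_ok))   # []
print(check_matching_inputs(predefined, computed_bad))  # ['clusters differ: 3 vs 2']
```

Logging the two counts in each pass of the inner loop, right before the matching call, would show whether the fourth pass really receives fewer clusters or fewer examples than the earlier ones.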




Download
Error version of my process: https://dl.dropbox.com/u/5861880/Error_version.xml
Working minimal version without loops: https://dl.dropbox.com/u/5861880/Working_version.xml

Example set
I use the wine dataset from UCI: http://archive.ics.uci.edu/ml/datasets/Wine.



What do I want?
  • 10 splits of D into Dtraining and Dtest. I repeat this for k = 3 to 20.
  • For every split, I select k examples for each predefined cluster, ten times.
  • In every pass of the inner loop, I run k-medoids and relabel the training data with the Hungarian method in R. For this I use the k previously selected examples per cluster and optimize them. The relabeling is based on a mapping between the computed clusters and the predefined clusters.
  • Afterwards, I measure the performance of my SVM model on Dtest.
At the end, I have 17(k)*10(splits)*10(iterations) performance vectors.
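
The relabeling step described above, mapping computed clusters to predefined clusters, can be sketched as follows. This is only an illustration in Python, not the R script from the process: the function `relabel` and the sample data are hypothetical, and instead of the Hungarian algorithm it simply tries every permutation, which gives the same optimal mapping but is only practical for a small number of clusters.

```python
from itertools import permutations


def relabel(computed, predefined):
    """Map each computed cluster id to a predefined label so that as many
    examples as possible end up with their predefined label."""
    comp_ids = sorted(set(computed))
    pre_ids = sorted(set(predefined))
    # Same preconditions the log message complains about.
    if len(computed) != len(predefined) or len(comp_ids) != len(pre_ids):
        raise ValueError("number of clusters or number of instances do not match")

    # overlap[(c, p)]: examples that lie in computed cluster c and predefined cluster p.
    overlap = {}
    for c, p in zip(computed, predefined):
        overlap[(c, p)] = overlap.get((c, p), 0) + 1

    # Brute force over all one-to-one mappings (stand-in for the Hungarian method).
    best_map, best_score = None, -1
    for perm in permutations(pre_ids):
        score = sum(overlap.get((c, p), 0) for c, p in zip(comp_ids, perm))
        if score > best_score:
            best_map, best_score = dict(zip(comp_ids, perm)), score

    return [best_map[c] for c in computed], best_map


# Hypothetical data: six examples, three clusters.
computed = ["a", "a", "b", "b", "c", "c"]
predefined = [2, 2, 3, 1, 1, 1]
new_labels, mapping = relabel(computed, predefined)
print(mapping)     # {'a': 2, 'b': 3, 'c': 1}
print(new_labels)  # [2, 2, 3, 3, 1, 1]
```

For k up to 20 the permutation search is far too slow. A real assignment solver such as `scipy.optimize.linear_sum_assignment` (with the overlap counts negated, or with `maximize=True`) would be the usual replacement in Python.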