"Question about Clustering Data before running a model"
Hi, I have a problem with some biological data. Here is a simple example.
I have 10 proteins, and every two proteins belong to one organism.
For example, proteins 1 and 2 belong to human, proteins 3 and 4 belong to mouse, and so on.
The five organisms are my label, and my goal is to build a model that predicts these five organisms.
The problem is that when I run this data, every protein is analyzed independently, so the final result consists of 10 proteins belonging to 5 organisms. But every two proteins are linked and should be analyzed together. What I want is for the two proteins of the same organism to form one group, so that I get 5 groups, one per organism, and the model analyzes those groups.
Is there any way to cluster these proteins, and similar data, like this?
Answers
Can you please post your table structure with one or two rows of example data, and the desired outcome?
andcanyoupleaseusedotsandlinebreaksotherwiseyourpostsareprettyhardtounderstand.
Best,
Marius
And what I need is for the samples in the same cluster to be analyzed together.
You can do so by installing the Series extension and using the Windowing operator. Set both window_size and step_size to 2, because you always have 2 lines which belong together.
You may have to add a Select Attributes operator, or some Rename and Set Role operators, after the Windowing operator, but that should be pretty straightforward.
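To make the effect of a window size and step size of 2 concrete, here is a rough plain-Python analogue of the idea. It is only a sketch, not RapidMiner's implementation: the function name `window_rows`, the column names (`protein`, `length`, `organism`), the `_0`/`_1` suffixes and the sample values are all made up for illustration, and it assumes the rows that belong together are already adjacent.

```python
from typing import Any, Dict, List


def window_rows(
    rows: List[Dict[str, Any]],
    window_size: int = 2,
    step_size: int = 2,
    label_key: str = "organism",
) -> List[Dict[str, Any]]:
    """Flatten each window of consecutive rows into one wide row.

    Assumes all rows inside a window share the same label and that
    related rows are adjacent in the input.
    """
    wide_rows = []
    # Slide over the table; with step_size == window_size the windows do not overlap.
    for start in range(0, len(rows) - window_size + 1, step_size):
        window = rows[start:start + window_size]
        wide: Dict[str, Any] = {}
        for offset, row in enumerate(window):
            for key, value in row.items():
                if key == label_key:
                    continue  # the label is copied once per window below
                # Hypothetical naming scheme: original name plus position in window.
                wide[f"{key}_{offset}"] = value
        wide[label_key] = window[0][label_key]
        wide_rows.append(wide)
    return wide_rows


# Made-up example data: two proteins per organism, stored in adjacent rows.
rows = [
    {"protein": "p1", "length": 120, "organism": "human"},
    {"protein": "p2", "length": 95, "organism": "human"},
    {"protein": "p3", "length": 130, "organism": "mouse"},
    {"protein": "p4", "length": 101, "organism": "mouse"},
]

wide = window_rows(rows)
for row in wide:
    print(row)
```

Each pair of input rows becomes one example with the attributes of both proteins side by side, so a learner sees one row per organism instead of two independent rows.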
Does that operator do what you need?
Best, Marius
My second question: what if the number of rows per group is not always 2?
Can't it be done by using the Set Role operator and selecting the batch or cluster role for the cluster column?