Normalization Issue
Hello RapidMiner Community,
I'm currently working on a clustering model.
I cluster different countries according to certain determinants.
However, the determinants are composed of different numbers of factors. For example, the determinant "degree of economic integration" is composed of the factors Trade Freedom and Trading across Borders, while the determinant "transport infrastructure" consists only of the factor LPI Index.
I use the normalization operator to bring the factors, which are measured on different scales, onto a common scale.
However, each determinant (degree of economic integration and transport infrastructure) should be weighted equally. Because one determinant consists of more factors than the other, it is currently overweighted.
My question to you is how I should proceed in RapidMiner in order to weight each determinant equally without having to aggregate the individual factors of a determinant.
Thank you for your support and hints.
Best regards, Carlo
Best Answer

Telcontar120 (Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member; Posts: 1,635; Unicorn)
You can also apply attribute weights. Look at all the options you have in terms of operators under Feature Weights. You can use an algorithmic approach or you could also set the weights manually.
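To make the manual-weights idea concrete, here is a minimal sketch in plain Python with NumPy (not a RapidMiner process). It assumes the clustering uses Euclidean distance and that every factor is z-score normalized first; under those assumptions, giving each factor the weight 1/sqrt(k), where k is the number of factors in its determinant, lets every determinant contribute the same total variance. The column names and the numbers are made up for illustration.

```python
import numpy as np

# Hypothetical grouping of factor columns by determinant (names follow the thread).
determinants = {
    "economic_integration": ["trade_freedom", "trading_across_borders"],
    "transport_infrastructure": ["lpi_index"],
}

def determinant_weights(groups):
    """One weight per factor: 1 / sqrt(number of factors in its determinant)."""
    return {
        factor: 1.0 / np.sqrt(len(factors))
        for factors in groups.values()
        for factor in factors
    }

def weighted_features(data, groups):
    """Z-score each factor, then scale it by its determinant weight."""
    weights = determinant_weights(groups)
    columns = []
    for factor, w in weights.items():
        x = np.asarray(data[factor], dtype=float)
        z = (x - x.mean()) / x.std()  # assumes the factor is not constant
        columns.append(w * z)
    return np.column_stack(columns)

# Made-up values for four countries, for illustration only.
data = {
    "trade_freedom": [80.0, 65.0, 90.0, 70.0],
    "trading_across_borders": [75.0, 60.0, 95.0, 55.0],
    "lpi_index": [3.5, 2.8, 4.1, 3.0],
}

weights = determinant_weights(determinants)
X = weighted_features(data, determinants)  # feed X to a distance-based clusterer
```

The individual factors stay separate columns, so nothing is aggregated; only their relative influence on the distance changes. In RapidMiner, the same numbers could be entered as manual attribute weights, provided the clustering step actually applies them to the distance.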
Answers
There are five determinants altogether; these are, in effect, the main categories.
Each determinant consists of a different number of factors (subcategories).
I will try to illustrate it with two determinants:
The determinant (main category) "homogeneity of demand" consists of the factors (subcategories) purchasing power, market size and article turnover.
To ensure that each subcategory has the same weighting within its determinant, I normalize it (since purchasing power is given on a scale of 1 to 10 and market size on a scale of 1 to 10 million).
In the second step (and this is my problem) I would now like to balance the main categories as well. Since one determinant consists of only one factor and another of three factors, I do not know how to proceed and would be very pleased to hear your opinions.
I hope I explained it better this time.
Not sure if I understood this correctly, but if you have an issue with the number of dimensions (Attributes) per determinant, why not apply dimensionality reduction techniques like PCA?
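As a rough illustration of the PCA suggestion, the sketch below (plain Python with NumPy, not a RapidMiner process) reduces each determinant to its first principal component and rescales it to unit variance, so every determinant ends up as exactly one column. Note that this does combine the factors of a determinant into a single score, which the original question wanted to avoid. The function name, column grouping and values are assumptions made up for the example.

```python
import numpy as np

def first_component(block):
    """Score of the first principal component of one determinant's factors."""
    block = np.asarray(block, dtype=float)
    # Standardize each factor (assumes no factor is constant).
    z = (block - block.mean(axis=0)) / block.std(axis=0)
    # Rows of vt are the principal directions of the standardized data.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    score = z @ vt[0]
    return score / score.std()  # unit variance; the sign is arbitrary

# Made-up values for four countries, for illustration only.
homogeneity_of_demand = np.array([
    # purchasing power, market size, article turnover
    [7.0, 8_000_000.0, 120.0],
    [4.0, 2_500_000.0, 60.0],
    [9.0, 9_500_000.0, 150.0],
    [5.0, 1_000_000.0, 45.0],
])
transport_infrastructure = np.array([[3.5], [2.8], [4.1], [3.0]])  # LPI Index

# One column per determinant, each with mean 0 and variance 1.
reduced = np.column_stack([
    first_component(homogeneity_of_demand),
    first_component(transport_infrastructure),
])
```

A determinant with a single factor simply comes out as that factor's standardized values (possibly with the sign flipped), so both determinants enter a distance-based clustering with the same variance.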
Thanks
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing