"Terminology on Data Sampling"
Hi
Simple question:
I have a scenario where I use cluster analysis to divide a data set into groups of homogeneous entities. Then I extract one entity from each group as a representative. What is the terminology for that? I would call it something like "data sampling", but googling "data sampling" wasn't very successful. For example, "sampling" (Wikipedia) seems to concentrate on surveying populations and the like.
However, after searching this forum, I think that sampling might nevertheless be what I am looking for. I would appreciate any help with the terminology, and also any recommendations for literature on the topic.
greetings
shai
Answers
http://en.wikipedia.org/wiki/Cluster_sampling
Quota sampling is a subset of stratified sampling, but it makes sure that the group proportions in the sample are similar to the group proportions in the population.
Populations are not the only kind of data that needs to be analysed. I am a little bit puzzled about that.
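To make the two ideas concrete, here is a minimal sketch in plain Python. It assumes the cluster labels already exist (from whatever clustering step was used), and the function names `one_per_group` and `proportional_sample` are made up for this illustration, not RapidMiner operators. The first picks one representative per group, as in the original question, using the member closest to the group mean (just one possible choice of representative). The second draws from each group in proportion to its size, which is the proportional idea described for quota sampling above.

```python
import random
from collections import defaultdict


def group_by_label(points, labels):
    """Collect points into a dict keyed by their group (cluster) label."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    return dict(groups)


def representative(members):
    """Return the member closest to the group's mean (squared Euclidean distance)."""
    dim = len(members[0])
    mean = [sum(m[i] for m in members) / len(members) for i in range(dim)]
    return min(members, key=lambda m: sum((m[i] - mean[i]) ** 2 for i in range(dim)))


def one_per_group(points, labels):
    """Pick exactly one representative entity from every group."""
    return {label: representative(members)
            for label, members in group_by_label(points, labels).items()}


def proportional_sample(points, labels, fraction, seed=0):
    """Draw roughly `fraction` of each group at random, but at least one member per group."""
    rng = random.Random(seed)
    sample = {}
    for label, members in group_by_label(points, labels).items():
        k = max(1, round(fraction * len(members)))
        sample[label] = rng.sample(members, k)
    return sample


# Toy data: two groups that some clustering step has already labelled.
points = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.2),
          (10.0, 10.0), (10.0, 14.0), (10.0, 12.5), (10.0, 11.0)]
labels = ["a", "a", "a", "b", "b", "b", "b"]

reps = one_per_group(points, labels)
half = proportional_sample(points, labels, fraction=0.5)
print(reps)
print({label: len(members) for label, members in half.items()})
```

Taking the member closest to the mean keeps the representative an actual entity from the data rather than an artificial average. A random draw per group would be the more sampling-like alternative.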
Cheers,
Ingo