RapidMiner

Newbie levietduc

Feature Selection with multidimensional attributes

Given a dataset with attributes such as {a1, a2, a3, b1, b2, c1, c2, c3, c4, d1},
a standard feature selection method such as Forward Selection outputs a subset of individual attributes, for example {a1, b1, c2, c3}.
However, I want feature selection applied to grouped attributes: a (3 dimensions), b (2 dimensions), c (4 dimensions), d (1 dimension). The expected output should then be a subset of {a, b, c, d}; for instance, {a, c} = {a1, a2, a3, c1, c2, c3, c4}.
How can I do this kind of feature selection in RapidMiner?

3 REPLIES
RM Certified Expert

Re: Feature Selection with multidimensional attributes

I would investigate using some sort of loop for this, perhaps the Loop Subsets operator.
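The subset-loop idea can be sketched outside RapidMiner as follows. This is a minimal Python illustration, not a RapidMiner process: it enumerates every non-empty subset of the groups from the question, expands each to its member attributes, and keeps the best-scoring one. The `GROUPS` mapping and `toy_score` are assumptions made up for the example; in practice the score would be something like the cross-validated performance of a model trained on those attributes.

```python
from itertools import combinations

# Grouping taken from the question: group name -> member attributes.
GROUPS = {
    "a": ["a1", "a2", "a3"],
    "b": ["b1", "b2"],
    "c": ["c1", "c2", "c3", "c4"],
    "d": ["d1"],
}


def group_subsets(groups):
    """Yield (group_names, attributes) for every non-empty subset of groups."""
    names = sorted(groups)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            attributes = [attr for name in combo for attr in groups[name]]
            yield combo, attributes


def toy_score(attributes):
    """Hypothetical stand-in for a real validation score (higher is better)."""
    useful = {"a1": 0.3, "a2": 0.2, "c3": 0.4}
    # Reward useful attributes, penalise every attribute that is carried along.
    return sum(useful.get(attr, 0.0) for attr in attributes) - 0.05 * len(attributes)


candidates = list(group_subsets(GROUPS))
best_groups, best_attributes = max(candidates, key=lambda item: toy_score(item[1]))
print(best_groups, best_attributes)
```

With k groups there are 2^k - 1 candidate subsets, so exhaustive enumeration is only practical when the number of groups is small.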

RM Staff

Re: Feature Selection with multidimensional attributes

You can get your hands dirty inside the Forward Selection operator, for example by enforcing the following rules:

if a1 is in the attribute set,
then make sure that a1, ..., an are in the dataset
else
make sure that a1, ..., an are NOT in the dataset

if b1 is in the attribute set,
then make sure that b1, ..., bn are in the dataset
else
make sure that b1, ..., bn are NOT in the dataset

etc.

You will waste a lot of computation this way, but it is a workaround that may work.
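The rule itself is simple enough to write down in a few lines. The sketch below is plain Python rather than anything inside RapidMiner, and the function name `snap_to_groups` is made up for illustration: the first member of each group (a1, b1, ...) acts as the representative, and the whole group is kept or dropped depending on whether that representative was selected.

```python
# Grouping taken from the question: group name -> member attributes.
GROUPS = {
    "a": ["a1", "a2", "a3"],
    "b": ["b1", "b2"],
    "c": ["c1", "c2", "c3", "c4"],
    "d": ["d1"],
}


def snap_to_groups(selected, groups):
    """Expand or prune a selected attribute set so that it contains whole groups only.

    A group is kept if and only if its first member (the representative)
    is in `selected`; all other members follow the representative.
    """
    selected = set(selected)
    result = []
    for name in sorted(groups):
        members = groups[name]
        if members[0] in selected:
            result.extend(members)
    return result


# b2 is selected but b1 is not, so group b is dropped entirely.
print(snap_to_groups(["a1", "b2", "c1", "c3"], GROUPS))
```

Any candidate set that differs only in non-representative attributes snaps to the same result, which is where the repeated evaluations come from.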

Guru

Re: Feature Selection with multidimensional attributes

What you are describing sounds a lot like group lasso. I don't think it is available directly in RapidMiner, but it is available indirectly through the R scripting extension.

Check out the grpreg library in R. I'm sure there are other libraries that perform group lasso, but this is the one I know:

https://cran.r-project.org/web/packages/grpreg/grpreg.pdf
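
To see why group lasso selects or drops whole groups, it helps to look at the block shrinkage step that the penalty induces. The group lasso penalty sums the Euclidean norms of the per-group coefficient vectors, and the corresponding proximal step scales each group's vector toward zero as a unit. The sketch below is a plain Python illustration of that single step under an assumed threshold value; it is not how grpreg is implemented or called.

```python
import math


def group_soft_threshold(coefs, threshold):
    """Shrink one group's coefficient vector as a block.

    If the vector's Euclidean norm is at most `threshold`, the whole group
    is set to zero; otherwise every coefficient is scaled by the same factor.
    """
    norm = math.sqrt(sum(c * c for c in coefs))
    if norm <= threshold:
        return [0.0] * len(coefs)
    scale = 1.0 - threshold / norm
    return [scale * c for c in coefs]


# A weak group (norm 0.5) is removed entirely; a strong one (norm 5) is only shrunk.
print(group_soft_threshold([0.3, -0.4], 1.0))
print(group_soft_threshold([3.0, 4.0], 1.0))
```

Because the decision depends on the norm of the whole group, the attributes a1, a2, a3 either all stay in the model or all leave it, which matches the behaviour asked for in the question.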
