Feature Selection with multidimensional attributes

levietduc Member, University Professor Posts: 2
edited July 2019 in Help

Given a dataset with various attributes such as: {a1, a2, a3, b1, b2, c1, c2, c3, c4, d1}
A standard feature selection method such as Forward Selection returns a subset of individual attributes, for example: {a1, b1, c2, c3}
However, I want feature selection applied to grouped attributes: a (3 dimensions), b (2 dimensions), c (4 dimensions), d (1 dimension). The expected output should then be a subset of {a, b, c, d}, for instance {a, c} = {a1, a2, a3, c1, c2, c3, c4}.
How can I do this kind of feature selection with RapidMiner?

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    I would investigate using some sort of loop for this, perhaps the Loop Subsets operator.
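
    Outside RapidMiner, the idea of looping over subsets of groups can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Loop Subsets operator itself: the `GROUPS` mapping mirrors the attribute names in the question, and `score` is a hypothetical caller-supplied function (for example a cross-validated accuracy) that rates a list of attributes.

```python
from itertools import combinations

# Hypothetical grouping, taken from the attribute names in the question.
GROUPS = {
    "a": ["a1", "a2", "a3"],
    "b": ["b1", "b2"],
    "c": ["c1", "c2", "c3", "c4"],
    "d": ["d1"],
}


def group_subsets(groups):
    """Yield every non-empty combination of group names."""
    names = sorted(groups)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            yield combo


def expand(groups, selected):
    """Flatten the selected groups into their member attributes."""
    return [attr for name in selected for attr in groups[name]]


def best_group_subset(groups, score):
    """Return the combination of groups whose attributes score highest.

    `score` is an assumption of this sketch: any function that takes a
    list of attribute names and returns a number (higher is better).
    """
    return max(group_subsets(groups), key=lambda combo: score(expand(groups, combo)))
```

    With four groups there are only 2^4 - 1 = 15 candidate subsets, so an exhaustive loop is cheap here. The count doubles with every extra group, which is where a greedy search becomes attractive.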

  • SGolbert RapidMiner Certified Analyst, Member Posts: 344 Unicorn

    You can get your hands dirty inside the Forward Selection operator. For example, with the following rules:

    if a1 is in the attribute set,
    then make sure that a1, ..., an are in the dataset
    else
    make sure that a1, ..., an are NOT in the dataset

    if b1 is in the attribute set,
    then make sure that b1, ..., bn are in the dataset
    else
    make sure that b1, ..., bn are NOT in the dataset

    etc.

    You will be wasting a lot of computation, but it is a workaround that may work.
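
    One way to read these rules is as forward selection carried out on whole groups instead of single attributes. The Python sketch below shows that reading. It is not the internals of RapidMiner's Forward Selection operator, and `score` is again a hypothetical caller-supplied function that maps a list of attribute names to a performance value.

```python
def forward_select_groups(groups, score, min_gain=0.0):
    """Greedy forward selection that adds one whole group per round.

    groups: dict mapping a group name to its list of attributes.
    score: caller-supplied function (hypothetical) that maps a list of
        attribute names to a performance value, higher being better.
    min_gain: stop when the best candidate improves the score by no
        more than this amount.
    """
    selected = []
    best_score = float("-inf")
    remaining = sorted(groups)
    while remaining:
        # Score every candidate obtained by adding one more whole group.
        candidates = []
        for name in remaining:
            attrs = [a for g in selected + [name] for a in groups[g]]
            candidates.append((score(attrs), name))
        top_score, top_name = max(candidates)
        if top_score - best_score <= min_gain:
            break  # no group improves the score enough
        selected.append(top_name)
        remaining.remove(top_name)
        best_score = top_score
    return selected, best_score


if __name__ == "__main__":
    # Hypothetical grouping, taken from the attribute names in the question.
    groups = {
        "a": ["a1", "a2", "a3"],
        "b": ["b1", "b2"],
        "c": ["c1", "c2", "c3", "c4"],
        "d": ["d1"],
    }
    # Toy score: reward two "useful" attributes, penalise subset size.
    toy = lambda attrs: len(set(attrs) & {"a1", "c2"}) - 0.1 * len(attrs)
    print(forward_select_groups(groups, toy))
```

    Because each round evaluates one candidate per remaining group, no model is trained on a partially included group, which avoids the wasted computation mentioned above.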

  • earmijo Member Posts: 270 Unicorn

    What you are describing sounds a lot like group lasso. I don't think it is available directly in RapidMiner, but you can use it indirectly through the R scripting extension.

    Check out the grpreg library in R. I'm sure there are other libraries that perform group lasso, but this is the one I know.

     

    https://cran.r-project.org/web/packages/grpreg/grpreg.pdf
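
    For context, the group lasso is usually written as the penalised least-squares problem below. This is the common textbook form for a linear model, given as a sketch; the symbols are introduced here for illustration, and the exact scaling and standardisation used inside grpreg may differ.

```latex
\hat{\beta} = \arg\min_{\beta}\;
  \frac{1}{2n}\,\Bigl\| y - \sum_{g=1}^{G} X_g \beta_g \Bigr\|_2^2
  \;+\; \lambda \sum_{g=1}^{G} \sqrt{p_g}\,\bigl\| \beta_g \bigr\|_2
```

    Here $X_g$ holds the $p_g$ columns of group $g$ (for example a1, a2, a3 for group a), $\beta_g$ is the matching coefficient block, and $\lambda \ge 0$ controls the penalty strength. Because the penalty uses the unsquared Euclidean norm of each block, a whole block $\beta_g$ is either set to zero or kept, which is what makes the selection operate on groups.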
