"How to evaluate feature weighting?"

phong Member Posts: 1 Contributor I
edited June 2019 in Help
Hi RapidMiner team,

It's Phong from UniGE (e-LICO partner).

I am trying to build a process that computes feature weights (say with ReliefF), uses them for a dimensionality reduction (e.g. keeping the top-k features), then learns a classification model and validates it on a test set. All of this should be done inside a 10-fold cross-validation.

So, I tried the standard XValidation operator: the training phase contains the feature weighting, the dimensionality reduction, and the classification model, and everything is then passed to the testing phase, where the model is applied and validated.
However, I ran into something strange: after the first fold (i.e. after the first dimensionality reduction), the second fold presents me with a training set WHICH IS ALREADY REDUCED!! And this continues for the remaining folds... It looks like a data memory error, or am I wrong?
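For reference, here is a minimal sketch of the intended workflow in plain Python rather than as a RapidMiner process. It only illustrates the expected behaviour: weighting, top-k reduction, and learning all happen inside each fold, and every fold starts from the full feature set. The scorer is a simple stand-in (not ReliefF), the nearest-centroid learner is a toy classifier, and the data and all function names are made up for illustration.

```python
import random

def weight_features(rows, labels):
    """Stand-in scorer (NOT ReliefF): absolute difference of the class means."""
    weights = []
    for j in range(len(rows[0])):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        weights.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    return weights

def top_k_indices(weights, k):
    """Indices of the k largest weights, returned in ascending index order."""
    ranked = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
    return sorted(ranked[:k])

def project(rows, keep):
    """Return NEW rows restricted to the kept columns; the input is untouched."""
    return [[r[j] for j in keep] for r in rows]

def fit_centroids(rows, labels):
    """Toy nearest-centroid learner standing in for the real classifier."""
    centroids = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def predict(centroids, row):
    """Return the class whose centroid is closest to the row."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda cls: dist(centroids[cls]))

def cross_validate(rows, labels, k_features, n_folds=10):
    """Weighting, top-k reduction and learning all happen inside each fold."""
    accuracies, widths = [], []
    for fold in range(n_folds):
        test_idx = [i for i in range(len(rows)) if i % n_folds == fold]
        train_idx = [i for i in range(len(rows)) if i % n_folds != fold]
        train_x = [rows[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        widths.append(len(train_x[0]))  # every fold should see all features
        keep = top_k_indices(weight_features(train_x, train_y), k_features)
        model = fit_centroids(project(train_x, keep), train_y)
        hits = sum(
            predict(model, project([rows[i]], keep)[0]) == labels[i]
            for i in test_idx
        )
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / n_folds, widths

# Synthetic data: feature 0 carries the class signal, features 1-4 are noise.
rng = random.Random(0)
labels = [i % 2 for i in range(100)]
rows = [[y * 3.0 + rng.random()] + [rng.random() for _ in range(4)] for y in labels]

accuracy, widths = cross_validate(rows, labels, k_features=2)
print(accuracy, widths)
```

Because `project` builds new rows instead of modifying the input, `widths` should report the full feature count for every fold, which is exactly what the second fold in the process above fails to do.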

Then I tried the Wrapper-XValidation operator, which seems to be intended specifically for feature weighting / selection. By default, however, after the attribute weighting phase, when the operator builds the learner, the training set seems to be reduced automatically by removing the features that have zero weight... So how can I specify that I want to apply another rule, like top-k?

I hope my questions are clear enough.
Thanks for your help.




    land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Phong,
    thank you for this detailed description. I will see what's happening there and write back.

    Ben Member Posts: 13 Contributor II
    Hi all,
    I've been running into the same problem as Phong. For the old RM 4.x I helped myself by rewriting AttributeWeightSelection and cloning the example set (a dirty workaround, and memory hogging). As I haven't ported it to RM 5.0 yet, I wonder whether there is an operator (or hack) that sets all but the top-k weights to zero, so that scoring functions can be used inside a Wrapper-XValidation?
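The idea of zeroing everything except the top-k weights can be sketched in a few lines of Python. This is only an illustration of the rule itself, not an existing RapidMiner operator; `keep_top_k` and the attribute names are hypothetical. Since the wrapper reportedly drops attributes with zero weight, the surviving attributes would be exactly the top-k.

```python
def keep_top_k(weights, k):
    """Return a NEW name->weight dict where only the k largest weights survive.

    All other weights are set to 0.0. Ties are broken by insertion order,
    because Python's sort is stable.
    """
    ranked = sorted(weights, key=lambda name: weights[name], reverse=True)
    winners = set(ranked[:k])
    return {name: (w if name in winners else 0.0) for name, w in weights.items()}

# Hypothetical attribute weights, as a scoring function might produce them.
weights = {"a1": 0.40, "a2": 0.05, "a3": 0.75, "a4": 0.20}
reduced = keep_top_k(weights, k=2)
print(reduced)  # {'a1': 0.4, 'a2': 0.0, 'a3': 0.75, 'a4': 0.0}
```

The function returns a new dictionary and leaves the original weights untouched, so the same weight vector can be reused with a different k.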

    land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Sorry for only getting back to this topic now, after taking a look into the matter. The cause of this behavior has been removed: there was simply a missing clone() call in the validation. So there should be no need for hacking, as long as you don't actually apply the weighting, which would change the underlying data. Before doing that, you must make a full copy. Unfortunately, we haven't managed to include a view concept for the weight calculation yet, although this should be relatively easy. You might file a feature request for this in the bug tracker.
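The effect of a missing copy can be reproduced with a toy example in Python. This is not RapidMiner's code; the table, `apply_selection_in_place`, and `run_folds` are hypothetical names used only to show why an in-place reduction leaks into later folds and why copying first avoids it.

```python
import copy

def apply_selection_in_place(table, keep):
    """Drop columns by mutating the rows it is given (like applying a weighting)."""
    for row in table:
        row[:] = [row[j] for j in keep]

def run_folds(table, keep, n_folds, clone):
    """Record how many columns each fold sees before its own reduction."""
    seen = []
    for _ in range(n_folds):
        # With clone=False every fold works on the same shared rows.
        working = copy.deepcopy(table) if clone else table
        seen.append(len(working[0]))
        apply_selection_in_place(working, keep)
    return seen

data = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]

# Without a copy, the reduction from fold 1 is still there in folds 2 and 3.
print(run_folds(copy.deepcopy(data), keep=[0, 1], n_folds=3, clone=False))  # [4, 2, 2]

# With a full copy per fold, each fold starts from all four columns.
print(run_folds(data, keep=[0, 1], n_folds=3, clone=True))  # [4, 4, 4]
```

A full copy per fold costs memory, which matches the "memory hogging" remark above; a view that hides columns without copying the data would avoid that cost.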
