Thoughts about memory consumption and FeatureSelection...
I'm running the 32-bit version of RapidMiner 4.6 and trying to do a forward feature selection on a data set with 100 examples and 2000 features. After 5 hours RapidMiner had used 1.4 GB of RAM and stopped with an out-of-memory error :-(
Searching the forum, I found several posts dealing with memory consumption, some suggesting that it might be a bad idea to do feature selection on such a large data set. I then tried a rough calculation of the memory needed:
100 examples * 2000 features * 8 bytes = 1.6 MB
For the first generation the FeatureSelection algorithm will create 2000 individuals. If each one holds the full data set, that makes 3.2 GB, so it's no wonder that I ran out of memory.
But then I realized that this is true for backward feature selection, but not for forward feature selection!
Forward selection starts with a single attribute, so all the individuals of the first generation together only need
100 examples * 1 feature * 8 bytes * 2000 individuals = 1.6 MB!
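To double-check the arithmetic, here is a minimal Python sketch of both estimates. It assumes 8 bytes per value and that every individual stores its own copy of the columns it selects. The names (`generation_bytes` and the constants) are made up for illustration and have nothing to do with RapidMiner's actual code. The backward case uses the same simplification as my estimate above (2000 full-width individuals).

```python
BYTES_PER_VALUE = 8   # assumption: each value is stored as an 8-byte double
EXAMPLES = 100
FEATURES = 2000


def generation_bytes(features_per_individual, individuals, examples=EXAMPLES):
    """Bytes needed if every individual keeps its own copy of its selected columns."""
    return examples * features_per_individual * BYTES_PER_VALUE * individuals


# The data set itself: one "individual" holding all features.
full_data_set = generation_bytes(FEATURES, 1)

# Backward-style estimate: 2000 individuals, each holding (nearly) the full data set.
backward_first_gen = generation_bytes(FEATURES, FEATURES)

# Forward selection: 2000 individuals, each holding a single feature.
forward_first_gen = generation_bytes(1, FEATURES)

print(f"full data set:            {full_data_set / 1e6:.1f} MB")
print(f"backward, 1st generation: {backward_first_gen / 1e9:.1f} GB")
print(f"forward, 1st generation:  {forward_first_gen / 1e6:.1f} MB")
```

So under these assumptions the first generation of a forward selection should need no more memory than the data set itself.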
So now I'm back to square one. Why does forward feature selection need so much memory?
My only guess is that, although it isn't necessary, each individual nevertheless gets a full copy of the data set.
If this is true, the code urgently needs a revision.
Maybe someone can comment on this?