RapidMiner 9.7 is Now Available
Lots of amazing new improvements including true version control! Learn more about what's new here.
Feature Extraction and Cross-Validation of an image dataset
I have a dataset consisting of fMRI images. Each image belongs to one class. The dataset is as follows:
Class 1: 9 images
Class 2: 10 images
Class 3: 6 images
Class 4: 12 images
Each image is 4D (time series), i.e. 90x60x10x350 where 350 is the time dimension (i.e. 350 3D volumes). I want to train a classifier on this data.
Now I want to first extract features and then apply feature selection by applying e.g. PCA and then do clustering, like described in the paper "Principal Feature Analysis: A Multivariate Feature Selection Method for fMRI Data" (http://www.hindawi.com/journals/cmmm/2013/645921/). For feature extraction I see the following possibilities:
1. Each voxel is a feature and the average of each voxels time series
is taken. Each image has exactly one feature vector of dimension 90*60*10 = 54'000
2. Each voxel is a feature and each time point (i.e. each 3D volume) is a data point. Each image has 350 feature vectors of dimension 90*60*10 = 54'000 each.
3. Putting all voxels of the whole time series of an image into one feature vector of
size 90*60*10*350 = 18'900'000. Each image has only one feature vector.
4. Take the the correlation value between the voxels as feature values. But this is
computationally not doable.
I'm preferring 2. but I'm not sure if this is a good idea.
How would you do the feature extraction? And how would a correlation based approach in a computational feasible way work?
Last but not least, how would you do cross-validation on the dataset? The problem is that the different classes are imbalanced.
Thank you so much for the answers beforehand.