Binary image classification - Fine-tuning pretrained models (transfer learning)
I am trying to develop a solution for binary image classification. My training set has two folders: folder 1 contains around 1,000 images and folder 2 around 400 (all images look like a Pac-Man maze, colored, 224×224 pixels). Images in folder 1 are examples of one strategy (strategy 1) and images in folder 2 of a different strategy (strategy 2). I used image augmentation (cropping and rotation) to generate more images for both folders, so folders 1 and 2 now contain 1,900 images each. For the test set, I randomly picked 20 images from each folder.
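For reference, the augmentation I applied is roughly like the NumPy-only sketch below (rotation plus random cropping). The helper name and the choice to zero-pad cropped images back to 224×224 are my own assumptions; the post does not specify how cropped images are brought back to a fixed size.

```python
import numpy as np

def augment(img, rng):
    """Apply one random rotation and one random crop-then-pad to an image.

    `img` is an H x W x C uint8 array (here 224 x 224 x 3). This helper
    is a sketch, not code from any library.
    """
    # Rotation in 90-degree steps keeps the maze grid axis-aligned.
    img = np.rot90(img, k=rng.integers(0, 4))

    # Random crop, then zero-pad back to the original size so every
    # augmented image stays 224 x 224 for the network input.
    h, w, _ = img.shape
    ch, cw = rng.integers(h // 2, h), rng.integers(w // 2, w)
    top, left = rng.integers(0, h - ch + 1), rng.integers(0, w - cw + 1)
    crop = img[top:top + ch, left:left + cw]
    out = np.zeros_like(img)
    out[:ch, :cw] = crop
    return out

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
augmented = [augment(image, rng) for _ in range(4)]
```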
I have attached my solution for this binary classification task. As you can see, I import a pretrained ResNet50 model for transfer learning. To fine-tune it, I freeze everything up to "avg_pool", remove "flatten_1" and "fc1000", and add fully connected, dropout, fully connected, dropout, and output layers, then retrain.
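In Keras terms, the setup I described is roughly the sketch below: a frozen ResNet50 backbone ending at the global average pool, with a new FC → dropout → FC → dropout → output head on top. The layer widths (256/128) and dropout rates are placeholders, not my actual values.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

def build_model(weights="imagenet"):
    """ResNet50 frozen up to avg_pool, plus a new classification head.

    `include_top=False` drops the original flatten/fc1000 layers, and
    `pooling="avg"` keeps the global average pool as the backbone output.
    """
    base = ResNet50(weights=weights, include_top=False,
                    pooling="avg", input_shape=(224, 224, 3))
    base.trainable = False  # freeze everything up to (and incl.) avg_pool

    model = models.Sequential([
        base,
        layers.Dense(256, activation="relu"),   # placeholder width
        layers.Dropout(0.5),                    # placeholder rate
        layers.Dense(128, activation="relu"),   # placeholder width
        layers.Dropout(0.5),                    # placeholder rate
        # single sigmoid unit for the binary strategy-1 / strategy-2 label
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model
```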
The result I get (after training for 14 hours on a CPU with 16 GB RAM) is that the model predicts only one of the two classes for every test image, e.g. always strategy 1 or always strategy 2 (see attached). My questions:
1) Should I change the way I retrain ResNet50 (e.g., freeze or retrain more or fewer layers) so the model can predict both classes correctly? If so, how?
2) When I place the "import existing model" and "fine-tune model" steps inside a train/test split or cross-validation loop, I get errors complaining about the tensor input. How should I revise the solution so the model can be cross-validated?
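For context, the cross-validation pattern I am aiming for is sketched below: a fresh model is built (and the Keras session cleared) inside every fold instead of reusing one compiled model across folds, which is the usual source of such tensor errors. `build_model` stands for whatever model factory is used; `X` and `y` are assumed to be in-memory arrays.

```python
import numpy as np
import tensorflow as tf
from sklearn.model_selection import StratifiedKFold

def cross_validate(build_model, X, y, n_splits=5, epochs=1):
    """Stratified k-fold CV over image array X and binary labels y.

    A new model and a clean TF graph are created per fold, so no tensors
    from a previous fold leak into the next one.
    """
    scores = []
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    for train_idx, val_idx in skf.split(X, y):
        tf.keras.backend.clear_session()   # drop the previous fold's graph
        model = build_model()              # rebuild, never reuse
        model.fit(X[train_idx], y[train_idx], epochs=epochs, verbose=0)
        _, acc = model.evaluate(X[val_idx], y[val_idx], verbose=0)
        scores.append(acc)
    return scores
```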
3) Any suggestions for other pretrained models (e.g., the VGG family) and how to adjust their architecture (which layers to freeze or retrain) to possibly get better results in less time?
PS: I took an image from the Puzzle Nation blog to roughly show what my training images look like.