What is the expected accuracy of the CNN tutorial?

Friedemann · April 2021

I have loaded the tutorial process and fed in data from here:

https://www.kaggle.com/scolianni/mnistasjpg?select=trainingSet

(42000 jpeg images in 10 folders)

The process runs fine (approx. 45 minutes with CPU and about 35 minutes with GPU). If I also use the training set as the test set, I reach an accuracy of only about 9.97% because every image is classified as a 9. Am I doing something wrong, or is there something wrong with the deep learning extension?
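For context, an accuracy this close to 10% is what a model that always outputs the same class would score on a roughly balanced 10-class set. A minimal sketch, assuming made-up, perfectly balanced labels (not the actual label counts of the Kaggle set) and a hypothetical helper name:

```python
from collections import Counter

def constant_prediction_accuracy(labels, predicted_class):
    """Accuracy of a model that outputs predicted_class for every example."""
    if not labels:
        raise ValueError("labels must not be empty")
    return Counter(labels)[predicted_class] / len(labels)

# Hypothetical balanced label set: 10 digits, 100 examples each.
labels = [digit for digit in range(10) for _ in range(100)]
accuracy = constant_prediction_accuracy(labels, predicted_class=9)
print(f"{accuracy:.2%}")  # 10.00%
```

So the number by itself does not say much about the network; it mainly reflects the share of the class it always predicts.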

Btw, classifying the csv version of the images using two fully connected layers (plus output layer) reaches an accuracy of about 97%.

Update: By using the full data set for both training and testing of the non-CNN net, an accuracy of 99.91% is reached (execution time 7:45 min with GPU).

Cheers

Friedemann

Friedemann · April 2021

Sorry, wrong thread! Please ignore!

Question has been answered in another thread:

https://community.rapidminer.com/discussion/58605/error-when-using-convolutional-layer-message-new-shape-length-doesnt-match-original-length#latest

In a nutshell:

You have to specify the input shape manually! The default is set to "automatic". When switching off automatic mode a number of parameters can be set specifying how to map the data onto the tensor.

Important: This approach assumes that the data is stored row-wise in the csv, with a single line per instance. Multi-channel data is represented as a "sequence" of complete instances, one per channel, in the same line; the number of channels is given by the "depth" parameter of the input shape.
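As a rough illustration of that layout (a plain-Python sketch, not the extension's actual code; the function name and the height/width parameter names are my own assumptions), one csv line is cut into "depth" complete channel images, and each of those into rows of pixel values. A length mismatch like the one checked here is presumably what the error message in the linked thread's title refers to:

```python
def row_to_tensor(row, depth, height, width):
    """Map one flat csv line onto a depth x height x width nested list.

    Assumes channels are stored one after another, each as a complete
    row-major image (hypothetical helper, for illustration only).
    """
    expected = depth * height * width
    if len(row) != expected:
        raise ValueError(
            f"row has {len(row)} values, but shape "
            f"({depth}, {height}, {width}) needs {expected}"
        )
    channel_size = height * width
    tensor = []
    for c in range(depth):
        channel = row[c * channel_size:(c + 1) * channel_size]
        tensor.append(
            [channel[r * width:(r + 1) * width] for r in range(height)]
        )
    return tensor

# Hypothetical instance with 2 channels of 2 x 3 values each.
row = list(range(12))
tensor = row_to_tensor(row, depth=2, height=2, width=3)
print(tensor[1][0])  # [6, 7, 8]
```

For the grayscale MNIST csv this would mean depth 1, height 28 and width 28, i.e. 784 values per line.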

lionelderkrikor · April 2021

Hi @Friedemann, Hi dear community,

Several weeks ago, I observed something very similar when performing "Time series classification" with the Deep Learning Extension, using the LSTM layer inside the Deep Learning (Tensor) operator.
The input of the process is a collection of time series, each with a label from one of 6 classes.
Like @Friedemann, I'm using the training set as the test set, and as a result the model systematically predicts one and the same class.

I should add that I performed the same classification task in a Python notebook (using Keras/Tensorflow), and there I get around 60% accuracy...!
The process is in the attached file (it is basically the same process as the one called "ICU mortality classification" in the samples of the Deep Learning folder).
I can share the data on request if you want to reproduce what I observed with this process in order to understand what is going on.

Regards,

Lionel

pschlunder · April 2021

Hi @Friedemann,

thank you for reaching out. As you said, this does not sound correct.

I've tried to reproduce the error, but I couldn't. When I'm testing the tutorial process from the "Add Convolutional Layer" operator with the trainingSample folder (containing 61 samples per label) from the MNIST jpg export you shared, and use this sample both for training and testing, I'm getting an accuracy of around 88%.

Hence I'd like to learn more about your setup:

Which versions of the Deep Learning and the ND4J Back-End extensions are you using?
Can you maybe share the exact process you've used for testing?

Regards,

Philipp

pschlunder · April 2021

Hi @lionelderkrikor,

thank you for reaching out as well.

Can you maybe share the data set with me? You've got a PM regarding the data set. I've checked the process, and besides the normalization being applied too early (which might be a relic from an incorrect old tutorial process), I've not seen any obvious error.

Regards,

Philipp

Friedemann · April 2021

Hi Philipp,

I have attached the process I am using - just the tutorial process with the data directories specified. I am using RM 9.9 rev 0f5626 Platform WIN64 (full version info in the second attachment).

Deep Learning: 1.1.2
ND4J: 1.0.0
Image Handling: 0.2.1

I am using the full training set (42,000 images).

Cheers

Friedemann

Image: https://us.v-cdn.net/6030995/uploads/editor/05/8u91b0akuzq5.jpg

Friedemann · April 2021

Update: Tried the tutorial with the training sample and get an accuracy of 45.33% when using the training data as test data - not promising either.

Noticed that RM uses about 19 GB of main memory when starting (blank process). Switching off CUDA does not really change the picture. Is that intended?

Update: I have disabled all extensions and RM still uses almost 19 GB of main memory.

Image: https://us.v-cdn.net/6030995/uploads/editor/sy/s069ugorvwld.jpg

Friedemann · April 2021

I guess I have found the reason for the problem. By restricting the memory of RM to 12,000 MB and setting the mini batch size to 2,000, the model now reaches an accuracy of 69% (full training set). It seems that it ran out of memory because the JVM allocates almost 19 GB in the beginning and the machine has only 32 GB (a few other things are running in parallel). The system monitor now shows a memory utilization of 9.7 GB for the JVM and 8.4 GB for the "Off-Heap memory" (whatever that means). 5.3 GB of GPU memory is used.
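A quick sanity check of those figures, as a sketch only (the helper name is made up; the numbers are the ones from the system monitor above, and GPU memory is left out because it sits on the graphics card rather than in main memory):

```python
def host_memory_headroom_gb(physical_gb, jvm_heap_gb, off_heap_gb):
    """Main memory left after JVM heap and off-heap usage are subtracted."""
    return physical_gb - (jvm_heap_gb + off_heap_gb)

# 32 GB machine, 9.7 GB JVM, 8.4 GB off-heap, as reported in this post.
headroom = host_memory_headroom_gb(32.0, 9.7, 8.4)
print(f"{headroom:.1f} GB left for everything else")  # 13.9 GB left ...
```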

pschlunder · April 2021

The deep learning extension uses memory outside the JVM (off-heap) for its calculations, that's correct. After setting the JVM memory available for RapidMiner Studio, you can specify the maximum out-of-JVM memory via the RapidMiner Studio settings, using the "Backend" tab:

Image: https://us.v-cdn.net/6030995/uploads/editor/kl/tay5bha0t0fb.png

Friedemann · April 2021

Further tests showed a disappointing result. If I decrease the batch size from 2000 or increase the number of epochs, the accuracy gets worse and I end up with an accuracy of 10% as before. There are no error messages regarding out-of-memory conditions or anything. So, to get back to the original question: What would be the expected accuracy of the tutorial process with the MNIST training set? Frankly, I doubt that the CNN layer works correctly.
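One thing both changes have in common is that they increase the total number of weight updates. A small sketch of that arithmetic, assuming the usual one update per mini-batch (the helper name and the epoch counts are hypothetical; only the 42,000 images and the batch size of 2,000 come from the tests above):

```python
import math

def updates_per_run(num_examples, batch_size, epochs):
    """Total weight updates, assuming one update per mini-batch."""
    return math.ceil(num_examples / batch_size) * epochs

print(updates_per_run(42_000, 2_000, epochs=1))  # 21
print(updates_per_run(42_000, 500, epochs=1))    # 84
print(updates_per_run(42_000, 2_000, epochs=4))  # 84
```

That would be consistent with the accuracy getting worse the longer the net is trained, but it does not explain why.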
