Deeplearning4J: Process Failed Indexes must be same length as array rank

brandjoe · May 2019

Below is the error I get when trying to train my idea (?) of a ConvLSTM artificial neural net. The error pops up once training of the model reaches 100%, before the model can be applied to the testing set.

Process and a portion of the sample data are attached to this post. The config of the sample data is attached as well (screenshot). Thinking about it, I could have included that as an operator into the process... sorry about that.

Any ideas how I could get past this error?

Exception: java.lang.IllegalArgumentException
Message: Indexes must be same length as array rank
Stack trace:

  org.nd4j.linalg.api.shape.Shape.getOffset(Shape.java:646)
  org.nd4j.linalg.api.shape.Shape.getDouble(Shape.java:510)
  org.nd4j.linalg.api.ndarray.BaseNDArray.getDouble(BaseNDArray.java:1804)
  org.nd4j.linalg.api.ndarray.BaseNDArray.getDouble(BaseNDArray.java:4209)
  com.rapidminer.extension.deeplearning.ioobjects.DeepLearningModel.performPrediction(DeepLearningModel.java:170)
  com.rapidminer.operator.learner.PredictionModel.apply(PredictionModel.java:116)
  com.rapidminer.operator.ModelApplier.doWork(ModelApplier.java:134)
  com.rapidminer.operator.Operator.execute(Operator.java:1013)
  com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:77)
  com.rapidminer.operator.ExecutionUnit$2.run(ExecutionUnit.java:812)
  com.rapidminer.operator.ExecutionUnit$2.run(ExecutionUnit.java:807)
  java.security.AccessController.doPrivileged(Native Method)
  com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:807)
  com.rapidminer.extension.concurrency.operator.process_control.loops.AbstractLoopOperator.doIteration(AbstractLoopOperator.java:409)
  com.rapidminer.extension.concurrency.operator.process_control.loops.AbstractLoopOperator.performSynchronizedLoop(AbstractLoopOperator.java:382)
  com.rapidminer.extension.concurrency.operator.process_control.loops.AbstractLoopOperator.doWork(AbstractLoopOperator.java:462)
  com.rapidminer.operator.Operator.execute(Operator.java:1013)
  com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:77)
  com.rapidminer.operator.ExecutionUnit$2.run(ExecutionUnit.java:812)
  com.rapidminer.operator.ExecutionUnit$2.run(ExecutionUnit.java:807)
  java.security.AccessController.doPrivileged(Native Method)
  com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:807)
  com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:423)
  com.rapidminer.operator.Operator.execute(Operator.java:1013)
  com.rapidminer.Process.executeRoot(Process.java:1377)
  com.rapidminer.Process.execute(Process.java:1318)
  com.rapidminer.Process.run(Process.java:1291)
  com.rapidminer.Process.run(Process.java:1177)
  com.rapidminer.Process.run(Process.java:1130)
  com.rapidminer.Process.run(Process.java:1125)
  com.rapidminer.Process.run(Process.java:1115)
  com.rapidminer.gui.ProcessThread.run(ProcessThread.java:65)

varunm1 · May 2019

Hello @brandjoe

I see that the issue occurs at the Apply Model operator, as the input indices of the data do not match the model shapes. I am also a bit curious whether the present Deep Learning operator is suitable for building a mixed network like yours (CNN+LSTM), as this needs fine-grained coding in DL4J.
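To make the message concrete: going by the stack trace, the exception is thrown when a single element is read with a number of indexes that differs from the array's rank. The following is a plain-Python sketch of that kind of check, not ND4J's actual code. The shapes are assumptions for illustration: a feed-forward output of [examples, outputs] is rank 2, while a recurrent output of [examples, outputs, time steps] is rank 3, so a prediction loop that reads with two indexes would fail on the latter.

```python
def get_offset(shape, indexes):
    """Row-major offset of one element, with a rank check like the one in the error."""
    if len(indexes) != len(shape):
        raise ValueError("Indexes must be same length as array rank")
    offset = 0
    for size, idx in zip(shape, indexes):
        if not 0 <= idx < size:
            raise IndexError(f"index {idx} out of bounds for dimension of size {size}")
        offset = offset * size + idx
    return offset

# Assumed shapes, for illustration only.
feed_forward_out = (32, 1)      # [examples, outputs]             -> rank 2
recurrent_out = (32, 1, 7)      # [examples, outputs, time steps] -> rank 3

print(get_offset(feed_forward_out, (5, 0)))    # two indexes on rank 2: fine
print(get_offset(recurrent_out, (5, 0, 6)))    # three indexes on rank 3: fine
try:
    get_offset(recurrent_out, (5, 0))          # two indexes on rank 3: fails
except ValueError as err:
    print(err)
```

If this is what happens here, the fix would lie in the network's final layers (so the output is rank 2) rather than in the data itself, but that is a guess without seeing the process.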

@hughesfleming68 any thoughts on mixed model with deep learning extension?

@pschlunder can suggest more.

Thanks

hughesfleming68 · May 2019

The error is coming from the LSTM layer. Let me see what can be done. To be honest, I don't think it should be done this way. A better choice would be to add a second conv layer + pooling if it needs more complexity. I will take a look more closely tomorrow.

Another issue is that there really isn't enough data to make it worthwhile to use deep learners. This is my personal view.

brandjoe · May 2019

@varunm1 many thanks for chiming in. Same to @hughesfleming68 of course.

Regarding your observation @hughesfleming68, you seem to be correct. In further tests with LSTM layer(s) only, I couldn't get LSTM to work at all.

Regarding data, my real dataset has 12'700 examples with 15 attributes but is of exactly the same structure. It is sensitive data but can be shared in private.
The 12'700 lines are daily sales, 6 shops, 3 product categories each from April 2017 to April 2019. I can get more (~200 shops, ~50 product categories, ~4-5 years back, daily) but preparing the data is difficult, as many attributes have to be added manually.

My final objective is two-fold. Statement: "I, using machine learning/deep learning, ..."

A. arrive at reasonably accurate predictions of expected sales (for the next period, e.g. next week), per shop and per product category, feeding the learner with historical sales, weather, and context data for each specific shop
B. once A is achieved, try to prove that removing vital attributes which classify a specific shop (proximity to a lake (See) or university (Hochschule), together with weather figures, data about lectures taking place, etc.) has a measurable impact on prediction accuracy for shops affected by that attribute.
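Objective B amounts to an ablation test: fit the same model with and without an attribute, then compare the held-out error for the shops that attribute describes. A minimal plain-Python sketch follows. The rows are made-up toy numbers, not the real sales data, and the "model" is just a per-group mean standing in for whichever learner is eventually used.

```python
from collections import defaultdict
from statistics import mean

# Made-up toy rows: (is_lake_shop, is_sunny, units_sold). Not real data.
train = [(1, 1, 90), (1, 1, 110), (1, 0, 40), (1, 0, 44),
         (0, 1, 50), (0, 1, 54), (0, 0, 46), (0, 0, 50)]
test = [(1, 1, 100), (1, 0, 42), (0, 1, 52), (0, 0, 48)]

def fit_group_means(rows, feature_idx):
    """Stand-in model: mean sales for each combination of the chosen features."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[i] for i in feature_idx)].append(row[-1])
    return {key: mean(vals) for key, vals in groups.items()}

def mae(model, rows, feature_idx):
    """Mean absolute error of the group-mean model on held-out rows."""
    return mean(abs(model[tuple(r[i] for i in feature_idx)] - r[-1]) for r in rows)

full = (0, 1)     # lake + weather
ablated = (1,)    # weather only, lake attribute removed

# Evaluate only on the shops the removed attribute applies to.
lake_test = [r for r in test if r[0] == 1]
mae_full = mae(fit_group_means(train, full), lake_test, full)
mae_ablated = mae(fit_group_means(train, ablated), lake_test, ablated)
print(f"lake shops: MAE with lake={mae_full:.1f}, without={mae_ablated:.1f}")
```

On real data the comparison would need proper cross-validation and more than one run before calling a difference "measurable".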

My history why I'm trying the above:

I initially went for random forest which put very high emphasis on the shop itself and thus didn't produce "generic" models applicable to data containing unseen shops, albeit they share the same attributes as shops included in the original training data.
I then went for gradient boosted trees, which worked to some degree, but I don't remember why I gave up on GBT.
I then wanted to do time series analysis, but my data apparently isn't suitable and would need serious restructuring and transformation.
I then stumbled upon the idea of using ANN/deep learning for my problem. From my research (blogs and the like), LSTM was suggested, but LSTM alone would reportedly perform poorly on such a problem; the proposed solution was a combined ConvLSTM, which should outperform anything.
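The generalisation problem noted for the random forest (the model leaning on the shop itself) can be checked directly by validating on shops the model has never seen, with the shop identifier kept out of the feature set. A rough sketch, assuming each row carries a hypothetical `shop` field next to descriptive attributes such as `lake` and `university`; all field names and values are made up:

```python
def leave_one_shop_out(rows, shop_key="shop"):
    """Yield (held_out_shop, train_rows, test_rows), one fold per shop."""
    shops = sorted({row[shop_key] for row in rows})
    for held_out in shops:
        train = [r for r in rows if r[shop_key] != held_out]
        test = [r for r in rows if r[shop_key] == held_out]
        yield held_out, train, test

def features(row, drop=("shop", "sales")):
    """Feature dict without the shop identifier or the target."""
    return {k: v for k, v in row.items() if k not in drop}

# Made-up rows for illustration; field names are assumptions.
rows = [
    {"shop": "A", "lake": 1, "university": 0, "sunny": 1, "sales": 100},
    {"shop": "A", "lake": 1, "university": 0, "sunny": 0, "sales": 42},
    {"shop": "B", "lake": 0, "university": 1, "sunny": 1, "sales": 52},
    {"shop": "C", "lake": 1, "university": 1, "sunny": 0, "sales": 61},
]

for shop, train, test in leave_one_shop_out(rows):
    print(shop, len(train), len(test), features(test[0]))
```

With only six shops each fold is noisy, so the per-shop errors are probably more informative than their average.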

So it was an evolutionary process how I arrived at what I am currently trying to do. Any input is highly appreciated.

hughesfleming68 · May 2019

Hi @brandjoe, looking more closely, I would start with linear models and focus on feature selection. I don't think that you will get satisfactory results trying to approach this problem from the start with either CNNs or LSTMs or some combination of the two.
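As one possible first pass at feature selection, candidate attributes can be ranked by the absolute value of their correlation with sales before any model is fitted. This is only a simple filter, one option among many, and not necessarily the method meant above. The column names and numbers below are invented for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    # A constant column has no defined correlation; treat it as 0.
    return cov / (sx * sy) if sx and sy else 0.0

# Made-up columns for illustration only; names are assumptions.
columns = {
    "temperature": [12, 18, 25, 30, 8, 22],
    "is_weekend":  [0, 0, 1, 1, 0, 1],
    "noise":       [3, 1, 2, 3, 1, 2],
}
sales = [40, 55, 90, 110, 35, 80]

# Strongest absolute correlation first.
ranking = sorted(columns, key=lambda name: abs(pearson(columns[name], sales)), reverse=True)
for name in ranking:
    print(f"{name:12s} r = {pearson(columns[name], sales):+.2f}")
```

Correlation only sees linear, one-at-a-time relationships, so an attribute that matters only in combination with another (lake plus sunshine, say) can rank low here and still be useful.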

regards,

Alex

brandjoe · May 2019

Thank you for your comment.
From what I understand of how linear regression works, I am not sure it will serve my ultimate goal. As a newbie, I am happy to be corrected though.

The products I have selected are alcohol, sausages and crisps/chips, bakery and drinks/softdrinks.
This is because, when setting off with the project, I assumed the former few to correlate with seasonality, weather and lakes on weekends and during holidays; the latter few with universities and semesters, and holidays of course, but less with seasonality.

Of the shops I have selected, 2 are "university" of which one is also "lake", some are "lake", some are neither. Those in big cities show high turnover, others naturally a lot less.

When plotting and shuffling the data around (Excel Pivot Chart) some of the assumptions can indeed be easily confirmed. Some seem a lot less obvious though.

I hoped for deep learning to pick up those factors and differences, something regression models won't ever do. Am I wrong?
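One way to probe that question: a linear model cannot discover on its own that, say, lake shops sell more when it is sunny, but it can represent the effect once an interaction term (lake × sunny) is supplied as an extra feature. The sketch below uses made-up toy rows, not real sales figures. It relies on the fact that for two binary features plus their product, the least-squares fit reproduces the four group means, so the coefficients can be read off directly without a solver.

```python
from statistics import mean

# Made-up toy rows: (lake, sunny, sales). Not real data.
rows = [(0, 0, 46), (0, 0, 50), (0, 1, 50), (0, 1, 54),
        (1, 0, 40), (1, 0, 44), (1, 1, 90), (1, 1, 110)]

def cell_mean(lake, sunny):
    """Mean sales for one lake/sunny combination."""
    return mean(s for l, w, s in rows if l == lake and w == sunny)

# Coefficients of sales ~ b0 + b_lake*lake + b_sunny*sunny + b_inter*lake*sunny,
# derived from the four cell means (valid for this saturated 2x2 design).
b0 = cell_mean(0, 0)
b_lake = cell_mean(1, 0) - b0
b_sunny = cell_mean(0, 1) - b0
b_inter = cell_mean(1, 1) - cell_mean(1, 0) - cell_mean(0, 1) + b0

def predict(lake, sunny):
    """Linear prediction including the hand-built interaction feature."""
    return b0 + b_lake * lake + b_sunny * sunny + b_inter * lake * sunny

print(b0, b_lake, b_sunny, b_inter)
print(predict(1, 1))
```

In this toy data almost all of the "sunny" effect sits in the interaction coefficient, which is the kind of factor the question is about. The catch is that each such interaction has to be built by hand, whereas tree ensembles and neural nets can learn them from the raw attributes.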

I am doing this partly for university, partly as a pet project for my employer.
If my work confirms the assumptions, my company will do a proper project on this, with real data scientists and at scale of course, and try to operationalize machine learning/deep learning to predict very short-term sales (e.g. next 2-3 days). But the predictions should not be based on history alone; we can do that today already with standard SAP. They should be based on the various input factors for the whole shop network, to reduce food waste.

If the sun isn't out this year but always was in previous years, too much will be ordered and go bad in the shops. Seeing beef filet getting thrown away in kilos because our SAP just isn't good enough hurts my head and heart. Do you still feel "pure regression" is the way to go?
