which amazon instance to chose for a "loop in loop" process requiring a huge amount of memory
I have a "loop in loop" process:
- A loop value, with inside a process that loads an example set with 1000 reviews to filter
- the above nested in a loop attribute that loads a a dictionary => dataset of 15 columns that contains all the words to be founded in the reviews. The largest attribute contains 2500 values -rows.
It's impossible to run this process in rapidminer studio that freezes after a while, because of the number of columns that are created by the loop value operator (one column per word for each word of each attribute column of the dictionary: 12660 columns indeed.
I’ve launched first the process in rapidminer AI HUB with an instance r4.xlarge, but crashed, then I tried with a more powerfull one: r4.4xlarge (16 vCPu and 122 GiB memory), but crashed again after few minutes.
Is there a way to define the instance design, in consideration of the number of columns?
thanks in advance for any suggestion