
"Subprocess operator error"

Larry Member Posts: 3 Contributor I
edited May 2019 in Help
I have been using RapidMiner for data mining and SVM classification model development with very good results. I have a multi-core processor and have been able to run many X-Validation operations with multiple threads. However, I am encountering a problem with the Subprocess operator. There are 4 Execute Process operators within my Subprocess operator, and the Subprocess executes successfully every time if the "parallelize nested chain property" is unchecked. But when it is checked, Subprocess execution sometimes (but not always) generates the error "process xxxx does not exist", with xxxx referencing a process that is within one of the 4 Execute Process operators. Any help with this would be appreciated.

Answers

  • haddock Member Posts: 849 Maven
    Hi Larry,

    I found your description rather difficult to understand; it would help if you posted the XML instead.
  • Larry Member Posts: 3 Contributor I
    Haddock,

    Thanks for your reply. I am reluctant to post the code since it is for a financial information system and is proprietary in nature. However, I can provide more info. The Subprocess operator has within it 4 Execute Process operators. Each of the 4 Execute Process operators has a chain of over 100 execute processes. Each process in the chain refers to an actual process, e.g. "GenerateDataXYZ" or "ApplyModelXYZ". The error message is "error accessing repository data" and "process GenerateDataXYZ does not exist", where "GenerateDataXYZ" is a process that accesses the data store.

    I know that the problem is not with the Subprocess operator specifically, since it may also occur if I include the 4 Execute Process operators within a parent Execute Process operator and choose "parallelize main process". The error never occurs if the parallelize option is unchecked. Interestingly, the error occurs only occasionally, about 3 times out of 100 or so executions. When it occurs, it is always at the start of the execution (the first process in a chain of over 100). The first process in each chain is always a "GenerateData" process. However, there are many "GenerateData" processes in each chain that cause no error. Could the error be caused by 2 different "GenerateData" processes simultaneously accessing the data store? I believe my next attempt to fix this problem will be to create multiple data stores, 1 for each of the 4 Execute Processes that I want to run in parallel.
  • Larry Member Posts: 3 Contributor I
    PROBLEM SOLVED by adding a Delay operator at the start of each chain of processes. Each of the 4 Delay operators is set to a different value, e.g. 1000 ms, 2000 ms, etc., to ensure that the processes running in parallel start at different times. Now there are no more errors.
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    This seems to be a bug in the implementation. Could you please send the error message as a bug report?

    Greetings,
      Sebastian
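
The workaround Larry describes (a different start delay for each parallel chain, so that the first "GenerateData" step of each chain does not touch the repository at the same moment) can be illustrated outside RapidMiner. The following is only a minimal Python sketch of the idea, not RapidMiner code: the names `staggered_delays` and `run_staggered` are hypothetical, each chain is assumed to be a plain zero-argument callable, and `time.sleep` stands in for the Delay operator.

```python
import threading
import time


def staggered_delays(n_chains, step_ms=1000):
    """Return one distinct start delay in ms per chain: step, 2*step, ..."""
    return [step_ms * (i + 1) for i in range(n_chains)]


def run_staggered(chains, step_ms=1000):
    """Run each chain in its own thread, each after a different delay.

    `chains` is a list of zero-argument callables (hypothetical stand-ins
    for the Execute Process chains). Results are returned in chain order.
    """
    results = [None] * len(chains)

    def worker(index, delay_ms, chain):
        # Plays the role of the Delay operator at the start of a chain.
        time.sleep(delay_ms / 1000.0)
        results[index] = chain()

    delays = staggered_delays(len(chains), step_ms)
    threads = [
        threading.Thread(target=worker, args=(i, delay, chain))
        for i, (delay, chain) in enumerate(zip(delays, chains))
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With four chains and the default step, the delays are 1000, 2000, 3000 and 4000 ms, matching the values in the post. Staggering only makes a collision at start-up less likely; if the underlying cause really is concurrent repository access, it does not rule the collision out, which is consistent with Sebastian's request for a bug report.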