RM Server extensions problem on Azure. Installing extensions on Job Agent?

pari1234 Member Posts: 26 Maven
edited June 2019 in Help

Hi all,

 

I spun up a small VM on Azure with RM Server 8.2 (BYOL) and copied all the extensions I needed from my local RM Studio 8.2 installation into /opt/rmserver/plugins. Then I restarted the VM to finish installing those extensions, as per the documentation, and loaded a process. I made all the necessary path changes to the Retrieve and Store operators and to any other operators that were reading a file or data from my local machine. However, process execution fails because the server still cannot see those extensions or their operators, such as Free Memory and t-SNE, and it throws the "Dummy Operator" error. Am I missing something?

Also, how do I restart only RapidMiner Server and not the whole VM? It's a Linux machine. I tried sudo /etc/init.d/rmserver restart, but all I get is a systemctl prompt that waits for me to type something, and I don't know what.

 

In case it matters, cat /proc/version reports the following for the VM:

 

Linux version 3.10.0-862.2.3.el7.x86_64 (builder@kbuilder.dev.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-28) (GCC) ) #1 SMP Wed May 9 18:05:47 UTC 2018

 

There is a note on the documentation page that says the extensions must also be installed on the RapidMiner Job Agent. How do I do that?

 

Thank you very much.

Best Answer

  • pari1234 Member Posts: 26 Maven
    Solution Accepted

    @sgenzer @jpuente Never mind, everything is running fine now. It seems the issue was with one of the output ports of the t-SNE operator, and that issue still exists, by the way. I had to change my process to take output only from the exa port of the operator; the other port (out) was causing something to blow up in size, resulting in the 500: Internal Server Error or the 'User Error'. Maybe I can start a thread about that? Thanks.

Answers

  • pari1234 Member Posts: 26 Maven

    Okay, so I re-read the manual and saw that the plugins also need to be copied into the /opt/rmserver/extensions directory. I did that and restarted the server, but the problem persists. Thank you.

  • pari1234 Member Posts: 26 Maven

    So it turns out things are a little different in RapidMiner 8.2. The extensions should be copied into /opt/rmserver/plugins, but for a process to execute, those extensions must also be available to the Job Agent. This can be done in one of two ways:

     

    1. Copy all the extensions to /opt/rmserver/job-agent/extensions, or
    2. Edit the config.properties file in /opt/rmserver/job-agent/config, add the line jobagent.container.extensionsDir = /opt/rmserver/plugins (or whatever your plugins directory is), and save the file.
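
    For anyone who would rather script option 2 than edit the file by hand, here is a minimal sketch. The property name and the paths are the ones given above; the function name set_jobagent_extensions_dir is made up for illustration, and the backup step is just a precaution, not something the documentation asks for.

```shell
#!/usr/bin/env bash
# Sketch only: set jobagent.container.extensionsDir in a Job Agent
# config.properties file. The function name is hypothetical.

set_jobagent_extensions_dir() {
    local config_file="$1"   # e.g. /opt/rmserver/job-agent/config/config.properties
    local plugins_dir="$2"   # e.g. /opt/rmserver/plugins
    local key="jobagent.container.extensionsDir"
    local tmp

    # Keep a backup next to the original before changing anything.
    cp -p "$config_file" "$config_file.bak" || return 1
    tmp="$(mktemp)" || return 1

    # Drop any existing (uncommented) line for this key, then append the
    # new value, so running the function twice does not duplicate the line.
    grep -v "^[[:space:]]*${key}[[:space:]]*=" "$config_file" > "$tmp"
    printf '%s = %s\n' "$key" "$plugins_dir" >> "$tmp"

    # Write back through redirection so the file keeps its owner and mode.
    cat "$tmp" > "$config_file" && rm -f "$tmp"
}
```

    On a layout like mine it would be called as set_jobagent_extensions_dir /opt/rmserver/job-agent/config/config.properties /opt/rmserver/plugins, most likely with sudo, and I would expect a server restart to be needed afterwards, as with the plugin copy.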

    Option 2 is the better method, as you don't have to duplicate all your extensions into another directory. That said, I don't understand why this simple setting doesn't come pre-configured with the server; users could change it later if they like. My process now executes all the way until the end, when the results need to be stored on disk. That's when I get 500: Internal Server Error.

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    tagging @jpuente

     

     

  • jpuente Employee, Member Posts: 53 RM Product Management

    Hi,

    Can we see the actual log from the job details page? If it's just the store that fails, we can assume that the extensions have been correctly loaded. Might it be a problem with the path? Is it a relative path?

     

  • pari1234 Member Posts: 26 Maven

    Hello and thank you @jpuente and @sgenzer.

     

    Attached is a copy-paste of the log taken from Executions -> View details on the server. It looks like the extensions issue is solved; I just didn't want to start another thread for the 500 error. I created three folders in root: Data, Processes, and Results, and I'm storing the results under /Results/<project-name>/<result-name>. Thank you.

    log.txt 136.3K
  • jpuente Employee, Member Posts: 53 RM Product Management

    Hmm. It says it cannot upload the file. Is it very big? Could it fill the filesystem or something like that?
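
    One rough way to check how full the filesystem is, as a sketch only: the helper name fs_usage_percent is hypothetical, and it assumes the repository data sits on the same filesystem as the path you pass in, which may not be true for every install.

```shell
#!/usr/bin/env bash
# Sketch only: print how full the filesystem holding a given path is.
# The function name is hypothetical.

fs_usage_percent() {
    local path="$1"
    # -P requests the POSIX output format (one line per filesystem),
    # so the fifth field of the second row is the capacity, e.g. "45%".
    df -P "$path" | awk 'NR==2 { print $5 }'
}
```

    For example, fs_usage_percent /opt/rmserver before and after a run would show whether the store step is filling the disk.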

  • pari1234 Member Posts: 26 Maven

    Thank you @jpuente. I ran the process locally just now and stored the results on my disk. Below is a screenshot of the size of the results in my local repository.

     

    image.png

    And this is a screenshot of my VM disk space -

    image.png

  • pari1234 Member Posts: 26 Maven

    @jpuente, here's the process and attached is the data. Thank you.

    <?xml version="1.0" encoding="UTF-8"?><process version="8.2.000">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.2.000" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="8.2.000" expanded="true" height="68" name="Retrieve landscaping tweets 10k" width="90" x="45" y="34">
    <parameter key="repository_entry" value="/Data/landscaping tweets 10k"/>
    </operator>
    <operator activated="true" class="sample" compatibility="8.2.000" expanded="true" height="82" name="Sample" width="90" x="112" y="85">
    <parameter key="sample_size" value="2000"/>
    <list key="sample_size_per_class"/>
    <list key="sample_ratio_per_class"/>
    <list key="sample_probability_per_class"/>
    </operator>
    <operator activated="true" class="text:process_document_from_data" compatibility="8.1.000" expanded="true" height="82" name="Process Documents from Data" width="90" x="246" y="34">
    <parameter key="prune_method" value="absolute"/>
    <parameter key="prune_below_percent" value="50.0"/>
    <parameter key="prune_above_percent" value="90.0"/>
    <parameter key="prune_below_absolute" value="2"/>
    <parameter key="prune_above_absolute" value="9999"/>
    <list key="specify_weights"/>
    <process expanded="true">
    <operator activated="true" class="text:transform_cases" compatibility="8.1.000" expanded="true" height="68" name="Transform Cases" width="90" x="45" y="34"/>
    <operator activated="true" class="text:tokenize" compatibility="8.1.000" expanded="true" height="68" name="Tokenize" width="90" x="179" y="34">
    <parameter key="mode" value="specify characters"/>
    <parameter key="characters" value="' ','.',',','?','!',':',';','&quot;','&lt;','&gt;','&amp;','*','$','(',')','-','+','='"/>
    </operator>
    <operator activated="false" class="text:filter_stopwords_dictionary" compatibility="8.1.000" expanded="true" height="82" name="Filter Stopwords (Dictionary)" width="90" x="313" y="238">
    <parameter key="file" value="C:\Users\Pari\Documents\t-SNE\New Text Document.txt"/>
    </operator>
    <operator activated="true" class="text:filter_stopwords_english" compatibility="8.1.000" expanded="true" height="68" name="Filter Stopwords (English)" width="90" x="514" y="34"/>
    <operator activated="true" class="text:filter_by_length" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (by Length)" width="90" x="648" y="34">
    <parameter key="min_chars" value="3"/>
    </operator>
    <operator activated="false" class="text:generate_n_grams_terms" compatibility="8.1.000" expanded="true" height="68" name="Generate n-Grams (Terms)" width="90" x="581" y="187"/>
    <connect from_port="document" to_op="Transform Cases" to_port="document"/>
    <connect from_op="Transform Cases" from_port="document" to_op="Tokenize" to_port="document"/>
    <connect from_op="Tokenize" from_port="document" to_op="Filter Stopwords (English)" to_port="document"/>
    <connect from_op="Filter Stopwords (English)" from_port="document" to_op="Filter Tokens (by Length)" to_port="document"/>
    <connect from_op="Filter Tokens (by Length)" from_port="document" to_port="document 1"/>
    <portSpacing port="source_document" spacing="0"/>
    <portSpacing port="sink_document 1" spacing="0"/>
    <portSpacing port="sink_document 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="rapidprom:rprom_free_memory" compatibility="4.0.001" expanded="true" height="82" name="Free Memory (RapidProM)" width="90" x="380" y="34"/>
    <operator activated="true" class="select_attributes" compatibility="8.2.000" expanded="true" height="82" name="Select Attributes" width="90" x="514" y="34">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attributes" value="Created-At|From-User|From-User-Id|Geo-Location-Latitude|Geo-Location-Longitude|Language|Retweet-Count|Source|To-User|To-User-Id|text"/>
    <parameter key="invert_selection" value="true"/>
    </operator>
    <operator activated="true" class="transpose" compatibility="8.2.000" expanded="true" height="82" name="Transpose" width="90" x="648" y="34"/>
    <operator activated="true" class="rapidprom:rprom_free_memory" compatibility="4.0.001" expanded="true" height="82" name="Free Memory (2)" width="90" x="782" y="34"/>
    <operator activated="true" class="operator_toolbox:tsne" compatibility="1.2.000" expanded="true" height="82" name="t-SNE" width="90" x="916" y="34">
    <parameter key="include_special_attributes" value="true"/>
    </operator>
    <operator activated="true" class="store" compatibility="8.2.000" expanded="true" height="68" name="Store (3)" width="90" x="1318" y="238">
    <parameter key="repository_entry" value="/Results/t-SNE/t-SNE results"/>
    </operator>
    <operator activated="true" class="concurrency:k_means" compatibility="8.2.000" expanded="true" height="82" name="Clustering" width="90" x="1050" y="34">
    <parameter key="k" value="10"/>
    </operator>
    <operator activated="true" class="model_simulator:cluster_model_visualizer" compatibility="8.2.000" expanded="true" height="82" name="Cluster Model Visualizer" width="90" x="1184" y="34"/>
    <operator activated="true" class="store" compatibility="8.2.000" expanded="true" height="68" name="Store (2)" width="90" x="1318" y="136">
    <parameter key="repository_entry" value="/Results/t-SNE/cluster model"/>
    </operator>
    <operator activated="true" class="store" compatibility="8.2.000" expanded="true" height="68" name="Store" width="90" x="1318" y="34">
    <parameter key="repository_entry" value="/t-SNE/cluster model visualizer"/>
    </operator>
    <connect from_op="Retrieve landscaping tweets 10k" from_port="output" to_op="Sample" to_port="example set input"/>
    <connect from_op="Sample" from_port="example set output" to_op="Process Documents from Data" to_port="example set"/>
    <connect from_op="Process Documents from Data" from_port="example set" to_op="Free Memory (RapidProM)" to_port="through 1"/>
    <connect from_op="Free Memory (RapidProM)" from_port="through 1" to_op="Select Attributes" to_port="example set input"/>
    <connect from_op="Select Attributes" from_port="example set output" to_op="Transpose" to_port="example set input"/>
    <connect from_op="Transpose" from_port="example set output" to_op="Free Memory (2)" to_port="through 1"/>
    <connect from_op="Free Memory (2)" from_port="through 1" to_op="t-SNE" to_port="exa"/>
    <connect from_op="t-SNE" from_port="exa" to_op="Clustering" to_port="example set"/>
    <connect from_op="t-SNE" from_port="out" to_op="Store (3)" to_port="input"/>
    <connect from_op="Store (3)" from_port="through" to_port="result 3"/>
    <connect from_op="Clustering" from_port="cluster model" to_op="Cluster Model Visualizer" to_port="model"/>
    <connect from_op="Clustering" from_port="clustered set" to_op="Cluster Model Visualizer" to_port="clustered data"/>
    <connect from_op="Cluster Model Visualizer" from_port="visualizer output" to_op="Store" to_port="input"/>
    <connect from_op="Cluster Model Visualizer" from_port="model output" to_op="Store (2)" to_port="input"/>
    <connect from_op="Store (2)" from_port="through" to_port="result 2"/>
    <connect from_op="Store" from_port="through" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    <portSpacing port="sink_result 4" spacing="0"/>
    </process>
    </operator>
    </process>

     

  • jpuente Employee, Member Posts: 53 RM Product Management

    Hi. It doesn't look like the problem is in the process. How big is the VM you're using? What type is it? Would you be able to try it in a different Server?

  • pari1234 Member Posts: 26 Maven

    @jpuente I posted a snapshot of my VM in the post before the process. It has 2 cores, 8 GB RAM, and a 32 GB SSD. It's a Linux machine. Some of the details are in the previous posts on this thread.
