
"automate rapidminer process in java program"

datasunny Member Posts: 11 Contributor II
edited June 2019 in Help
Hi all,

I used RapidMiner 5.1.011 to design a text classification process, and it works fine in the GUI. Now I'm trying to automate this process from a Java program. I wrote a simple Java program, and when I run it I get the following error:

com.rapidminer.operator.UserError: Could not read file '/home/some_user/wordlist': java.io.IOException: Cannot read from XML stream, wrong format: WordList : WordList.
at com.rapidminer.operator.io.IOObjectReader.read(IOObjectReader.java:100)
at com.rapidminer.operator.io.AbstractReader.doWork(AbstractReader.java:123)
at com.rapidminer.operator.Operator.execute(Operator.java:833)
at com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
at com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:709)
at com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:369)
at com.rapidminer.operator.Operator.execute(Operator.java:833)
at com.rapidminer.Process.run(Process.java:920)
at com.rapidminer.Process.run(Process.java:843)
at com.rapidminer.Process.run(Process.java:802)
at com.rapidminer.Process.run(Process.java:797)
at com.rapidminer.Process.run(Process.java:787)
at ProcessCreator.main(ProcessCreator.java:18)
Caused by: java.io.IOException: Cannot read from XML stream, wrong format: WordList : WordList
at com.rapidminer.tools.XMLSerialization.fromXML(XMLSerialization.java:141)
        ...
Process config XML:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.011">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.011" expanded="true" name="Process">
    <process expanded="true" height="695" width="989">
      <operator activated="true" class="read" compatibility="5.1.011" expanded="true" height="60" name="Read" width="90" x="45" y="210">
        <parameter key="object_file" value="/home/some_user/wordlist"/>
        <parameter key="io_object" value="WordList"/>
      </operator>
      <operator activated="true" class="text:process_document_from_file" compatibility="5.1.003" expanded="true" height="76" name="Process Documents from Files" width="90" x="179" y="210">
        <list key="text_directories">
          <parameter key="porn" value="/home/some_user/test_data/porn"/>
        </list>
        <process expanded="true" height="695" width="989">
          <operator activated="true" class="web:extract_html_text_content" compatibility="5.1.004" expanded="true" height="60" name="Extract Content" width="90" x="45" y="30"/>
          <operator activated="true" class="text:transform_cases" compatibility="5.1.003" expanded="true" height="60" name="Transform Cases" width="90" x="180" y="30"/>
          <operator activated="true" class="text:tokenize" compatibility="5.1.003" expanded="true" height="60" name="Tokenize" width="90" x="315" y="30"/>
          <operator activated="true" class="text:filter_stopwords_english" compatibility="5.1.003" expanded="true" height="60" name="Filter Stopwords (English)" width="90" x="450" y="30"/>
          <operator activated="true" class="text:stem_snowball" compatibility="5.1.003" expanded="true" height="60" name="Stem (Snowball)" width="90" x="585" y="30"/>
          <operator activated="true" class="text:filter_by_length" compatibility="5.1.003" expanded="true" height="60" name="Filter Tokens (by Length)" width="90" x="787" y="30">
            <parameter key="min_chars" value="2"/>
            <parameter key="max_chars" value="99"/>
          </operator>
          <connect from_port="document" to_op="Extract Content" to_port="document"/>
          <connect from_op="Extract Content" from_port="document" to_op="Transform Cases" to_port="document"/>
          <connect from_op="Transform Cases" from_port="document" to_op="Tokenize" to_port="document"/>
          <connect from_op="Tokenize" from_port="document" to_op="Filter Stopwords (English)" to_port="document"/>
          <connect from_op="Filter Stopwords (English)" from_port="document" to_op="Stem (Snowball)" to_port="document"/>
          <connect from_op="Stem (Snowball)" from_port="document" to_op="Filter Tokens (by Length)" to_port="document"/>
          <connect from_op="Filter Tokens (by Length)" from_port="document" to_port="document 1"/>
          <portSpacing port="source_document" spacing="0"/>
          <portSpacing port="sink_document 1" spacing="0"/>
          <portSpacing port="sink_document 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="select_attributes" compatibility="5.1.011" expanded="true" height="76" name="Select Attributes" width="90" x="313" y="210">
        <parameter key="attribute_filter_type" value="no_missing_values"/>
        <parameter key="attributes" value="|label|text"/>
      </operator>
      <operator activated="true" class="read_model" compatibility="5.1.011" expanded="true" height="60" name="Read Model" width="90" x="380" y="30">
        <parameter key="model_file" value="/home/some_user/svm_model"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="5.1.011" expanded="true" height="76" name="Set Role" width="90" x="447" y="210">
        <parameter key="name" value="label"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="5.1.011" expanded="true" height="76" name="Apply Model" width="90" x="581" y="120">
        <list key="application_parameters"/>
      </operator>
      <operator activated="true" class="performance_classification" compatibility="5.1.011" expanded="true" height="76" name="Performance" width="90" x="715" y="120">
        <list key="class_weights"/>
      </operator>
      <connect from_op="Read" from_port="output" to_op="Process Documents from Files" to_port="word list"/>
      <connect from_op="Process Documents from Files" from_port="example set" to_op="Select Attributes" to_port="example set input"/>
      <connect from_op="Select Attributes" from_port="example set output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Read Model" from_port="output" to_op="Apply Model" to_port="model"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
      <connect from_op="Performance" from_port="performance" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Java code:

import com.rapidminer.tools.OperatorService;
import com.rapidminer.RapidMiner;
import com.rapidminer.Process;
import java.io.*;
import java.io.IOException;

public class ProcessCreator {

    public static void main(String[] argv) {

        try {
            RapidMiner.setExecutionMode(com.rapidminer.RapidMiner.ExecutionMode.EMBEDDED_WITHOUT_UI);
            RapidMiner.init();

            Process process = new Process(new File(argv[0]));

            // perform process
            process.run();
        } catch (Exception e) { e.printStackTrace(); }
    }
}
I'd appreciate your help!

Answers

  • datasunny Member Posts: 11 Contributor II
    It looks like I have to specify the mode as "ExecutionMode.COMMAND_LINE" instead of "ExecutionMode.EMBEDDED_WITHOUT_UI", though I still don't know the difference between these two.

    From API doc:
    public static final RapidMiner.ExecutionMode COMMAND_LINE
        RM is executed using RapidMinerCommandLine.main(String[]).

    public static final RapidMiner.ExecutionMode EMBEDDED_WITHOUT_UI
        RM is embedded into another program.

  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,996 RM Engineering
    Hi,

    have a look at the constructor of the ExecutionMode enum:

    private ExecutionMode(boolean isHeadless, boolean canAccessFilesystem, boolean hasMainFrame, boolean loadManagedExtensions) {
    //...
    }
    COMMAND_LINE sets loadManagedExtensions to true, whereas EMBEDDED_WITHOUT_UI sets it to false.
    WordList is an IOObject from the Text Processing Extension, so managed plugins need to be loaded. Otherwise, a process using anything from these plugins will fail (as yours did).

    Regards,
    Marco
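
The reasoning in Marco's answer can be illustrated with a small stand-alone sketch. It does not use the RapidMiner API: `Mode` is a hypothetical stand-in for `RapidMiner.ExecutionMode` that models only the `loadManagedExtensions` flag, with the two values taken from the answer above. The other constructor flags are left out because the thread does not state their values.

```java
public class ExecutionModeSketch {

    /** Hypothetical stand-in for RapidMiner.ExecutionMode; models only one flag. */
    enum Mode {
        COMMAND_LINE(true),          // per the answer above: managed extensions are loaded
        EMBEDDED_WITHOUT_UI(false);  // per the answer above: managed extensions are not loaded

        private final boolean loadManagedExtensions;

        Mode(boolean loadManagedExtensions) {
            this.loadManagedExtensions = loadManagedExtensions;
        }

        boolean loadsManagedExtensions() {
            return loadManagedExtensions;
        }
    }

    /**
     * Returns true if a process that uses extension objects (such as WordList
     * from the Text Processing Extension) can be expected to load in this mode.
     */
    static boolean canUseExtensionObjects(Mode mode) {
        return mode.loadsManagedExtensions();
    }

    public static void main(String[] args) {
        // Print the outcome for each mode modelled in this sketch.
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> extension objects usable: "
                    + canUseExtensionObjects(mode));
        }
    }
}
```

In the original program, this corresponds to passing `ExecutionMode.COMMAND_LINE` to `RapidMiner.setExecutionMode(...)` before `RapidMiner.init()`, as datasunny reported.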
  • nawafpower Member Posts: 34 Contributor II
    If I have a model built using RapidMiner and need to check the exact Java code behind it, for example, which file implements the Porter stemmer, can I do that?
  • nawafpower Member Posts: 34 Contributor II
    Thanks, haddock. I went to the com.rapidminer.operator.text package and it was empty. Maybe I missed something when I followed the run configuration steps to make RM run through Eclipse; other packages also show as empty. Can anybody tell me how to get the contents of these packages? Thanks
  • RapidQues Member Posts: 8 Contributor II
    How did you pass the process in through the command line? Please let me know.
  • RapidQues Member Posts: 8 Contributor II
    Also, what do I pass as the command-line argument to the program?
  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,996 RM Engineering
    Hi,

    for RapidMiner Studio 6, call the rapidminer-batch.bat/rapidminer-batch.sh file and pass the repository location, e.g. "//Local Repository/My folder/my_process". Alternatively, you can pass "-f" followed by a space and then the path to the .rmp file on the hard drive, e.g. -f C:\Users\xyz\.RapidMiner\repositories\Local Repository\Process.rmp

    Regards,
    Marco
  • RapidQues Member Posts: 8 Contributor II
    Thank you :)
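
Marco's batch-script suggestion above could also be driven from a Java program. The following is a minimal sketch using only the standard library's `ProcessBuilder`; the class name `BatchLauncher` and its helper methods are hypothetical, and the script path and process location are supplied by the caller rather than assumed. Passing each argument as its own list element means paths containing spaces (such as "Local Repository") should not need manual quoting.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchLauncher {

    /** Builds the command for running a process file: <script> -f <path to .rmp file>. */
    static List<String> buildFileCommand(String scriptPath, String processFile) {
        List<String> command = new ArrayList<>();
        command.add(scriptPath);
        command.add("-f");
        command.add(processFile);
        return command;
    }

    /** Builds the command for running a repository entry: <script> <repository location>. */
    static List<String> buildRepositoryCommand(String scriptPath, String repositoryLocation) {
        List<String> command = new ArrayList<>();
        command.add(scriptPath);
        command.add(repositoryLocation);
        return command;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: BatchLauncher <path to rapidminer-batch script> <path to .rmp file>");
            return;
        }
        List<String> command = buildFileCommand(args[0], args[1]);

        // Start the script as a child process and forward its output to this console.
        java.lang.Process child = new ProcessBuilder(command).inheritIO().start();
        int exitCode = child.waitFor();
        System.out.println("batch script exited with code " + exitCode);
    }
}
```

This runs RapidMiner in a separate JVM, so the calling program does not need to set an execution mode itself. The trade-off is that results come back only through whatever the process writes to files or the console.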