"automate rapidminer process in java program"
Hi all,
I used RapidMiner 5.1.011 to design a text classification process, and it works fine in the GUI. Now I'm trying to automate this process in a Java program. I wrote a simple Java program, and when I run it I get the following errors:
com.rapidminer.operator.UserError: Could not read file '/home/some_user/wordlist': java.io.IOException: Cannot read from XML stream, wrong format: WordList : WordList.
at com.rapidminer.operator.io.IOObjectReader.read(IOObjectReader.java:100)
at com.rapidminer.operator.io.AbstractReader.doWork(AbstractReader.java:123)
at com.rapidminer.operator.Operator.execute(Operator.java:833)
at com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
at com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:709)
at com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:369)
at com.rapidminer.operator.Operator.execute(Operator.java:833)
at com.rapidminer.Process.run(Process.java:920)
at com.rapidminer.Process.run(Process.java:843)
at com.rapidminer.Process.run(Process.java:802)
at com.rapidminer.Process.run(Process.java:797)
at com.rapidminer.Process.run(Process.java:787)
at ProcessCreator.main(ProcessCreator.java:18)
Caused by: java.io.IOException: Cannot read from XML stream, wrong format: WordList : WordList
at com.rapidminer.tools.XMLSerialization.fromXML(XMLSerialization.java:141)
...
Process config XML:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.011">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.1.011" expanded="true" name="Process">
<process expanded="true" height="695" width="989">
<operator activated="true" class="read" compatibility="5.1.011" expanded="true" height="60" name="Read" width="90" x="45" y="210">
<parameter key="object_file" value="/home/some_user/wordlist"/>
<parameter key="io_object" value="WordList"/>
</operator>
<operator activated="true" class="text:process_document_from_file" compatibility="5.1.003" expanded="true" height="76" name="Process Documents from Files" width="90" x="179" y="210">
<list key="text_directories">
<parameter key="porn" value="/home/some_user/test_data/porn"/>
</list>
<process expanded="true" height="695" width="989">
<operator activated="true" class="web:extract_html_text_content" compatibility="5.1.004" expanded="true" height="60" name="Extract Content" width="90" x="45" y="30"/>
<operator activated="true" class="text:transform_cases" compatibility="5.1.003" expanded="true" height="60" name="Transform Cases" width="90" x="180" y="30"/>
<operator activated="true" class="text:tokenize" compatibility="5.1.003" expanded="true" height="60" name="Tokenize" width="90" x="315" y="30"/>
<operator activated="true" class="text:filter_stopwords_english" compatibility="5.1.003" expanded="true" height="60" name="Filter Stopwords (English)" width="90" x="450" y="30"/>
<operator activated="true" class="text:stem_snowball" compatibility="5.1.003" expanded="true" height="60" name="Stem (Snowball)" width="90" x="585" y="30"/>
<operator activated="true" class="text:filter_by_length" compatibility="5.1.003" expanded="true" height="60" name="Filter Tokens (by Length)" width="90" x="787" y="30">
<parameter key="min_chars" value="2"/>
<parameter key="max_chars" value="99"/>
</operator>
<connect from_port="document" to_op="Extract Content" to_port="document"/>
<connect from_op="Extract Content" from_port="document" to_op="Transform Cases" to_port="document"/>
<connect from_op="Transform Cases" from_port="document" to_op="Tokenize" to_port="document"/>
<connect from_op="Tokenize" from_port="document" to_op="Filter Stopwords (English)" to_port="document"/>
<connect from_op="Filter Stopwords (English)" from_port="document" to_op="Stem (Snowball)" to_port="document"/>
<connect from_op="Stem (Snowball)" from_port="document" to_op="Filter Tokens (by Length)" to_port="document"/>
<connect from_op="Filter Tokens (by Length)" from_port="document" to_port="document 1"/>
<portSpacing port="source_document" spacing="0"/>
<portSpacing port="sink_document 1" spacing="0"/>
<portSpacing port="sink_document 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="select_attributes" compatibility="5.1.011" expanded="true" height="76" name="Select Attributes" width="90" x="313" y="210">
<parameter key="attribute_filter_type" value="no_missing_values"/>
<parameter key="attributes" value="|label|text"/>
</operator>
<operator activated="true" class="read_model" compatibility="5.1.011" expanded="true" height="60" name="Read Model" width="90" x="380" y="30">
<parameter key="model_file" value="/home/some_user/svm_model"/>
</operator>
<operator activated="true" class="set_role" compatibility="5.1.011" expanded="true" height="76" name="Set Role" width="90" x="447" y="210">
<parameter key="name" value="label"/>
<parameter key="target_role" value="label"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="apply_model" compatibility="5.1.011" expanded="true" height="76" name="Apply Model" width="90" x="581" y="120">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance_classification" compatibility="5.1.011" expanded="true" height="76" name="Performance" width="90" x="715" y="120">
<list key="class_weights"/>
</operator>
<connect from_op="Read" from_port="output" to_op="Process Documents from Files" to_port="word list"/>
<connect from_op="Process Documents from Files" from_port="example set" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Read Model" from_port="output" to_op="Apply Model" to_port="model"/>
<connect from_op="Set Role" from_port="example set output" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
I would appreciate your help! The Java code is:
import com.rapidminer.tools.OperatorService;
import com.rapidminer.RapidMiner;
import com.rapidminer.Process;
import java.io.*;
import java.io.IOException;

public class ProcessCreator {
    public static void main(String[] argv) {
        try {
            RapidMiner.setExecutionMode(com.rapidminer.RapidMiner.ExecutionMode.EMBEDDED_WITHOUT_UI);
            RapidMiner.init();
            Process process = new Process(new File(argv[0]));
            // perform process
            process.run();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Answers
From API doc:
public static final RapidMiner.ExecutionMode COMMAND_LINE
RM is executed using RapidMinerCommandLine.main(String[]).
public static final RapidMiner.ExecutionMode EMBEDDED_WITHOUT_UI
RM is embedded into another program.
Have a look at the constructor of the ExecutionMode enum: COMMAND_LINE sets loadManagedExtensions to true, whereas EMBEDDED_WITHOUT_UI sets it to false.
WordList is an IOObject from the Text Processing Extension, so managed plugins need to be loaded. Otherwise, a process that uses anything from these plugins will fail (as yours did).
Regards,
Marco
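To make the reasoning above concrete, here is a small toy model. It is not RapidMiner source code: the class ExtensionLoadingSketch, the Mode enum, and the knownIOObjects registry are all hypothetical, and only the two mode names and the loadManagedExtensions flag come from the answer above. It just shows why a mode that skips managed extensions cannot resolve an extension type such as WordList. For the program in the question, this suggests passing COMMAND_LINE instead of EMBEDDED_WITHOUT_UI to RapidMiner.setExecutionMode, which is an inference from the answer and has not been verified here.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: NOT RapidMiner source code. All names below are hypothetical
// except the two mode names and the loadManagedExtensions flag.
final class ExtensionLoadingSketch {

    // Mirrors the two modes quoted above; the boolean plays the role of
    // the loadManagedExtensions constructor argument.
    enum Mode {
        COMMAND_LINE(true),
        EMBEDDED_WITHOUT_UI(false);

        private final boolean loadManagedExtensions;

        Mode(boolean loadManagedExtensions) {
            this.loadManagedExtensions = loadManagedExtensions;
        }

        boolean loadsManagedExtensions() {
            return loadManagedExtensions;
        }
    }

    // Hypothetical registry of IOObject type names available after start-up.
    static Set<String> knownIOObjects(Mode mode) {
        Set<String> known = new HashSet<>();
        known.add("ExampleSet"); // stands in for a core type that is always available
        if (mode.loadsManagedExtensions()) {
            known.add("WordList"); // stands in for a type contributed by an extension
        }
        return known;
    }

    // True if an object of the given type could be deserialized in this mode.
    static boolean canRead(Mode mode, String ioObject) {
        return knownIOObjects(mode).contains(ioObject);
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " can read WordList: " + canRead(mode, "WordList"));
        }
    }
}
```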
Yes, the source is here, in the Plugins directory of the SVN repository:
https://rapidminer.svn.sourceforge.net/svnroot/rapidminer/Plugins/TextProcessing/Vega/src/com/rapidminer/operator/text/io/stemmer/PorterStemming.java
For RapidMiner Studio 6, call the rapidminer-batch.bat or rapidminer-batch.sh script and pass the repository location of the process, e.g. "//Local Repository/My folder/my_process". Alternatively, you can pass "-f" as a parameter, followed by a space and the path to the .rmp file on the hard drive, e.g. -f C:\Users\xyz\.RapidMiner\repositories\Local Repository\Process.rmp
Regards,
Marco
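If you want to start the batch script from a Java program, the two call styles above could be assembled roughly like this. This is only a sketch: the class BatchCommandSketch and its method names are made up for illustration, and "/path/to/Process.rmp" is a placeholder. Only the script name, the repository-location argument, and the "-f" flag come from the post above. The actual launch is left as a comment because it needs a RapidMiner installation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that only builds the argument list for rapidminer-batch.
final class BatchCommandSketch {

    // Variant 1: pass a repository location such as "//Local Repository/My folder/my_process".
    static List<String> forRepositoryEntry(String script, String repositoryLocation) {
        List<String> cmd = new ArrayList<>();
        cmd.add(script);
        cmd.add(repositoryLocation);
        return cmd;
    }

    // Variant 2: pass "-f" followed by the path to an .rmp file on disk.
    static List<String> forProcessFile(String script, String rmpPath) {
        List<String> cmd = new ArrayList<>();
        cmd.add(script);
        cmd.add("-f");
        cmd.add(rmpPath);
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = forProcessFile("rapidminer-batch.sh", "/path/to/Process.rmp");
        System.out.println(String.join(" ", cmd));
        // To actually launch it (assumes the script is on the PATH):
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Keeping each argument as a separate list element means paths that contain spaces, such as "Local Repository", reach the script as a single argument when the list is handed to ProcessBuilder.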