Passing a model to an XML process in Java

aws Member Posts: 10 Contributor II
edited November 2018 in Help
Good afternoon,

I would like to write a Java program that loads a trained classification model and test data from disk and applies the model to the data. I am almost done; the following code works fine:

import java.io.*;
import java.util.*;

import com.rapidminer.Process;
import com.rapidminer.RapidMiner;
import com.rapidminer.RapidMiner.ExecutionMode;
import com.rapidminer.example.*;
import com.rapidminer.example.set.*;
import com.rapidminer.example.table.*;
import com.rapidminer.operator.*;
import com.rapidminer.tools.*;

public class ApplyModelNew {

    public static void main(String[] args) {
        try {
            // reduce the amount of log messages - not sure about the effect
            LogService.getGlobal().setVerbosityLevel(LogService.MINIMUM);

            // initialize - possible to switch off certain resources?
            RapidMiner.setExecutionMode(ExecutionMode.COMMAND_LINE);
            RapidMiner.init();

            // load the process definition from its XML file
            Process process = new Process(new File("apply_model.rmp"));

            // the data to predict is written to an ExampleSet (extends IOObject)
            // in createData() and stored in a list
            LinkedList<IOObject> linkedList = new LinkedList<IOObject>();
            linkedList.add(createData());

            // the list is wrapped in an IOContainer and passed to the process;
            // the return value is also an IOContainer
            IOContainer resultContainer = process.run(new IOContainer(linkedList));

            // take the first IOObject of type SimpleExampleSet from the container
            // (it contains only one element anyway) and, from that example set,
            // take the first Example (= first row)
            Example resultExample = resultContainer.get(SimpleExampleSet.class)
                    .getExample(0);

            // print the value in the Example's "prediction" column
            System.out.println(resultExample.getPredictedLabel());

        } catch (IOException e) {
            System.out.println("Error: " + e);
        } catch (OperatorException e) {
            System.out.println("Error: " + e);
        } catch (XMLException e) {
            System.out.println("Error: " + e);
        }
    }

    // createData() is not shown here
}
The createData() method reads the test data and returns an ExampleSet. The process I load is simply:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.003">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.2.003" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true" height="676" width="1169">
      <operator activated="true" class="read_model" compatibility="5.2.003" expanded="true" height="60" name="Read Model" width="90" x="246" y="30">
        <parameter key="model_file" value="svm.model"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="5.2.003" expanded="true" height="76" name="Apply Model" width="90" x="447" y="165">
        <list key="application_parameters"/>
        <parameter key="create_view" value="false"/>
      </operator>
      <connect from_port="input 1" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Read Model" from_port="output" to_op="Apply Model" to_port="model"/>
      <connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="source_input 2" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
However, as you may have noticed, my model is hard-coded in the XML file. I would like to change this. My new process, which takes the model as its first input and the data as its second, looks as follows:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.017">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.017" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <parameter key="parallelize_main_process" value="false"/>
    <process expanded="true" height="676" width="1169">
      <operator activated="true" class="apply_model" compatibility="5.1.017" expanded="true" height="76" name="Apply Model" width="90" x="514" y="30">
        <list key="application_parameters"/>
        <parameter key="create_view" value="false"/>
      </operator>
      <connect from_port="input 1" to_op="Apply Model" to_port="model"/>
      <connect from_port="input 2" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="source_input 2" spacing="0"/>
      <portSpacing port="source_input 3" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
I am wondering how to change the Java program to pass the model to the process. Here's my best guess:

import java.io.*;
import java.util.*;

import com.rapidminer.Process;
import com.rapidminer.RapidMiner;
import com.rapidminer.RapidMiner.ExecutionMode;
import com.rapidminer.example.*;
import com.rapidminer.example.set.*;
import com.rapidminer.example.table.*;
import com.rapidminer.operator.*;
import com.rapidminer.operator.io.ModelLoader;
import com.rapidminer.tools.*;

public class ApplyModelNew2 {

    public static void main(String[] args) {
        try {
            // reduce the amount of log messages - not sure about the effect
            LogService.getGlobal().setVerbosityLevel(LogService.MINIMUM);

            // initialize - possible to switch off certain resources?
            RapidMiner.setExecutionMode(ExecutionMode.COMMAND_LINE);
            RapidMiner.init();

            // --> new: initialize a ModelLoader
            ModelLoader ml = new ModelLoader(new OperatorDescription(???));

            // load the process definition from its XML file
            Process process = new Process(new File("apply_model2.rmp"));

            // the data to predict is written to an ExampleSet (extends IOObject)
            // in createData() and stored in a list
            LinkedList<IOObject> linkedList = new LinkedList<IOObject>();
            linkedList.add(ml.read()); // --> new: give model to process
            linkedList.add(createData());

            // the list is wrapped in an IOContainer and passed to the process;
            // the return value is also an IOContainer
            IOContainer resultContainer = process.run(new IOContainer(linkedList));

            // take the first IOObject of type SimpleExampleSet from the container
            // (it contains only one element anyway) and, from that example set,
            // take the first Example (= first row)
            Example resultExample = resultContainer.get(SimpleExampleSet.class)
                    .getExample(0);

            // print the value in the Example's "prediction" column
            System.out.println(resultExample.getPredictedLabel());

        } catch (IOException e) {
            System.out.println("Error: " + e);
        } catch (OperatorException e) {
            System.out.println("Error: " + e);
        } catch (XMLException e) {
            System.out.println("Error: " + e);
        }
    }
}
The execution fails because I do not know how to initialize the OperatorDescription correctly. Logically, I would expect the ModelLoader to need the path and file name of my stored model, but I just don't understand how to pass this information correctly.

I am looking forward to your answers.

Greetings,
  Alex

Answers

  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,993 RM Engineering
    Hi,

    A model is just an IOObject, so you don't need anything special such as ModelLoader. Just retrieve the model (as an IOObject) from the repository and deliver it to the process.

    Regards,
    Marco
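
A note on the "deliver it to the process" step. As far as I can tell, the objects in the IOContainer reach the process input ports in list order, so for apply_model2.rmp the model has to be added to the list before the example set. The sketch below does not use any RapidMiner API; it relies only on the JDK's XML parser to list which process input port feeds which operator port. The class name InputPortOrder and the method describeInputs are made up for illustration, and the sketch assumes a flat process like the one in the question (connections inside nested subprocesses would be listed as well).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of RapidMiner: reads process XML and reports
// which process input port is wired to which operator port.
public class InputPortOrder {

    public static List<String> describeInputs(String processXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(processXml)));

            List<String> result = new ArrayList<String>();
            NodeList connects = doc.getElementsByTagName("connect");
            for (int i = 0; i < connects.getLength(); i++) {
                Element c = (Element) connects.item(i);
                // In the XML from the question, connections that start at a
                // process input port carry no from_op attribute.
                boolean fromProcessInput = !c.hasAttribute("from_op")
                        && c.getAttribute("from_port").startsWith("input ");
                if (fromProcessInput) {
                    result.add(c.getAttribute("from_port") + " -> "
                            + c.getAttribute("to_op") + "."
                            + c.getAttribute("to_port"));
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse process XML", e);
        }
    }

    public static void main(String[] args) {
        // The three <connect> elements of apply_model2.rmp from the question,
        // wrapped in a minimal root element.
        String xml = "<process>"
                + "<connect from_port=\"input 1\" to_op=\"Apply Model\" to_port=\"model\"/>"
                + "<connect from_port=\"input 2\" to_op=\"Apply Model\" to_port=\"unlabelled data\"/>"
                + "<connect from_op=\"Apply Model\" from_port=\"labelled data\" to_port=\"result 1\"/>"
                + "</process>";
        for (String line : describeInputs(xml)) {
            System.out.println(line);
        }
    }
}
```

For the process in the question this prints "input 1 -> Apply Model.model" followed by "input 2 -> Apply Model.unlabelled data", which matches adding the model to the list first and the result of createData() second. For retrieving the model itself from the repository, the RapidMiner 5 classes usually mentioned are RepositoryLocation and IOObjectEntry; their exact signatures should be checked against the version in use.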