RapidMiner

Integrating RapidMiner with Java Application

Contributor II nelze

I'm trying to integrate my RapidMiner process into a Java application. I need to supply the test data dynamically, while the training set is already part of the RapidMiner process. (It's my first time integrating RapidMiner, so I need some help.) When I run the code, I keep getting this error:

[Fatal Error] :1:1: Premature end of file.
Exception in thread "main" java.lang.NullPointerException
   at NaiveClassifier.main(NaiveClassifier.java:34)

Lines 33 and 34 are:
Operator op = process.getOperator("Read Excel");
op.setParameter(CSVExampleSource.PARAMETER_FILENAME, myData);

where myData is just the file path of the test data. I used CSVExampleSource because my file is a .csv file, but I'm not sure I used it correctly. I also couldn't find a list of the values getOperator accepts, so I left it at the default "Read Excel".

When I run the process in rapidminer (with me manually putting the test data in the process), it gives out the classification result.

My test data is unlabeled, so I don't know whether that is what causes the error. Also, in RapidMiner the operator's port is connected to the process input port on the left side, since I assume that is required to input test data dynamically. My test data also consists of many rows, so RapidMiner needs to give me something like an array of classification results.

I based my code on an example that the admin posted in this forum some time ago.

Any help will be appreciated.
27 REPLIES
RM Staff

Re: Integrating RapidMiner with Java Application

Hi,

a couple of hints:

1) When you connect the process input port on the left side of your process to an operator, you can supply input IOObjects by calling process.run(new IOContainer(...)).
2) process.getOperator("operator_name") returns the operator whose name matches the name displayed in RapidMiner. If your operator is called "Read Excel" in the RapidMiner GUI, you can get it this way.
3) Only set parameters defined by the matching operator implementation class. If you are using an Excel operator, ExcelExampleSource.PARAMETER_XYZ is valid; CSVExampleSource.PARAMETER_XYZ is not.
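
To illustrate hint 2: the stack trace in the first post suggests that the lookup by name found nothing and the next call was made on a null reference. Below is a minimal sketch of a guard for that situation. It uses a plain Map as a stand-in for the process, so the class, the method and the map contents are hypothetical and not RapidMiner API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in sketch: a Map plays the role of the process, its keys are operator names.
// None of these names are RapidMiner API.
class OperatorLookupSketch {

    // Returns the entry registered under the given name, or fails with a
    // message listing the names that do exist, instead of returning null.
    static <T> T requireByName(Map<String, T> operators, String name) {
        T found = operators.get(name);
        if (found == null) {
            throw new IllegalArgumentException(
                "No operator named \"" + name + "\"; available: " + operators.keySet());
        }
        return found;
    }

    public static void main(String[] args) {
        Map<String, String> operators = new LinkedHashMap<>();
        operators.put("Read CSV", "csv reader");
        operators.put("Apply Model", "model applier");

        System.out.println(requireByName(operators, "Read CSV"));
        try {
            requireByName(operators, "Read Excel"); // name not present in this stand-in process
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the real application, the same kind of check could wrap the result of process.getOperator(...) before setParameter is called on it, so a wrong operator name fails with a readable message.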

Regards,
Marco
_________________________________________________________
Team Lead Software Engineering | RapidMiner GmbH
Contributor II nelze

Re: Integrating RapidMiner with Java Application

Thank you for the reply!

1) What is the parameter inside IOContainer in process.run(new IOContainer(...))? Is it the test data? When I tried to pass the test data, I got "Premature end of file" followed by a "no absolute path" error.

I am using Read CSV, so I changed the code to match. This is what the process looks like. All settings are at their defaults except the separator, which I changed to a comma.



RM Staff

Re: Integrating RapidMiner with Java Application

Hi,

RapidMiner processes work with IOObjects. This is the interface for the data coming in and out of operator ports. In your specific case, you can remove the connection from the process input port to the Read CSV operator, because the Read CSV operator is capable of reading a .csv file directly from the file system.
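
Since Read CSV reads straight from the file system, it can help to check a user-supplied path before handing it to the operator. Here is a rough sketch using only java.nio.file. The class and method names are made up for illustration, and which checks are worth doing is an assumption on my part.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of RapidMiner.
class CsvPathCheck {

    // Resolves the user input to an absolute path and rejects files that are
    // missing, unreadable or empty. Returns the absolute path as a string.
    static String requireUsableCsv(String userInput) throws IOException {
        Path path = Paths.get(userInput).toAbsolutePath().normalize();
        if (!Files.isRegularFile(path)) {
            throw new IOException("Not a file: " + path);
        }
        if (!Files.isReadable(path)) {
            throw new IOException("Not readable: " + path);
        }
        if (Files.size(path) == 0) {
            throw new IOException("File is empty: " + path);
        }
        return path.toString();
    }

    public static void main(String[] args) throws IOException {
        // Demo on a temporary file so the sketch runs anywhere.
        Path tmp = Files.createTempFile("unlabeled", ".csv");
        Files.write(tmp, "a,b\n1,2\n".getBytes());
        System.out.println(requireUsableCsv(tmp.toString()));
        Files.delete(tmp);
    }
}
```

The returned string is what would then be passed as the file parameter of the Read CSV operator.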

Regards,
Marco
_________________________________________________________
Team Lead Software Engineering | RapidMiner GmbH
Contributor II nelze

Re: Integrating RapidMiner with Java Application

This is what the Java code looks like:

public class NaiveClassifier {

    public static void main(String[] args) {
        try {
            RapidMiner.setExecutionMode(RapidMiner.ExecutionMode.COMMAND_LINE);
            RapidMiner.init();

            Process process = new Process(new File("C:\\Users\\nelze\\.RapidMiner5\\repositories\\WalkingRepo\\NaiveClassify.rmp"));
            Operator op = process.getOperator("Read CSV");
            op.setParameter(CSVExampleSource.PARAMETER_FILENAME, "C:\\Users\\nelze\\unlabeled.csv");
            RepositoryLocation loc = new RepositoryLocation("C:\\Users\\nelze\\unlabeled.csv");
            IOObjectEntry entry = (IOObjectEntry) loc.locateEntry();
            IOObject myIOObject = entry.retrieveData(null);
            IOContainer ioInput = new IOContainer(new IOObject[] {myIOObject});
            IOContainer ioResult = process.run(ioInput);
            ExampleSet resultSet1 = (ExampleSet) ioResult.getElementAt(0);
            System.out.println(resultSet1);
        } catch (IOException | XMLException | OperatorException ex) {
            ex.printStackTrace();
        } catch (RepositoryException e) {
            e.printStackTrace();
        }
    }
}


I'm not sure whether what I passed to RepositoryLocation is correct (I just hardcoded the file path of the test data for now).
Contributor II nelze

Re: Integrating RapidMiner with Java Application

So in the RapidMiner process, should I completely remove the connection going from inp to Read CSV?
RM Staff

Re: Integrating RapidMiner with Java Application

Hi,

No, remove only the connection between the process input port and the Read CSV operator; leave the others as they are. Also, your code will not work because you do this:
RepositoryLocation loc = new RepositoryLocation("C:\\Users\\nelze\\unlabeled.csv");
which is not possible. A RepositoryLocation is a location in the RapidMiner repository, not a file on your hard disk. You need to configure your "Read CSV" operator so that the process works once com.rapidminer.operator.nio.CSVExampleSource.PARAMETER_CSV_FILE is set to the desired .csv file on your hard disk.

Put simply: "Read xyz" operators read data from an outside source (hard disk, database, web, ...) into the internal format (IOObject) used by RapidMiner. Usually this will be an ExampleSet, at least for data that can be represented in a table-like structure. Most operators then work with this internal data representation. At the end of the process you either save the results inside a repository (if you want to reuse them in another RapidMiner process), or you use a "Write xyz" operator, which transforms the data from the internal representation back into something of the outside world (e.g. stores it on the hard disk, in a database, ...).
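
The read / internal table / write flow described above can be mimicked in plain Java, which may make the idea easier to picture. This is only an analogy under stated assumptions: the class and method names are hypothetical, a List of String arrays stands in for the ExampleSet, and the CSV is assumed to be plain comma-separated text without quoting.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Analogy only: no RapidMiner classes are used here.
class ReadWriteSketch {

    // "Read" step: turn an outside source (a CSV file) into an in-memory table.
    // Assumes plain comma-separated values without quoting.
    static List<String[]> readTable(Path csv) throws IOException {
        List<String[]> rows = new ArrayList<>();
        for (String line : Files.readAllLines(csv)) {
            if (!line.isEmpty()) {
                rows.add(line.split(",", -1));
            }
        }
        return rows;
    }

    // "Write" step: turn the in-memory table back into a file outside the program.
    static void writeTable(List<String[]> rows, Path target) throws IOException {
        List<String> lines = new ArrayList<>();
        for (String[] row : rows) {
            lines.add(String.join(",", row));
        }
        Files.write(target, lines);
    }

    public static void main(String[] args) throws IOException {
        Path in = Files.createTempFile("in", ".csv");
        Path out = Files.createTempFile("out", ".csv");
        Files.write(in, Arrays.asList("x,y", "1,2"));
        List<String[]> table = readTable(in); // outside world -> internal table
        writeTable(table, out);               // internal table -> outside world
        System.out.println(Files.readAllLines(out));
    }
}
```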

Edit: updated to correct operator and parameter.

Regards,
Marco
_________________________________________________________
Team Lead Software Engineering | RapidMiner GmbH
Contributor II nelze

Re: Integrating RapidMiner with Java Application

Hi, I apologize for the many questions, but I am quite new to this.

1. My process now looks like this. Is this correct?


2. I changed the line to RepositoryLocation loc = new RepositoryLocation("//WalkingRepo//");
but now it says that the IOObject is null :( WalkingRepo is where my .rmp file and the training data are, and I also put the test data there for good measure.

3. How do I configure Read CSV? The only setting I found for the CSV file (in the parameters tab on the right in RapidMiner) makes me select a file on disk. However, only the sample file is on my disk; in reality I need to get the file dynamically (the user supplies a path on their own machine), so I'm not sure how to configure Read CSV for that :( I only put the hard drive file location in the code as a placeholder.

4. I need to store the classification results in a database, but I was planning to take the ExampleSet result directly and send it to the web, since I don't need a local copy. If I do need to store the results, what would the sample code look like?

Thank you!
Contributor II nelze

Re: Integrating RapidMiner with Java Application

I edited my code to this:

     RapidMiner.setExecutionMode(RapidMiner.ExecutionMode.COMMAND_LINE);
     RapidMiner.init();

     Process process = new Process(new File("C:\\Users\\nelze\\.RapidMiner5\\repositories\\WalkingRepo\\NaiveClassify.rmp"));
     Operator op = process.getOperator("Read CSV");
     op.setParameter(CSVExampleSource.PARAMETER_FILENAME, "C:\\Users\\nelze\\unlabeled.csv");
     IOContainer ioResult = process.run();
     ExampleSet resultSet1 = (ExampleSet) ioResult.getElementAt(0);
     System.out.println(resultSet1);



And it finally printed the ExampleSet, but the output only listed the attributes. How do I obtain the classification results? My unlabeled .csv file has multiple rows, so I was expecting an array with one prediction per row, like the "prediction" column I see when I run the process in RapidMiner.
RM Staff

Re: Integrating RapidMiner with Java Application

Hi,

When you pass an ExampleSet to System.out.println, you only get whatever its toString() method returns. An ExampleSet is basically a table - you can use


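import java.util.Iterator;
import com.rapidminer.example.Attribute;
import com.rapidminer.example.Example;
import com.rapidminer.example.ExampleSet;

// ioResult is the IOContainer returned by process.run()
ExampleSet exampleSet = (ExampleSet) ioResult.getElementAt(0);
Iterator<Attribute> allAttributes = exampleSet.getAttributes().allAttributes();
// Outer loop: one attribute (column) at a time
while (allAttributes.hasNext()) {
    Attribute att = allAttributes.next();
    // Inner loop: the value of that attribute in every example (row)
    for (Example example : exampleSet) {
        if (att.isNominal()) {
            System.out.println(example.getNominalValue(att));
        } else if (att.isDateTime()) {
            System.out.println(example.getDateValue(att));
        } else {
            System.out.println(example.getValue(att));
        }
    }
}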


to iterate over the whole table structure.
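
The loop above walks the table one column at a time. For the "one prediction per row" array asked about earlier, it may be easier to go row by row and pick out a single column. Here is a small sketch of that idea with stand-in types: each row is a Map, and the class, the helper methods and the column name "prediction(label)" are assumptions for illustration, not RapidMiner API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in sketch: each row is a Map from column name to value.
class PredictionColumnSketch {

    // Walks the table row by row and collects the value of one column.
    static String[] column(List<Map<String, String>> rows, String columnName) {
        String[] values = new String[rows.size()];
        for (int i = 0; i < rows.size(); i++) {
            values[i] = rows.get(i).get(columnName);
        }
        return values;
    }

    // Helper that builds one demo row with a made-up attribute and a prediction.
    static Map<String, String> row(String speed, String prediction) {
        Map<String, String> r = new LinkedHashMap<>();
        r.put("speed", speed);
        r.put("prediction(label)", prediction);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = new ArrayList<>();
        rows.add(row("1.2", "walking"));
        rows.add(row("3.4", "running"));
        // prints [walking, running]
        System.out.println(Arrays.toString(column(rows, "prediction(label)")));
    }
}
```

With a real ExampleSet, the equivalent would be a single for (Example example : exampleSet) loop that reads only the prediction attribute. I believe exampleSet.getAttributes().getPredictedLabel() returns that attribute, but it is worth checking against the Javadoc of your RapidMiner version.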

Regards,
Marco
_________________________________________________________
Team Lead Software Engineering | RapidMiner GmbH