Transform data for RapidMiner - inference stage

chenUser4321 Member Posts: 11 Contributor II
edited November 2018 in Help
I am using RM 4.6 and want to feed data from my own application directly into RM as input.

Figure 7.4 (shown below) in rapidMiner-4.6-tutorial.pdf gives an example of classifier training and inference. For learning,
Model model = learner.learn(exampleSet);
takes an ExampleSet object as input, and section 7.6 explains how to transform data into this object class. For inference, however,
container = modelApp.apply(container);
takes an IOContainer object as input, and I cannot find anything in the tutorial that explains how to transform data into this object class.

So how do I transform my data into this object class for the inference stage?

By the way, besides the code below from Figure 7.4, which shows how to train and test a classifier, are there any other methods or sample code for doing this?

Any comment is greatly appreciated!
public static void main(String[] args) {
    try {
        RapidMiner.init();

        // learn
        Operator exampleSource = OperatorService.createOperator(ExampleSource.class);
        exampleSource.setParameter("attributes", "/path/to/your/training_data.xml");
        IOContainer container = exampleSource.apply(new IOContainer());
        ExampleSet exampleSet = container.get(ExampleSet.class);

        // here the string-based creation must be used since the J48 operator
        // does not have its own class (it is derived from the Weka library).
        Learner learner = (Learner) OperatorService.createOperator("J48");
        Model model = learner.learn(exampleSet);

        // loading the test set (plus adding the model to the result container)
        Operator testSource = OperatorService.createOperator(ExampleSource.class);
        testSource.setParameter("attributes", "/path/to/your/test_data.xml");
        container = testSource.apply(new IOContainer());
        container = container.append(model);

        // applying the model
        Operator modelApp = OperatorService.createOperator(ModelApplier.class);
        container = modelApp.apply(container);

        // print results
        ExampleSet resultSet = container.get(ExampleSet.class);
        Attribute predictedLabel = resultSet.getPredictedLabel();
        ExampleReader reader = resultSet.getExampleReader();
        while (reader.hasNext()) {
            System.out.println(reader.next().getValueAsString(predictedLabel));
        }
    } catch (IOException e) {
        System.err.println("Cannot initialize RapidMiner:" + e.getMessage());
    } catch (OperatorCreationException e) {
        System.err.println("Cannot create operator:" + e.getMessage());
    } catch (OperatorException e) {
        System.err.println("Cannot create model: " + e.getMessage());
    }
}

Answers

  • fischer Member Posts: 439 Maven
    Hi,

    you can simply create a new IOContainer() and add your model to it. It's that simple.

    Best,
    Simon
  • chenUser4321 Member Posts: 11 Contributor II
    Thanks for the reply!

    However, how do I transform data from my application into an IOContainer?

    In the sample code,

    Operator testSource = OperatorService.createOperator(ExampleSource.class);
    testSource.setParameter("attributes", "/path/to/your/test_data.xml");
    container = testSource.apply(new IOContainer());
    container = container.append(model);

    a testSource with the test data file path is used to create the IOContainer. Correspondingly, data coming from an application also has to be turned into an IOContainer object before the model is applied.

    So how do I create such an IOContainer? Should I create an ExampleSet and then transform it into an IOContainer? If so, what is the proper way to do this? I do not see an obvious way in the sample code.

    Any information is sincerely appreciated!
  • fischer Member Posts: 439 Maven
    Hi,

    sorry, I'm not sure I understand what you are asking. You don't transform IOObjects (like ExampleSets) into IOContainers, you create an IOContainer and append the IOObject to it. It is a container.

    If you are asking how you can convert your own Java data structure into an ExampleSet, then the answer is: Create Attributes, make a MemoryExampleTable from them, and start populating it with DataRows. Finally, use one of the createExampleSet-methods of the ExampleTable to make your ExampleSet.

    Best,
    Simon
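
To illustrate Simon's point that an IOContainer is only a holder for objects, here is a minimal plain-Java sketch. SimpleContainer is a hypothetical stand-in written for this example, not RapidMiner's IOContainer; it only mimics the append-then-get-by-class pattern used in the Figure 7.4 code, with a double[][] standing in for an ExampleSet and a String standing in for a Model.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the container idea; NOT RapidMiner's IOContainer.
final class SimpleContainer {
    private final List<Object> objects;

    SimpleContainer() {
        this.objects = new ArrayList<>();
    }

    private SimpleContainer(List<Object> objects) {
        this.objects = objects;
    }

    // Returns a new container holding the old objects plus the given one,
    // mirroring the "container = container.append(model)" pattern.
    SimpleContainer append(Object object) {
        List<Object> copy = new ArrayList<>(objects);
        copy.add(object);
        return new SimpleContainer(copy);
    }

    // Returns the first stored object of the requested type.
    <T> T get(Class<T> type) {
        for (Object o : objects) {
            if (type.isInstance(o)) {
                return type.cast(o);
            }
        }
        throw new IllegalStateException("No object of type " + type.getName());
    }

    public static void main(String[] args) {
        double[][] exampleRows = {{1.0, 2.0}, {3.0, 4.0}}; // stands in for an ExampleSet
        String model = "trained-model";                    // stands in for a Model

        // Nothing is "transformed": both objects are simply placed in the container.
        SimpleContainer container = new SimpleContainer()
                .append(exampleRows)
                .append(model);

        System.out.println(container.get(String.class));
        System.out.println(container.get(double[][].class).length);
    }
}
```

With the real classes, the corresponding step would presumably be to append both the ExampleSet built from application data and the trained Model to a new IOContainer() before calling modelApp.apply(container); this should be checked against the RM 4.6 Javadoc.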
  • chenUser4321 Member Posts: 11 Contributor II
    I now have a basic idea of how to do it. Thanks for the reply!
  • chenUser4321 Member Posts: 11 Contributor II
    Thanks, Simon, for the reply.

    I have a new question.

    For RM 4.6, is it possible for training features (not the actual label) to have non-numerical values, such as nominal values?

    If so, when transforming data from an application, do we just need to map each non-numerical value to a numerical one, as is done for the classification label in Figure 7.4 of rapidMiner-4.6-tutorial.pdf?

    Any comment is appreciated!
  • chenUser4321 Member Posts: 11 Contributor II
    I realize that section 5.4 of rapidMiner-4.6-tutorial.pdf lists the learner capabilities. These include the supported attribute types, such as polynominal, binominal, and numerical attributes. Different learners support different attribute types. This seems to answer my question to some degree.
  • fischer Member Posts: 439 Maven
    Hi,

    nominal attributes have an internal mapping from nominal values to indices (getNominalMapping()). When you build up a DataRow for use in an ExampleTable, you can use this mapping to generate the indices you need. When you set values in an Example, the setValue(Attribute, String) method uses this mapping automatically.

    Best,
    Simon
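
The index mapping Simon describes can be pictured with a small plain-Java sketch. SimpleNominalMapping and its method names are hypothetical stand-ins chosen for this example, not RapidMiner's own class; the sketch only shows the idea of giving each distinct nominal value an index and storing that index as a double in a data row.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a nominal mapping; NOT RapidMiner's own class.
final class SimpleNominalMapping {
    private final Map<String, Integer> indexByValue = new HashMap<>();
    private final List<String> valueByIndex = new ArrayList<>();

    // Returns the index of a nominal value, assigning the next free index
    // the first time the value is seen.
    int mapString(String value) {
        Integer index = indexByValue.get(value);
        if (index == null) {
            index = valueByIndex.size();
            indexByValue.put(value, index);
            valueByIndex.add(value);
        }
        return index;
    }

    // Returns the nominal value stored for an index.
    String mapIndex(int index) {
        return valueByIndex.get(index);
    }

    public static void main(String[] args) {
        SimpleNominalMapping outlook = new SimpleNominalMapping();
        // Made-up application records: one nominal feature, one numerical feature.
        String[][] raw = {{"sunny", "21.5"}, {"rainy", "17.0"}, {"sunny", "25.0"}};

        for (String[] record : raw) {
            // One row of doubles: the nominal index first, the numerical value second.
            double[] row = {outlook.mapString(record[0]), Double.parseDouble(record[1])};
            System.out.println(row[0] + " " + row[1] + " -> " + outlook.mapIndex((int) row[0]));
        }
    }
}
```

The design point is that the row itself holds only numbers; the mapping kept alongside the attribute is what turns an index back into its nominal value.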
  • chenUser4321 Member Posts: 11 Contributor II
    Thank you Simon! This is very useful information.

    Daozheng