How to insert and export strings

maxfax Member Posts: 17 Contributor II
edited November 2018 in Help
Hi, I would like to run a RapidMiner process that does basic text normalization (tokenizing etc.). Is it possible to run a process created in the RapidMiner GUI on a string from my Java program, and can I return the normalized string to my Java program after the process has finished? I know that it is possible to read and write files with RapidMiner and then read those in Java, but this is not a secure and safe way.

Thank you very much for your help. I hope you understand my problem.

Answers

  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,995 RM Engineering
    Hi,

    In the sticky thread here you can see how to execute a RapidMiner process via Java. Once you have created your process that way, you can set operator parameters. For example, if you use the Create Document operator, you can set its text parameter via

    operator.setParameter(SingleDocumentInputOperator.PARAMETER_TEXT, "your text");
    and then just execute the process as described in the link above. You can set up your process in such a way that it returns the document as output. You then get a com.rapidminer.operator.text.Document IOObject as a result, which you can process in any way you want.

    Regards,
    Marco
  • maxfax Member Posts: 17 Contributor II
    Hi, thank you very much for your reply. I managed to create and load the process, but not to set its input or read its output. Do I only have to copy and paste your code into my program? Am I missing a plugin, or do I have to import another package? I think Eclipse does not know SingleDocumentInputOperator.

    Or is it not possible to change an operator after reading an XML file to create the process? Do I have to create it by hand?

    My code so far, after the init, is the following:
    	


    Process rm5 = new Process();
    rm5 = new Process(new File(directory + "/src/Tokenizer.xml"));
    Operator.setParameter(SingleDocumentInputOperator.PARAMETER_TEXT, "your text");

    System.out.println("Init Fertitsch");

    rm5.run();
    I have imported the following packages. Am I missing some, for example the text plugin? And how do I get or install it?
    import com.rapidminer.Process;
    import com.rapidminer.RapidMiner;
    import com.rapidminer.RapidMiner.ExecutionMode;
    import com.rapidminer.operator.OperatorException;
    import com.rapidminer.repository.MalformedRepositoryLocationException;
    import com.rapidminer.tools.OperatorService;
    import com.rapidminer.tools.XMLException;
    import com.rapidminer.operator.Operator;



    Which parameters do I have to set in the RapidMiner GUI, and afterwards in Java, to get an output?

    Thank you very much for your help!

    My XML data right now is:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.008">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
       <parameter key="logverbosity" value="all"/>
       <parameter key="random_seed" value="2001"/>
       <parameter key="send_mail" value="never"/>
       <parameter key="notification_email" value=""/>
       <parameter key="process_duration_for_mail" value="30"/>
       <parameter key="encoding" value="SYSTEM"/>
       <parameter key="parallelize_main_process" value="false"/>
       <process expanded="true" height="415" width="835">
         <operator activated="true" class="text:create_document" compatibility="5.2.004" expanded="true" height="60" name="Create Document" width="90" x="23" y="44">
           <parameter key="text" value="Hi das ist der Prozess den ich ausführen möchte!"/>
           <parameter key="add label" value="false"/>
           <parameter key="label_type" value="nominal"/>
         </operator>
         <operator activated="true" class="text:tokenize" compatibility="5.2.004" expanded="true" height="60" name="Tokenize" width="90" x="180" y="30">
           <parameter key="mode" value="non letters"/>
           <parameter key="characters" value=".:"/>
           <parameter key="language" value="English"/>
           <parameter key="max_token_length" value="3"/>
         </operator>
         <operator activated="true" class="text:transform_cases" compatibility="5.2.004" expanded="true" height="60" name="Transform Cases" width="90" x="315" y="30">
           <parameter key="transform_to" value="lower case"/>
         </operator>
         <operator activated="true" class="text:filter_stopwords_german" compatibility="5.2.004" expanded="true" height="60" name="Filter Stopwords (German)" width="90" x="450" y="30">
           <parameter key="stop_word_list" value="Standard"/>
         </operator>
         <operator activated="true" class="text:stem_german" compatibility="5.2.004" expanded="true" height="60" name="Stem (German)" width="90" x="45" y="165"/>
         <operator activated="true" class="text:filter_by_length" compatibility="5.2.004" expanded="true" height="60" name="Filter Tokens (by Length)" width="90" x="313" y="165">
           <parameter key="min_chars" value="2"/>
           <parameter key="max_chars" value="25"/>
         </operator>
         <connect from_op="Create Document" from_port="output" to_op="Tokenize" to_port="document"/>
         <connect from_op="Tokenize" from_port="document" to_op="Transform Cases" to_port="document"/>
         <connect from_op="Transform Cases" from_port="document" to_op="Filter Stopwords (German)" to_port="document"/>
         <connect from_op="Filter Stopwords (German)" from_port="document" to_op="Stem (German)" to_port="document"/>
         <connect from_op="Stem (German)" from_port="document" to_op="Filter Tokens (by Length)" to_port="document"/>
         <connect from_op="Filter Tokens (by Length)" from_port="document" to_port="result 1"/>
         <portSpacing port="source_input 1" spacing="108"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
       </process>
     </operator>
    </process>
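Since the process above is plain XML, one conceivable alternative to setting the parameter through the RapidMiner API is to patch the text parameter of the "Create Document" operator in the XML before the file is loaded. This is not something the thread proposes. The sketch below uses only the JDK's DOM classes; ProcessXmlPatcher, setOperatorParameter and getOperatorParameter are hypothetical names, and the shortened XML in main is an assumption modeled on the file above. The patched XML would still have to be written to a file and loaded with new Process(new File(...)) as in the code above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of RapidMiner: edits a process XML string with the JDK DOM API.
class ProcessXmlPatcher {

    private static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse process XML", e);
        }
    }

    // Finds <parameter key="..."> among the direct children of <operator name="...">.
    private static Element findParameter(Document dom, String operatorName, String key) {
        NodeList operators = dom.getElementsByTagName("operator");
        for (int i = 0; i < operators.getLength(); i++) {
            Element operator = (Element) operators.item(i);
            if (!operatorName.equals(operator.getAttribute("name"))) {
                continue;
            }
            for (Node child = operator.getFirstChild(); child != null; child = child.getNextSibling()) {
                if (child instanceof Element
                        && "parameter".equals(child.getNodeName())
                        && key.equals(((Element) child).getAttribute("key"))) {
                    return (Element) child;
                }
            }
        }
        return null;
    }

    // Returns a copy of the XML in which the given operator parameter has the new value.
    static String setOperatorParameter(String xml, String operatorName, String key, String value) {
        Document dom = parse(xml);
        Element parameter = findParameter(dom, operatorName, key);
        if (parameter == null) {
            throw new IllegalArgumentException(
                    "No parameter '" + key + "' on operator '" + operatorName + "'");
        }
        parameter.setAttribute("value", value);
        try {
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(dom), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize process XML", e);
        }
    }

    // Reads a parameter value back, or returns null if it does not exist.
    static String getOperatorParameter(String xml, String operatorName, String key) {
        Element parameter = findParameter(parse(xml), operatorName, key);
        return parameter == null ? null : parameter.getAttribute("value");
    }

    public static void main(String[] args) {
        // Shortened stand-in for the process file shown above.
        String xml = "<process version=\"5.2.008\">"
                + "<operator class=\"process\" name=\"Process\">"
                + "<process>"
                + "<operator class=\"text:create_document\" name=\"Create Document\">"
                + "<parameter key=\"text\" value=\"default text\"/>"
                + "</operator>"
                + "</process>"
                + "</operator>"
                + "</process>";
        String patched = setOperatorParameter(xml, "Create Document", "text", "text supplied at runtime");
        System.out.println(getOperatorParameter(patched, "Create Document", "text"));
    }
}
```

Only direct child parameters of the named operator are touched, so parameters of nested operators with the same key stay unchanged.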
  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,995 RM Engineering
    Hi,

    SingleDocumentInputOperator is from the Text Mining Extension, so you will need to add that as a library to your project. To control process input/output, you need to connect input and output ports in the RapidMiner GUI to your operators. Input port(s) are on the left of the process design panel, output ports on the right side. Each connected input port will use the given argument you hand to the process, and each output port will produce an IOObject in the result array.

    Regards,
    Marco
  • maxfax Member Posts: 17 Contributor II
    This is my code. I imported the text processing extension successfully, but now I always get an error:

    What is my mistake? The argument is not set in my program. Instead, the program only prints the default text I entered in the Create Document operator, and not the text from the operator I created in my Java program.

    What did I do wrong?

    Process rm5 = new Process();
    //System.out.println("Init Finished");


    try {
        // create operator
        Operator inputOperator = OperatorService.createOperator(SingleDocumentInputOperator.class);

           
        // set parameters
        inputOperator.setParameter(SingleDocumentInputOperator.PARAMETER_TEXT, "your text  das ist der tolle text der uns jetzt gerade interessiertetn tut");
       
        // add operator to process
        // add other operators and set parameters
        // [...]
    } catch (Exception e) { e.printStackTrace(); }
       
    rm5 = new Process(new File(directory + "/src/Tokenizer.xml"));


    // just use myProcess.run() if you don't use the input ports for your process
    IOContainer ioResult = rm5.run();
       
    System.out.println(ioResult);
    Thank you very much for your help! It is really appreciated!
  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,995 RM Engineering
    Hi,

    See here, under the point "Question: I want to create my own process and execute it via java. What is the best way to do this?"
    You can basically copy and paste the code there and use it. There is no need to create operators via OperatorService etc., as that is a very error-prone way.
    Once you have your process, you can then, for example, modify the text parameter of your "Create Document" operator via

    process.getOperator("Create Document").setParameter(SingleDocumentInputOperator.PARAMETER_TEXT, "your Text");
    After that, you can execute your process and it will work on the text you provided at runtime.

    Regards,
    Marco
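The difference between the earlier attempt and the lookup suggested above can be illustrated without RapidMiner. In the earlier code, the operator created through OperatorService is never added to the process, and the process is loaded from Tokenizer.xml afterwards, so the run still sees the default text stored in the XML. The sketch below uses toy stand-in classes: ToyProcess and ToyOperator are hypothetical and are not the RapidMiner API, they only model a process that owns its operators by name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: these classes are NOT the RapidMiner API.
class DetachedOperatorDemo {

    // Hypothetical operator that just stores key/value parameters.
    static class ToyOperator {
        private final Map<String, String> parameters = new LinkedHashMap<>();

        void setParameter(String key, String value) {
            parameters.put(key, value);
        }

        String getParameter(String key) {
            return parameters.get(key);
        }
    }

    // Hypothetical process that owns one "Create Document" operator with a default text,
    // standing in for a process loaded from an XML file.
    static class ToyProcess {
        private final Map<String, ToyOperator> operators = new LinkedHashMap<>();

        ToyProcess(String defaultText) {
            ToyOperator createDocument = new ToyOperator();
            createDocument.setParameter("text", defaultText);
            operators.put("Create Document", createDocument);
        }

        ToyOperator getOperator(String name) {
            return operators.get(name);
        }

        // "Running" the toy process simply returns the text its own operator holds.
        String run() {
            return getOperator("Create Document").getParameter("text");
        }
    }

    // Mirrors the earlier attempt: a fresh operator is configured but never attached,
    // and the process is created afterwards.
    static String runWithDetachedOperator(String defaultText, String runtimeText) {
        ToyOperator detached = new ToyOperator();
        detached.setParameter("text", runtimeText);
        ToyProcess process = new ToyProcess(defaultText);
        return process.run();
    }

    // Mirrors the suggested fix: load first, then look up the existing operator by name.
    static String runWithLookup(String defaultText, String runtimeText) {
        ToyProcess process = new ToyProcess(defaultText);
        process.getOperator("Create Document").setParameter("text", runtimeText);
        return process.run();
    }

    public static void main(String[] args) {
        System.out.println(runWithDetachedOperator("default text", "runtime text"));
        System.out.println(runWithLookup("default text", "runtime text"));
    }
}
```

In the toy model, only the second variant changes what the process works on, because it modifies the operator instance that the process actually owns.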