Object to split data into training and test

imim Member Posts: 12 Contributor II
Hi!
Does RM v5 have some object to split data into training data and test data in memory (not using disk files)?
I would like to open a CSV file with a CSV object, then connect it to an object that splits the data, say, into 80% for training and 20% for testing, and then feed a decision tree with the training data and feed a model applier with the testing data, which will apply the decision tree to the testing data.
But I can't find the object that splits the data. Any help would be appreciated.
By the way, is it possible to execute a Rapid Miner v5 project with using the rapid miner GUI environment?

Thanks a lot

IM

Answers

  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    Of course it is possible to validate a learner's performance this way. RapidMiner is originally designed for exactly this purpose. Just insert the Split Validation operator. It will offer all needed parameters.

    I'm not quite sure if I understand your second question. I guess it's pretty obvious that you only need to press the play button or select the Run menu item to execute a RapidMiner process from within the GUI. So I think that's not what you meant. What exactly were you referring to?

    Greetings,
      Sebastian
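
For readers who want to see what a relative, shuffled 80/20 split does to the data, here is a minimal sketch in plain Python. It is not RapidMiner code and not how the Split Validation operator is implemented; the function name `split_relative` and its parameters are made up for illustration only.

```python
import random


def split_relative(records, split_ratio=0.8, seed=42):
    """Shuffle a copy of records and return (training, test) lists.

    Hypothetical helper: split_ratio is the share of records that goes
    into the training part, the remainder becomes the test part.
    """
    if not 0.0 < split_ratio < 1.0:
        raise ValueError("split_ratio must be between 0 and 1")
    shuffled = list(records)               # leave the caller's data untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(round(len(shuffled) * split_ratio))
    return shuffled[:cut], shuffled[cut:]


records = list(range(10))                  # stand-in for rows read from a CSV file
training, test = split_relative(records)
print(len(training), len(test))            # prints: 8 2
```

Everything stays in memory: both parts are just lists built from the same loaded records, so no intermediate files are needed.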
  • imim Member Posts: 12 Contributor II
    I can't connect the Split Validation object. I tried it already. It shows me errors. I will try again and send you the errors, if you allow.

    The second question is about executing the project without loading the RapidMiner environment. For instance, does RM support command line execution like Weka? Or is it possible to transform the RapidMiner project into an executable that can be run without RapidMiner behind it?

    Thanks a lot

    IM
  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    It is possible to execute RapidMiner from the command line. Use the RapidMiner script appropriate for your OS in the scripts directory of the installation directory. Type: RapidMiner <process file>

    Greetings,
      Sebastian
  • imim Member Posts: 12 Contributor II
    But I would like to execute the process while hiding the RapidMiner GUI. Is that possible?

    Returning to the splitting of the data to apply the decision tree:
    This works well:

    ReadCSV(out)----(exa)ChangeAttribute(exa)------(tra)DecisionTree(mod)------(mod)ModelApplier(lab)-------------------Res
    ReadCSV2(out)----(exa)ChangeAttribute(exa)----------------------------------------(unl)ModelApplier

    Remark: ReadCSV loads the training data (file1). ReadCSV2 loads the test data (file2). ModelApplier is a single object, shown on both lines. At Res, I can see the data records classified according to the decision tree model.

    What I want now is to load just one data file and split it (on the fly) into training records and test records. So I added the Split Validation object, configured with split = relative, split ratio = 0.8, sampling type = shuffled sampling, in a process like this:


    ReadCSV(out)----(tra)SplitValidation(mod)------(exa)ChangeAttribute(exa)-----(tra)DecisionTree(mod)------(mod)ModelApplier(lab)-----------Res
                                      SplitValidation(tra)--------------------------------------------------------------------------------(unl)ModelApplier

    Note: SplitValidation and ModelApplier each appear on both lines but are single objects. Now ReadCSV just reads the file1 data that I want to split next...

    When I run this process I see the error:
    Process failed.
    Reason: No data was delivered at port validation.model.

    I can't fix this error. Do I have wrong connections between the objects?

    Thank you

    IM
  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    If you execute a process from the command line, the GUI won't be created.

    I would recommend going through the complete samples that come with RapidMiner. They will help you understand the key concepts of how to build processes within RapidMiner. One of the key concepts is nested operators. For example, the SplitValidation operator is such an operator: it can have, and must have, child operators. You can see this from the small icon in the operator box. If you double-click on the operator, you will get "into" the operator, where you can add and configure the child operators.
    Please refer to the operator info for more information.

    Greetings,
      Sebastian

    PS: For describing processes to other users here in the forum, it's quite useful to press the button with the # and copy the complete XML describing the process into the code tags. This way, all others can copy it back to RapidMiner and take a quick look at it.
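
The nesting idea can be pictured outside RapidMiner as a function that takes two inner steps, one for training and one for testing. The following plain Python sketch is only an illustration under that assumption: `split_validation`, `train_majority` and `apply_model` are hypothetical names, and the "learner" simply predicts the most common label instead of building a decision tree.

```python
import random
from collections import Counter


def train_majority(training):
    """Stand-in for the inner training step: learn the most common label."""
    labels = [label for _, label in training]
    return Counter(labels).most_common(1)[0][0]


def apply_model(model, test):
    """Stand-in for the inner testing step: predict a label for each test row."""
    return [model for _ in test]


def split_validation(examples, train, apply, split_ratio=0.8, seed=0):
    """Outer step: split once, run the two inner steps, report accuracy."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * split_ratio))
    training, test = shuffled[:cut], shuffled[cut:]
    model = train(training)                # inner step 1 sees only training rows
    predictions = apply(model, test)       # inner step 2 sees only test rows
    correct = sum(p == label for p, (_, label) in zip(predictions, test))
    return model, correct / len(test)


# Ten (features, label) rows: seven labelled "yes", three labelled "no".
examples = [(i, "yes") for i in range(7)] + [(i, "no") for i in range(7, 10)]
model, accuracy = split_validation(examples, train_majority, apply_model)
print(model, accuracy)
```

In this picture the outer function cannot return a model unless it is handed a training step, which is consistent with the advice above that SplitValidation must have child operators.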
  • imim Member Posts: 12 Contributor II
    >Hi,
    >If you execute a process from the command line, the GUI won't be created.
    Great!!!

    >I would recommend going through the complete samples that come with RapidMiner. They will help you understand the key concepts of how to build processes within RapidMiner. One of the key concepts is nested operators. For example, the SplitValidation operator is such an operator: it can have, and must have, child operators. You can see this from the small icon in the operator box. If you double-click on the operator, you will get "into" the operator, where you can add and configure the child operators.
    >Please refer to the operator info for more information.

    OK, I did not know about double-clicking on the validation object. Now I can see its internals. Great!!

    Thank you again

    IM

    >Greetings,
    >  Sebastian

    >PS: For describing processes to other users here in the forum, it's quite useful to press the button with the # and copy the complete XML describing the process into the code tags. This way, all others can copy it back to RapidMiner and take a quick look at it.
    Deal!!