RapidMiner

Best Practices for Folder Structures in Repositories

by Moderator on 10-26-2016 08:53 AM - edited on 10-26-2016 09:56 AM by Community Manager

RapidMiner repositories let you store anything in folders. Here is a ‘best practice’ for organizing those folders so that they are easier to use.

 

There should be one folder per project. It can sit either at the top level of your Local Repository or in a projects folder at the top level of a Server repository. Our proposed folder structure is:

 

  • app
    • View 1
    • View 2
  • data
  • debug
  • models
  • processes
    • subprocesses
  • results
  • webservices

Note: Italic folders are not mandatory
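
The structure above can be sketched as a small script that creates the empty folder skeleton for a new project. This is only an illustration: it assumes the repository is a plain directory on disk that you can write to, and the names PROJECT_LAYOUT and create_project_skeleton are hypothetical helpers, not part of RapidMiner. On a Server repository you would create the same folders through the usual repository view instead.

```python
from pathlib import Path

# Folder layout proposed in this article.
# "View 1" and "View 2" are the article's placeholder names for app views.
PROJECT_LAYOUT = [
    "app/View 1",
    "app/View 2",
    "data",
    "debug",
    "models",
    "processes/subprocesses",
    "results",
    "webservices",
]


def create_project_skeleton(root: Path, project_name: str) -> Path:
    """Create the proposed folders under root/project_name and return the project path."""
    project = root / project_name
    for relative in PROJECT_LAYOUT:
        # parents=True also creates "app" and "processes";
        # exist_ok=True makes it safe to run the function again.
        (project / relative).mkdir(parents=True, exist_ok=True)
    return project
```

Calling create_project_skeleton(Path("/path/to/repository"), "my-project") would create one project folder containing the layout above; the path and the project name here are placeholders.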

 

app

The app folder contains all processes related to an app. For larger apps it makes sense to use a subfolder for each View of the app (View 1 and View 2 above). Only the global processes (like !Initialize) stay at the top level.

 

data

Contains all data used in the analysis.

 

debug

From time to time you need debug data, mostly to test things while designing a process. A common example is a database sample that is used in place of the real, full database.

 

processes

This is the main folder holding all processes of your analysis. It often makes sense to create a subprocesses folder for function-like processes that the main processes call via Execute Process.

 

results / models

The results folder contains all results of the modelling process. Usually these are performance results and models. If there are multiple models (either because you want to try many different types of models, or because you want to predict many labels), it makes sense to have a dedicated folder for each model.

 

webservices

Contains all processes that are used to offer a webservice. In rare cases a subprocesses folder might be of use.