Extract Metadata / Parameters of a Whole Process or Experiment?

Fred12 Member Posts: 344 Unicorn
edited November 2018 in Help

Hi,

I often have to run multiple experiments with different settings in my main process, such as the split ratio of training to test data, the macros I set, or simply which operators are present in the experiment.

Is there any way to extract all the metadata about a process itself (for example the split ratios and the other settings mentioned above), so that I can compare the settings of each experiment without having to write them down in an Excel sheet myself? I think that would be a really nice and useful extension for RapidMiner in general.

Best Answer

  • Fred12 Member Posts: 344 Unicorn
    Solution Accepted

    I don't know how to parse XML ;)

    Static parameters can be captured with the Generate Data by User Specification operator, whilst dynamic values such as the performance you get when running the process can only be captured with the Log operator. I guess I will have to convert both into example sets and join them somehow, or write them to a separate Excel file and append them afterwards.
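
The "join them somehow" step can be sketched outside RapidMiner as follows. This is only a minimal illustration, assuming the static settings and the logged runs have already been exported; the setting names (`experiment`, `split_ratio`) and the logged values are hypothetical placeholders, not taken from a real process.

```python
import csv
import io

# Hypothetical static settings, as they might be entered in
# Generate Data by User Specification (names are illustrative only).
static_settings = {"experiment": "exp_01", "split_ratio": 0.7}

# Hypothetical rows, as a Log operator might record them during a grid search.
logged_rows = [
    {"C": 1.0, "gamma": 0.1, "accuracy": 0.81},
    {"C": 10.0, "gamma": 0.1, "accuracy": 0.84},
]


def attach_settings(settings, rows):
    """Repeat the static settings on every logged row (a join against one row)."""
    return [{**settings, **row} for row in rows]


def to_csv(rows):
    """Serialise the combined rows to CSV text, one line per logged run."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


combined = attach_settings(static_settings, logged_rows)
print(to_csv(combined))
```

Because every logged row carries the experiment's static settings, the CSV files of several experiments can simply be appended to one another for comparison.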

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,507 RM Data Scientist

    Sure,

    Use Generate Data by User Specification to create an example set that holds your settings. That example set can then be written out in any format.

    ~Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • Fred12 Member Posts: 344 Unicorn

    OK, I will try that out. Is there any way to comment or annotate folders (e.g. those containing experiments) with your own additional information? I thought this feature was present in previous commercial versions of RapidMiner.

    Or is it somehow possible to view other files, such as plain text files, in your folders? You could then save all experiment settings there and take a quick glance at them.

    EDIT: It is really painful to have to enter every parameter and its setting manually for each experiment. Isn't there a way to do this automatically?

    I think it would be really useful if the developers worked on this, e.g. by automatically listing, for each experiment, the operators used together with their settings in a table-like format.

    This could be done through an interface method that each operator implements, or from the main menu, which collects the settings from the operators in the process. I think that would be easy to implement.

  • Fred12 Member Posts: 344 Unicorn

    Can you explain how I can extract my parameter settings from the Optimize Parameters (Grid) operator, e.g. all the C and gamma values that I chose as combinations? I couldn't find a way with param("...","..."); there does not seem to be a field that exposes those parameter choices.

  • yyhuang Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 364 RM Data Scientist

    Inside your grid optimizer, you can add a Log operator after the performance operator.

    Hope that makes sense.

    [Screenshot: Log.png]

  • Fred12 Member Posts: 344 Unicorn

    This whole concept is overly complicated: for some things I have to use Generate Data by User Specification, and for others I have to use the Log operator. However, the log only gives me the combinations of C and gamma that were actually tried, whereas I want the ranges of C and gamma that I configured in the grid optimizer, to get an overview of how my process is designed: the "metadata".

    On top of that, to save the results I then have to transform the log into an example set and so on -.-

    This is really annoying. I think it would be a lot easier if there were some way to integrate process-design metadata from more operators into the process context, or to enter operator data directly into the context for additional operators (Split Data, Sample, etc.), and then save that configuration as an overview of the whole process.

    I could imagine that other data mining tools are able to save process metadata configuration as well.
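
One possible way to recover the configured ranges (rather than the tried combinations) is to read them from the saved process XML. The sketch below is a rough illustration only: it assumes the grid operator stores its settings in a `<list key="parameters">` element whose entries look like `[min;max;steps;scale]` or a comma-separated list of values. The sample XML is hand-written and hypothetical, so the layout should be checked against an actual exported process before relying on it.

```python
import xml.etree.ElementTree as ET

# Hand-written, hypothetical sample; a real exported process may differ.
SAMPLE_PROCESS_XML = """
<process version="hypothetical">
  <operator class="process" name="Process">
    <process expanded="true">
      <operator class="optimize_parameters_grid" name="Optimize Parameters (Grid)">
        <list key="parameters">
          <parameter key="SVM.C" value="[0.001;1000;10;logarithmic]"/>
          <parameter key="SVM.gamma" value="[0.001;10;5;logarithmic]"/>
        </list>
      </operator>
    </process>
  </operator>
</process>
""".strip()


def parse_grid_value(value):
    """Parse an assumed '[min;max;steps;scale]' string, else a comma-separated list."""
    if value.startswith("[") and value.endswith("]"):
        low, high, steps, scale = value[1:-1].split(";")
        return {"min": float(low), "max": float(high),
                "steps": int(steps), "scale": scale}
    return {"values": value.split(",")}


def grid_ranges(xml_text, list_key="parameters"):
    """Map (operator name, parameter key) to the configured grid range."""
    root = ET.fromstring(xml_text)
    ranges = {}
    for operator in root.iter("operator"):
        for param_list in operator.findall("list"):
            if param_list.get("key") != list_key:
                continue
            for param in param_list.findall("parameter"):
                key = (operator.get("name"), param.get("key"))
                ranges[key] = parse_grid_value(param.get("value"))
    return ranges


ranges = grid_ranges(SAMPLE_PROCESS_XML)
for (operator_name, parameter), config in ranges.items():
    print(operator_name, parameter, config)
```

The returned dictionary can be written out next to the logged results, so the design of the grid and its outcomes end up in one place.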

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,507 RM Data Scientist

    Fred,

    if it is only the context, you should be fine parsing the process XML?

    ~martin

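
For readers who, like the original poster, have not parsed XML before, the suggestion above could look roughly like the following sketch. It assumes the saved process consists of nested `<operator>` elements with `<parameter key="..." value="..."/>` children; the sample XML, operator names, and class names are hypothetical and written by hand for illustration. Parameters nested inside `<list>` or `<enumeration>` elements are deliberately skipped here.

```python
import xml.etree.ElementTree as ET

# Hand-written, hypothetical sample; replace it with the XML of a saved process.
SAMPLE_PROCESS_XML = """
<process version="hypothetical">
  <operator class="process" name="Process">
    <process expanded="true">
      <operator class="set_macro" name="Set Macro">
        <parameter key="macro" value="split_ratio"/>
        <parameter key="value" value="0.7"/>
      </operator>
      <operator class="support_vector_machine" name="SVM">
        <parameter key="C" value="10.0"/>
      </operator>
    </process>
  </operator>
</process>
""".strip()


def operator_table(xml_text):
    """Return one (operator name, class, parameter key, value) row per parameter."""
    root = ET.fromstring(xml_text)
    rows = []
    for operator in root.iter("operator"):
        # Only direct <parameter> children; nested lists are not expanded.
        for param in operator.findall("parameter"):
            rows.append((
                operator.get("name"),
                operator.get("class"),
                param.get("key"),
                param.get("value"),
            ))
    return rows


rows = operator_table(SAMPLE_PROCESS_XML)
for row in rows:
    print("\t".join(row))
```

Running this over the XML of each experiment gives the table-like listing of operators and settings asked for earlier in the thread, without entering anything by hand.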