Example Set from log data -> saving in repository -> Fewer examples -> Bug?

Fred12 Member Posts: 344 Unicorn
edited November 2018 in Help

Hi,

I have log data from a parameter optimization. I get about 50 combinations in the X-Validation, and therefore 50 examples in my log data. If I transform the log into an Example Set and inspect the results at the end, everything is OK: there are 50 examples in the example set. However, if I then store the example set in the repository and retrieve it to inspect it, only part of the examples have been stored! Sometimes 17, 19 or 27 examples... I don't know why. I have enough disk space on my PC. That's very weird, and especially bad if I want to reconstruct some graphs later from my results.. :(

Has anybody encountered the same problem before?

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    I have not. Are you using any of the sorting options in the Log operator, like Top K sorting? That might cause it to skip some examples.

  • Fred12 Member Posts: 344 Unicorn

    Hm, that might be it. I ordered by X-Validation performance, descending, and stored it in the repository. Is that the reason?

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    No, you should be able to store it in a repository using a Log to Data operator. It might be the Top K sort. 

  • Fred12 Member Posts: 344 Unicorn

    Well, then it shouldn't be that, because I never chose any sorting type option; it was always 'none'. But it seems to happen only with the Log to Data operator (or with log data), because example sets from other data, like ordinary tabular Excel data, are saved in full.

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    I just did a test on my end with 36 rows and saved the log into a repository; all 36 show up. I will test with 56 next.

    Is there some sort of parallelized loop involved in your process?

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    So I just did this with 56 rows and it saved fine, but what I noticed is that it saves to the repository after each parameter optimization iteration. Maybe there's something broken on your end there? Is the Log to Data inside the Parameter Optimization operator?
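
To illustrate the point about saving after each iteration: below is a toy sketch in plain Python, not RapidMiner's API, and every name in it is hypothetical. It only assumes that a store step placed inside the optimization loop overwrites the same repository entry with whatever the log contains at that moment, so any copy read before the last iteration has fewer rows than the final log.

```python
# Toy model only: plain Python, not RapidMiner's API. All names are hypothetical.

def run_optimization(n_combinations, store_inside_loop):
    log = []          # stands in for the Log operator's table
    repository = {}   # stands in for one repository entry, overwritten on each store
    snapshots = []    # number of rows written by each store call

    for i in range(n_combinations):
        log.append({"iteration": i})             # one log row per parameter combination
        if store_inside_loop:
            repository["log_data"] = list(log)   # snapshot of the log so far
            snapshots.append(len(log))

    if not store_inside_loop:
        repository["log_data"] = list(log)       # single store after the loop
        snapshots.append(len(log))

    return repository, snapshots


repo_inside, sizes_inside = run_optimization(50, store_inside_loop=True)
repo_after, sizes_after = run_optimization(50, store_inside_loop=False)

print(sizes_inside[16])   # the 17th store call wrote only 17 rows
print(len(repo_inside["log_data"]), len(repo_after["log_data"]))
```

In this model both variants end up with all 50 rows once the loop has finished, but storing inside the loop produces 49 incomplete intermediate copies along the way. Whether that is what happened in the process discussed here is an assumption; the thread does not confirm it.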

  • Fred12 Member Posts: 344 Unicorn

    Actually yes, it's inside the Parameter Optimization operator.

    I noticed that at the end, when it outputs the results, it's only part of the data. But when I click on the example set result after it has finished, it shows the full data, and when I save it, it also saves the full data. I just wondered why it doesn't do this in the first place when the process has finished...

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Like I said, I don't know your setup there, but I've run this multiple times on different processes, and I see all the combinations in the Log file and all of them in the repository. I know this doesn't help when you're pulling your hair out, but I don't think it's a bug. :)
