"Bug in Saving Performance Vector"

Pinguicula Member Posts: 12 Contributor II
edited May 2019 in Help
Hi all,

I work in a Windows XP environment and am trying to evaluate the performance of a tree classifier using XValidation.
Everything runs fine as long as I don't try to save the performance vector. When I do, RapidMiner crashes and refuses to continue working. This also happens with some of the provided samples.
However, the files seem to get saved properly.

Is there a way to fix this, and if so, how?

Best

Norbert

Answers

  • steffen Member Posts: 347 Maven
    Hello Norbert

    From this description alone it is quite hard to analyse the error. Could you post some more details, please?
    Here are some hints:
    • Switch to the XML tab and copy its content to the forum so that we can see the process setup.
    • In the menu bar, go to Tools->Preferences and activate rapidminer.general.debugmode. Then run the process again, which should lead to a more detailed error message that you can send as a bug report or post here, too.
    Greetings,

    Steffen
  • TobiasMalbrecht Moderator, Employee, Member Posts: 294 RM Product Management
    Hi Norbert,

    in addition to Steffen's remarks, I would like to ask which RapidMiner version you are using. As far as I remember, quite a lot of data was saved in a performance vector in older releases, which resulted in long runtimes or other inconvenient behaviour. It is therefore possible that an update to the newest RapidMiner version solves your problem, if you have not already updated.

    Regards,
    Tobias
  • Pinguicula Member Posts: 12 Contributor II
    Hi Steffen and Tobias,

    Thank you for the quick response.
    Below is the layout of the process:
    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSource" class="ExampleSource" breakpoints="after">
            <parameter key="attributes" value="D:\ZID_daten\weka\GemBRD\EqSizeBin\NormVar\BRD_NormGemClusterOutL1.aml"/>
            <parameter key="sample_ratio" value="0.1"/>
        </operator>
        <operator name="MultipleLabelIterator" class="MultipleLabelIterator" expanded="yes">
            <operator name="XValidation" class="XValidation" expanded="yes">
                <parameter key="create_complete_model" value="true"/>
                <parameter key="sampling_type" value="shuffled sampling"/>
                <operator name="DecisionTree" class="DecisionTree">
                </operator>
                <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                    <operator name="ModelApplier" class="ModelApplier">
                        <list key="application_parameters">
                        </list>
                    </operator>
                    <operator name="Performance" class="Performance">
                    </operator>
                </operator>
            </operator>
        </operator>
        <operator name="AverageBuilder" class="AverageBuilder">
        </operator>
    </operator>
    > in addition to Steffen's remarks, I would like to ask which RapidMiner version you are using.
    I use version 4.1. However, the performance vector has a size of 6 MB, which looks rather huge to me compared to the amount of data available in the output (confusion matrix, kappa statistics and overall accuracy).
    > In the menu bar, go to Tools->Preferences and activate rapidminer.general.debugmode. Then run the process again, which should lead to a more detailed error message that you can send as a bug report or post here, too.
    I did as proposed. RM still refuses to output a single bit of data after the "Save" button of the performance vector has been pushed. Therefore I cannot give you a more detailed bug report.

    Best Norbert
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hello,

    I would like to suggest that you try out the latest version 4.2 which is available on our web site now:

    http://rapid-i.com


    The written performance vectors are much smaller now and as far as I remember the writing mode was also changed. So probably this problem is no longer there in the latest version (at least I cannot reproduce it).

    Cheers,
    Ingo
  • Pinguicula Member Posts: 12 Contributor II
    Hi,

    I downloaded 4.2 and tested it on several of our computers, but the problem is at least partially still present.
    If I save the performance vector manually, RapidMiner gets stuck in an endless loop and the resulting .per file has a size of 6 MB.
    If I use the respective IO container and integrate the saving step into the process, everything runs smoothly and the .per file has a size of 23 KB.

    Perhaps this information will help you to find the bug.

    Best,

    Norbert

  • TobiasMalbrecht Moderator, Employee, Member Posts: 294 RM Product Management
    Hi Norbert,

    if I understand you right, the behaviour is correct when you save the performance vector via the [tt]IOContainerWriter[/tt] but not when saving the performance vector manually by clicking on the button in the GUI (when the performance is shown)? What happens when you save the performance vector by the [tt]PerformanceWriter[/tt] operator?

    Normally, at least the last two ways should lead to the same files with the same sizes. If they do not, this is indeed a bug and we will try to fix it as soon as possible.

    Regards,
    Tobias
  • Pinguicula Member Posts: 12 Contributor II
    Hi Tobias,

    Sorry for using the wrong terminology. But essentially you got my point.
    Everything is fine when I use the PerformanceWriter but when I try to save manually via the GUI the system crashes.

    Norbert
  • TobiasMalbrecht Moderator, Employee, Member Posts: 294 RM Product Management
    Hi Norbert,

    no need to apologize, I just wanted to check whether I understood you right!  ;)
    We will have a look at the problem and post again when we have solved it. Until then, please use the workaround of saving the performance vector with the [tt]PerformanceWriter[/tt].
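
    As a rough sketch of that workaround, the writer could be appended as the last operator inside the Root process posted above. The parameter key performance_file and the file name performance.per are assumptions for illustration and should be checked against the operator's parameter list in your RapidMiner version:

    ```xml
    <operator name="Root" class="Process" expanded="yes">
        <!-- ExampleSource, MultipleLabelIterator (with XValidation) and
             AverageBuilder stay exactly as in the process posted above -->
        <operator name="AverageBuilder" class="AverageBuilder">
        </operator>
        <!-- Assumption: "performance_file" is the key for the output file;
             "performance.per" is a hypothetical file name -->
        <operator name="PerformanceWriter" class="PerformanceWriter">
            <parameter key="performance_file" value="performance.per"/>
        </operator>
    </operator>
    ```

    Placed this way, the writer receives the performance vector that is still available at the end of the process, so the "Save..." button in the results view is not needed.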

    Regards,
    Tobias
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    we have tried both possibilities: using the PerformanceWriter operator and pressing the "Save..." button in the results view. It worked perfectly well in both cases. Does the problem also occur if you deactivate "use_example_weights" in the performance operator?

    Cheers,
    Ingo
  • Pinguicula Member Posts: 12 Contributor II
    Hi Ingo,

    Even after switching off "use_example_weights", RapidMiner still behaves uncooperatively: it still crashes if I try to save the performance vector manually.

    Best

    Norbert


  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    hmm, I am simply not able to reproduce this error. It could have something to do with your data, although I doubt it. May I ask whether you could provide us with a small data sample that leads to the crash when the performance is saved via "Save..."? Of course only if it is not too sensitive. You could attach the data here or send it to "request@rapid-i.com". Please also provide details about your operating system, its version, etc.

    Thanks a lot.

    Cheers,
    Ingo