Nominal2Binominal is too slow :-(

ruser Member Posts: 40 Maven
edited October 2019 in Help
I read the DB contents using the DatabaseExampleSource operator (work_on_database is not set), and then used the Nominal2Binominal operator to convert the nominal attribute values. The problem is that Nominal2Binominal seems to run forever: it has been running for more than an hour and still has not finished.
How do I speed up the process? Will setting create_view to true help here?

Answers

  • fischer Member Posts: 439 Maven
    Could it be that you are operating on *all* nominal attributes rather than only on your target attribute? If not, please post your process here.

    Cheers,
    Simon
  • ruser Member Posts: 40 Maven
    The process definition is as follows:

    <operator name="Root" class="Process" expanded="yes">
        <parameter key="logverbosity" value="status"/>
        <operator name="DatabaseExampleSource" class="DatabaseExampleSource">
            <parameter key="database_url" value="jdbc:mysql://localhost:3306/test?zeroDateTimeBehavior=convertToNull"/>
            <parameter key="username" value="root"/>
            <parameter key="password" value="ZV9SLfsTZEw="/>
            <parameter key="query" value="SELECT * FROM `market_conditions`"/>
            <parameter key="id_attribute" value="id"/>
        </operator>
        <operator name="Nominal2Binominal" class="Nominal2Binominal">
        </operator>
        <operator name="SOMDimensionalityReduction" class="SOMDimensionalityReduction">

        .....definition continues

    </operator>

    Is there something wrong here?
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    the process setup seems to be OK, but if you have a really high number of nominal values per attribute, the binominalization will create a huge number of attributes. If you don't activate the view, they are all created, calculated and stored in memory during the operator's runtime. If you only use them once, for the SOM, it would be useful to create the view. The results would then not be stored in memory but calculated when needed (here once, so no more often than otherwise). That way you save the time spent creating the attributes in main memory.

    Greetings,
      Sebastian

  • ruserruser Member Posts: 40 Maven
    OK! You are right: in my case, some of the nominal attributes take around 50 different values.

    After setting 'create_view' for Nominal2Binominal, it no longer hangs there. (I also added Nominal2Numeric after it, again with 'create_view' set, because the subsequent SOMDimensionalityReduction requires all values to be numerical.)
    However, the subsequent SOMDimensionalityReduction now throws an 'OutOfMemory' error. I tried increasing the memory limit in RapidMinerGUI.bat and rapidminer.bat, but the same memory error is reported again.

    In fact, I have posted a separate thread about the 'OutOfMemory' error. How do I get around it?

    Thanks!
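
For reference, the create_view setting discussed in this thread might look roughly like this in the process XML. This is only a sketch: it assumes the parameter is written with the key create_view and the value true, as named in the posts above, and it leaves out the rest of the process. Nominal2Numeric is the operator ruser mentions adding before the SOM step.

```xml
<!-- Sketch only: assumes the parameter key "create_view" takes the value "true", as described in the thread -->
<operator name="Nominal2Binominal" class="Nominal2Binominal">
    <!-- calculate the binominal attributes when needed instead of storing them all in memory -->
    <parameter key="create_view" value="true"/>
</operator>
<operator name="Nominal2Numeric" class="Nominal2Numeric">
    <!-- same assumption: the numeric conversion is also done as a view -->
    <parameter key="create_view" value="true"/>
</operator>
```

These two operators would sit between DatabaseExampleSource and SOMDimensionalityReduction in the process posted earlier.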