RapidMiner

Nominal2Binominal is too slow :-(

Regular Contributor

I read the database contents using the DatabaseExampleSource operator (work_on_database is not set) and then used the Nominal2Binominal operator to convert the nominal attribute values. The problem is that Nominal2Binominal runs forever: it has been running for more than an hour and still has not finished.
How do I speed up the process? Would setting create_view to true help here?
4 REPLIES
Regular Contributor

Re: Nominal2Binominal is too slow :-(

Could it be that you are operating on *all* nominal attributes rather than only on your target attribute? If not, please post your process here.

Cheers,
Simon
Regular Contributor

Re: Nominal2Binominal is too slow :-(

The process definition is as follows:

<operator name="Root" class="Process" expanded="yes">
    <parameter key="logverbosity" value="status"/>
    <operator name="DatabaseExampleSource" class="DatabaseExampleSource">
        <parameter key="database_url" value="jdbc:mysql://localhost:3306/test?zeroDateTimeBehavior=convertToNull"/>
        <parameter key="username" value="root"/>
        <parameter key="password" value="ZV9SLfsTZEw="/>
        <parameter key="query" value="SELECT * FROM `market_conditions`"/>
        <parameter key="id_attribute" value="id"/>
    </operator>
    <operator name="Nominal2Binominal" class="Nominal2Binominal">
    </operator>
    <operator name="SOMDimensionalityReduction" class="SOMDimensionalityReduction">

    .....definition continues

</operator>

Is there something wrong here?
Elite

Re: Nominal2Binominal is too slow :-(

Hi,
the process setup seems to be OK, but if you have a really high number of nominal values per attribute, the binominalization will create a huge number of attributes. If you don't activate the view, they are all created, calculated and stored in memory during the operator's runtime. If you only use them once for the SOM, it would be useful to create the view. The results would then not be stored in memory but calculated when needed (here just once, so no more often than otherwise). That way you save the time needed to create the attributes in main memory.
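
As a rough sketch, enabling the view in the process posted above might look like the fragment below. It assumes create_view is written like the other parameters in that process and that "true" is how the flag's value is serialized; please check both against what the GUI saves for you.

```xml
<operator name="Nominal2Binominal" class="Nominal2Binominal">
    <!-- Assumption: create_view uses the same key/value form as the other parameters, with "true" as its value -->
    <parameter key="create_view" value="true"/>
</operator>
```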

Greetings,
  Sebastian

Regular Contributor

Re: Nominal2Binominal is too slow :-(

OK! You are right: in my case, some of the nominal attributes take around 50 different values.

After setting 'create_view' for Nominal2Binominal, the process no longer hangs there. (I also added Nominal2Numeric after it, again with 'create_view' set, because the subsequent SOMDimensionalityReduction requires all values to be numerical.)
However, the subsequent SOMDimensionalityReduction now throws an 'OutofMemory' error. I tried increasing the memory limit in RapidMinerGUI.bat and rapidminer.bat, but the same memory error is reported again.

In fact, I have posted a separate thread about the 'OutofMemory' error. How do I get around it?

Thanks!